Preface


Combinatorics belongs to those areas of mathematics having experienced a most 
impressive growth in recent years. This growth has been fuelled in large part by the 
increasing importance of computers, the needs of computer science and demands 
from applications where discrete models play more and more important roles. But 
also more classical branches of mathematics have come to recognize that combi- 
natorial structures are essential components of many mathematical theories. 

Despite the dynamic state of this development, we feel that the time is ripe 
for summarizing the current status of the field and for surveying those major re- 
sults that in our opinion will be of long-term importance. We approached leading 
experts in all areas of combinatorics to write chapters for this Handbook. The 
response was overwhelmingly enthusiastic and the result is what you see here. 

The intention of the Handbook is to provide the working mathematician or com- 
puter scientist with a good overview of basic methods and paradigms, as well as 
important results and current issues and trends across the broad spectrum of
combinatorics. However, our hope is that even specialists can benefit from reading this
Handbook, by learning a leading expert’s coherent and individual view of the topic. 

As the reader will notice by looking at the table of contents, we have structured 
the Handbook into five sections: Structures, Aspects, Methods, Applications, and 
Horizons. We feel that viewing the whole field from different perspectives and
taking different cross-sections will help to understand the underlying framework 
of the subject and to see the interrelationships more clearly. As a consequence of 
this approach, a number of the fundamental results occur in more than one chap- 
ter. We believe that this is an asset rather than a shortcoming, since it illustrates 
different viewpoints and interpretations of the results. 

We thank the authors not only for writing the chapters but also for many helpful 
suggestions on the organization of the book and the presentation of the material. 
Many colleagues have contributed to the Handbook by reading the initial versions 
of the chapters and by making proposals with respect to the inclusion of topics and
results as well as the structuring of the chapters. We are grateful for the significant 
help we received. 

Even though this Handbook is quite voluminous, it was inevitable that some 
areas of combinatorics had to be left out or were not covered in the depth they 
deserved. Nevertheless, we believe that the Handbook of Combinatorics presents 
a comprehensive and accessible view of the present state of the field and that it 
will prove to be of lasting value. 


Ronald Graham 
Martin Grötschel
László Lovász
1. Introduction


In all areas of mathematics, questions arise of the form “Given a description of a 
finite set, what is its cardinality?” Enumerative combinatorics deals with questions 
of this sort in which the sets to be counted have a fairly simple structure, and 
come in indexed families, where the index set is most often the set of nonnegative
integers. The two branches of enumerative combinatorics discussed in this book are 
asymptotic enumeration and algebraic enumeration. In asymptotic enumeration, 
the basic goal is an approximate but simple formula which describes the order of 
growth of the cardinalities as a function of their parameters. Algebraic enumeration 
deals with exact results, either explicit formulas for the numbers in question, or 
more often, generating functions or recurrences from which the numbers can be 
computed. 

The two fundamental tools in enumeration are bijections and generating func- 
tions, which we introduce in the next two sections. If there is a simple formula for 
the cardinality of a set, we would like to find a “reason” for the existence of such 
a formula. For example, if a set $S$ has cardinality $2^n$, we may hope to prove this
by finding a bijection between S and the set of subsets of an n-element set. The 
method of generating functions has a long history, but has often been regarded 
as an ad hoc device. One of the main themes of this article is to explain how 
generating functions arise naturally in enumeration problems. 

Further information and references on the topics discussed here may be found in 
the books of Comtet (1974), Goulden and Jackson (1983), Riordan (1958), Stanley 
(1986), and Stanton and White (1986). 


2. Bijections 


The method of bijections is really nothing more than the definition of cardi- 
nality: two sets have the same number of elements if there is a bijection from 
one to the other. Thus if we find a bijection between two sets, we have a proof 
that their cardinalities are equal; and conversely, if we know that two sets have 
the same cardinality, we may hope to find an explanation in the existence of 
an easily describable bijection between them. For example, it is very easy to 
construct bijections between the following three sets: the set of 0-1 sequences 
of length $n$, the set of subsets of $[n] = \{1,2,\ldots,n\}$, and the set of compositions
of $n+1$. (A composition of an integer is an expression of that integer as a
sum of positive integers. For example, the compositions of 3 are 1+1+1, 1+2,
2+1, and 3.) The composition $a_1 + a_2 + \cdots + a_k$ of $n+1$ corresponds to the
subset $S = \{a_1, a_1 + a_2, \ldots, a_1 + a_2 + \cdots + a_{k-1}\}$ of $\{1,2,\ldots,n\}$ and to the
0-1 sequence $u_1 u_2 \cdots u_n$ in which $u_i = 1$ if and only if $i \in S$. Moreover, in our
example, a composition with $k$ parts corresponds to a subset of cardinality $k-1$
and to a 0-1 sequence with $k-1$ ones, and thus there are $\binom{n}{k-1}$ of each of these.
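The three-way correspondence above can be checked mechanically for small n. The following is a minimal sketch; the function names are illustrative and do not come from the text.

```python
from itertools import product
from math import comb

def composition_to_subset(parts):
    """Map a composition a_1+...+a_k of n+1 to the set of its proper partial sums."""
    total, subset = 0, set()
    for a in parts[:-1]:
        total += a
        subset.add(total)
    return subset

def subset_to_bits(subset, n):
    """0-1 sequence u_1...u_n with u_i = 1 exactly when i is in the subset."""
    return tuple(1 if i in subset else 0 for i in range(1, n + 1))

def bits_to_composition(bits):
    """Inverse map: cut the total n+1 after every position i with u_i = 1."""
    parts, last = [], 0
    for i, u in enumerate(bits, start=1):
        if u:
            parts.append(i - last)
            last = i
    parts.append(len(bits) + 1 - last)
    return tuple(parts)

n = 4
compositions = {bits_to_composition(b) for b in product((0, 1), repeat=n)}
```

Every 0-1 sequence of length n yields a different composition of n+1, and composing the three maps returns the composition one started with, which is what the bijection asserts.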

It is easy to give a bijective proof that the set of compositions of n with parts 1 
and 2 is equinumerous with the set of compositions of n +2 with all parts at least 
2: given a composition $a_1 + a_2 + \cdots + a_k$ of $n+2$ with all $a_i \ge 2$, we replace each $a_i$
with

$$2 + \underbrace{1 + \cdots + 1}_{a_i - 2}$$

and then we remove the initial 2. If we let $f_n$ be the number of compositions of $n$
with parts 1 and 2, then $f_n$ is easily seen to satisfy the recurrence $f_n = f_{n-1} + f_{n-2}$
for $n \ge 2$, with the initial conditions $f_0 = 1$ and $f_1 = 1$. Thus $f_n$ is a Fibonacci
number. (The Fibonacci numbers are usually normalized by $F_0 = 0$ and $F_1 = 1$, so
$f_n = F_{n+1}$.)
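As a sanity check, the replacement rule just described can be applied to every composition of n+2 with parts at least 2 and compared with a direct listing of the compositions of n into parts 1 and 2. This is only a brute-force sketch with hypothetical helper names.

```python
def compositions(n, allowed):
    """Yield all compositions of n (as tuples) whose parts satisfy allowed(part)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        if allowed(first):
            for rest in compositions(n - first, allowed):
                yield (first,) + rest

def to_ones_and_twos(parts):
    """Replace each part a >= 2 by 2,1,...,1 (a-2 ones), then drop the initial 2."""
    expanded = []
    for a in parts:
        expanded.append(2)
        expanded.extend([1] * (a - 2))
    return tuple(expanded[1:])

def image_and_target(n):
    source = list(compositions(n + 2, lambda a: a >= 2))
    image = [to_ones_and_twos(c) for c in source]
    target = set(compositions(n, lambda a: a in (1, 2)))
    return image, target
```

For each n the image has no repeated entries and equals the target set, and the sizes 1, 1, 2, 3, 5, 8, ... follow the Fibonacci recurrence.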

As another example, if $\pi$ is a permutation of $[n]$, then we can express $\pi$ as a
product of cycles, where each cycle is of the form $(i \;\, \pi(i) \;\, \pi^2(i) \,\cdots\, \pi^s(i))$.
We can also express $\pi$ as the linear arrangement of $[n]$, $\pi(1)\,\pi(2) \cdots \pi(n)$. Thus
the set of cycles {(1 4), (2), (3 5)} corresponds to the linear arrangement 42513. 
So we have a bijection between sets of cycles and linear arrangements. 

This simple bijection turns out to be useful. We use it to give a proof, due 
to Joyal (1981, p. 16), of Cayley’s formula for labeled trees. First note that the 
bijection implies that for any finite set S the number of sets of cycles of elements 
of S (each element appearing exactly once in some cycle) is equal to the number 
of linear arrangements of elements of S. 

The number of functions from $[n]$ to $[n]$ is clearly $n^n$. To each such function $f$ we
may associate its functional digraph which has an arc from $i$ to $f(i)$ for each $i$ in $[n]$.
Now every weakly connected component of a functional digraph (i.e., connected
component of the underlying undirected graph) can be represented by a cycle of
rooted trees. So by the correspondence just given, $n^n$ is also the number of linear
arrangements of rooted trees on $[n]$. We claim now that $n^n = n^2 t_n$, where $t_n$ is the
number of trees on $[n]$.

It is clear that $n^2 t_n$ is the number of triples $(x, y, T)$, where $x, y \in [n]$ and $T$ is
a tree on $[n]$. Given such a triple, we obtain a linear arrangement of rooted trees
by removing all arcs on the unique path from $x$ to $y$ and taking the nodes on this
path to be the roots of the trees that remain. This correspondence is bijective, and
thus $t_n = n^{n-2}$.

Prüfer (1918) gave a different bijection for Cayley’s formula, which is easier
to describe but harder to justify. Given a labeled tree on $[n]$, let $i_1$ be the least
leaf (node of degree 1), and suppose that $i_1$ is adjacent to $j_1$. Now remove $i_1$
from the tree and let $i_2$ be the least leaf of the new tree, and suppose that $i_2$
is adjacent to $j_2$. Repeat this procedure until only two nodes are left. Then the
original tree is uniquely determined by $j_1 \cdots j_{n-2}$ and conversely any sequence
$j_1 \cdots j_{n-2}$ of elements of $[n]$ is obtained from some tree. Thus the number of trees
is $n^{n-2}$.
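The Prüfer procedure is easy to experiment with. The sketch below encodes a tree exactly as described (repeatedly remove the least leaf and record its neighbour); the decoder is the standard degree-counting reconstruction, which is an assumption of this example rather than something spelled out in the text.

```python
from itertools import product

def prufer_encode(edges, n):
    """Remove the least leaf n-2 times, recording the neighbour of each removed leaf."""
    adj = {i: set() for i in range(1, n + 1)}
    for edge in edges:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)
    seq = []
    for _ in range(n - 2):
        leaf = min(i for i in adj if len(adj[i]) == 1)
        (j,) = adj[leaf]
        seq.append(j)
        adj[j].remove(leaf)
        del adj[leaf]
    return tuple(seq)

def prufer_decode(seq, n):
    """Rebuild the edge set of the tree on {1..n} from its sequence j_1 ... j_{n-2}."""
    degree = [1] * (n + 1)          # index 0 is unused
    for j in seq:
        degree[j] += 1
    edges = set()
    for j in seq:
        leaf = min(i for i in range(1, n + 1) if degree[i] == 1)
        edges.add(frozenset((leaf, j)))
        degree[leaf] -= 1           # the leaf is now removed
        degree[j] -= 1
    u, v = [i for i in range(1, n + 1) if degree[i] == 1]
    edges.add(frozenset((u, v)))
    return frozenset(edges)

def trees_from_sequences(n):
    return {prufer_decode(s, n) for s in product(range(1, n + 1), repeat=n - 2)}
```

For n = 5 the 125 sequences decode to 125 distinct trees, and encoding each decoded tree returns the sequence it came from.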

Both proofs of Cayley’s formula sketched above can be refined to count trees ac- 
cording to the number of nodes of each degree, and thereby to prove the Lagrange 
inversion formula, which we shall discuss in section 6. (See Labelle 1981.) 

There is another useful bijection between sets of cycles and linear arrangements 
which we shall call Foata’s transformation (see, e.g., Foata 1983) that has interesting
properties. Given a permutation in cycle notation, we write each cycle with its
least element first, and then we arrange the cycles in decreasing order by their
least elements. Thus in our example above, we would have $\pi = (3\,5)(2)(1\,4)$. Then
we remove the parentheses to obtain a new permutation whose 1-line notation is
$\hat\pi = 35214$.

If $\sigma$ is a permutation of $[n]$, then a left-right minimum (or lower record) of $\sigma$
is an index $i$ such that $\sigma(i) < \sigma(j)$ for all $j < i$. It is clear that $i$ is a left-right
minimum of $\hat\pi$ if and only if $\hat\pi(i)$ is the least element in its cycle in $\pi$. Thus we
have the following. 


Theorem 2.1. The number of permutations of $[n]$ with $k$ left-right minima is equal
to the number of permutations of [n] with k cycles. 
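Foata’s transformation and Theorem 2.1 can be illustrated by brute force. In this sketch a permutation is a tuple in one-line notation, and the helper names are hypothetical.

```python
from itertools import permutations

def cycles(perm):
    """Cycles of perm (perm[i-1] is the image of i), each starting with its least element."""
    seen, result = set(), []
    for start in range(1, len(perm) + 1):
        if start not in seen:
            cycle, i = [], start
            while i not in seen:
                seen.add(i)
                cycle.append(i)
                i = perm[i - 1]
            result.append(cycle)
    return result

def foata(perm):
    """Order the cycles by decreasing least element and erase the parentheses."""
    ordered = sorted(cycles(perm), key=lambda c: c[0], reverse=True)
    return tuple(x for c in ordered for x in c)

def left_right_minima(word):
    """Number of positions holding a value smaller than everything before it."""
    count, smallest = 0, float("inf")
    for x in word:
        if x < smallest:
            count, smallest = count + 1, x
    return count
```

Applied to 42513 this returns 35214, matching the example in the text, and over all permutations of [5] the map is a bijection that turns cycles into left-right minima.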


This number is (up to sign) a Stirling number of the first kind. We shall see them 
again in sections 3 and 9. 

In section 10 we shall need a variant of Foata’s transformation in which left-right 
maxima are used instead of left-right minima. 


3. Generating functions 


The basic idea of generating functions is the following: instead of finding the
cardinality of a set $S$, we assign to each $\alpha$ in $S$ a weight $w(\alpha)$. Then the generating
function $G(S)$ for $S$ (with respect to the weighting function $w$) is $\sum_{\alpha \in S} w(\alpha)$. Thus
the concept of generating function for a set is a generalization of the concept of 
cardinality. Note that S may be infinite as long as the sum converges (often as a 
formal power series). 

The weights may be elements of any abelian group, but they are usually monomials
in a ring of polynomials or power series. In a typical application each element
$\alpha$ of $S$ will have a “length” $l(\alpha)$ and we take the weight of $\alpha$ to be $x^{l(\alpha)}$, where $x$ is
an indeterminate. Then knowing the generating function $\sum_{\alpha \in S} x^{l(\alpha)}$ is equivalent
to knowing the number of elements of S of each length. 

Analogous to the product rule for cardinalities, $|A|\,|B| = |A \times B|$, is the product
rule for generating functions, $G(A)G(B) = G(A \times B)$, where we take the “product
weight” on $A \times B$, defined by $w((\alpha, \beta)) = w(\alpha)w(\beta)$.

As an example, suppose we want to count sequences of zeros and ones of length 
n according to the number of zeros they contain. We can identify the set of 0-1 
sequences of length $n$ with the Cartesian product $\{0,1\}^n$. If we weight $\{0,1\}$ by
$w(0) = x$ and $w(1) = y$, then the product weight on $\{0,1\}^n$ assigns to a sequence
with $j$ zeros and $k$ ($= n - j$) ones the weight $x^j y^k$. Thus

$$(x+y)^n = \sum_{j+k=n} \binom{n}{j} x^j y^k$$


is the generating function for sequences of zeros and ones of length n by the 
number of zeros and the number of ones. If we want to count sequences of zeros
and ones of all lengths with this weighting, we sum on $n$ to obtain the generating
function

$$\sum_{j,k=0}^{\infty} \binom{j+k}{j} x^j y^k = \frac{1}{1-x-y}.$$


Now suppose we want to count compositions with parts 1 and 2. Rather than 
picking an integer $n$ and considering the compositions of $n$, we pick an integer
$k$ and consider the set $C_k$ of all compositions of any integer with exactly $k$ parts
(each part being 1 or 2). We may identify $C_k$ with $\{1,2\}^k$. If we assign 1 the weight
$x$ and 2 the weight $x^2$, where $x$ is an indeterminate, then the product weight of a
composition of $n$ in $C_k$ is $x^n$. Thus

$$G(C_k) = G(\{1,2\}^k) = G(\{1,2\})^k = (x + x^2)^k = \sum_{n=k}^{2k} \binom{k}{n-k} x^n.$$


Thus there are $\binom{k}{n-k}$ compositions of $n$ with $k$ parts, each part 1 or 2. As before, if
we do not care about the number of parts, we sum on $k$ to obtain
$\sum_{k=0}^{\infty} (x + x^2)^k = (1 - x - x^2)^{-1}$ as the generating function for all compositions
into parts 1 and 2. By the same kind of reasoning, if $A$ is any set of positive integers,
then the generating function for compositions with $k$ parts, all in $A$, is
$\left(\sum_{i \in A} x^i\right)^k$ and the generating function for compositions with any number of
parts, all in $A$, is $\left(1 - \sum_{i \in A} x^i\right)^{-1}$.
In particular, if $A$ is the set of positive integers then $\sum_{i \in A} x^i = x/(1 - x)$ so the
generating function for compositions with $k$ parts is


$$\left( \frac{x}{1-x} \right)^k = \sum_{n=k}^{\infty} \binom{n-1}{k-1} x^n$$


for k > 0 and the generating function for all compositions is 


$$\left( 1 - \frac{x}{1-x} \right)^{-1} = \frac{1-x}{1-2x} = 1 + \frac{x}{1-2x} = 1 + \sum_{n=1}^{\infty} 2^{n-1} x^n.$$


The generating function for compositions with parts greater than 1 is 


$$\left( 1 - \frac{x^2}{1-x} \right)^{-1} = \frac{1-x}{1-x-x^2} = 1 + \frac{x^2}{1-x-x^2} = 1 + \sum_{n=2}^{\infty} F_{n-1} x^n.$$


Similarly, the generating function for compositions with odd parts is 


$$\left( 1 - \frac{x}{1-x^2} \right)^{-1} = \frac{1-x^2}{1-x-x^2} = 1 + \frac{x}{1-x-x^2} = 1 + \sum_{n=1}^{\infty} F_n x^n.$$


Thus we have proved using generating functions two of the results we proved using
bijections in the preceding section. Notice that the generating functions take into
account initial cases which did not arise in the bijective approach. 
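The coefficients of $(1 - \sum_{i \in A} x^i)^{-1}$ can be computed without any series library: multiplying out the defining identity gives $c_0 = 1$ and $c_n = \sum_{i \in A} c_{n-i}$. The sketch below uses that recurrence (the function names are illustrative) to recover the closed forms quoted above for several choices of $A$.

```python
def compositions_count(allowed, max_n):
    """Coefficients c_0..c_max_n of 1/(1 - sum_{i in A} x^i), where i is in A iff allowed(i)."""
    counts = [1] + [0] * max_n
    for n in range(1, max_n + 1):
        counts[n] = sum(counts[n - i] for i in range(1, n + 1) if allowed(i))
    return counts

def fibonacci(max_n):
    """F_0..F_max_n with the usual normalization F_0 = 0, F_1 = 1."""
    fib = [0, 1]
    while len(fib) <= max_n:
        fib.append(fib[-1] + fib[-2])
    return fib

N = 12
F = fibonacci(N)
parts_1_2 = compositions_count(lambda i: i in (1, 2), N)
parts_ge_2 = compositions_count(lambda i: i >= 2, N)
odd_parts = compositions_count(lambda i: i % 2 == 1, N)
all_parts = compositions_count(lambda i: True, N)
```

The constant term 1 in each list is the empty composition, which is the kind of initial case the generating functions keep track of automatically.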

For our next two examples, we consider another bijection for permutations.
Suppose that $\pi$ is a permutation of $[n]$. We associate to $\pi$ a sequence $a_1 a_2 \cdots a_n$ of
integers satisfying $0 \le a_j \le j-1$ for each $j$ as follows: $a_j$ is the number of indices
$i < j$ for which $\pi(i) > \pi(j)$. The sequence $a_1 a_2 \cdots a_n$ is called the inversion table
of $\pi$: an inversion of $\pi$ is a pair $(i,j)$ with $i < j$ and $\pi(i) > \pi(j)$, and thus $a_j$ is
the number of inversions of $\pi$ of the form $(i,j)$. It is not difficult to show that the
correspondence between permutations and their inversion tables gives a bijection
between the set $\mathfrak{S}_n$ of permutations of $[n]$ and the set $T_n$ of sequences $a_1 a_2 \cdots a_n$
of integers satisfying $0 \le a_j \le j-1$ for each $j$. Note that $T_n$ is the Cartesian product
$\{0\} \times \{0,1\} \times \cdots \times \{0,1,\ldots,n-1\}$. We shall use the inversion table to count
permutations by inversions and also by cycles. 

Let $I(\pi)$ be the number of inversions of $\pi$. We would like to find the generating
function $G(\mathfrak{S}_n)$ for permutations of $[n]$ where each permutation $\pi$ is assigned the
weight $q^{I(\pi)}$. To do this we note that $I(\pi)$ is the sum of the entries of the inversion
table of $\pi$, and thus if we assign the weight $w(a) = q^{a_1 + \cdots + a_n}$ to $a = a_1 a_2 \cdots a_n \in T_n$,
then we have

$$G(\mathfrak{S}_n) = G(T_n) = 1 \cdot (1+q) \cdots (1 + q + \cdots + q^{n-1}).$$


Next we count permutations by left-right minima. It is clear that $j$ is a left-right
minimum of $\pi$ if and only if $a_j = j-1$. Thus if we assign the weight $t^k$ to
a permutation in $\mathfrak{S}_n$ with $k$ left-right minima, and to a sequence in $T_n$ with $k$
occurrences of $a_j = j-1$, then we have

$$G(\mathfrak{S}_n) = G(T_n) = t(t+1)(t+2) \cdots (t+n-1) = \sum_{k=0}^{n} c(n,k)\, t^k,$$

where $c(n,k)$ is (by definition) the unsigned Stirling number of the first kind. By
Theorem 2.1, it follows that $c(n,k)$ is also the number of permutations in $\mathfrak{S}_n$ with
$k$ cycles.
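Both product formulas can be confirmed numerically from the inversion table. The following sketch (with hypothetical names, and with polynomials stored as coefficient lists from lowest degree up) tabulates all permutations of [5].

```python
from itertools import permutations
from collections import Counter

def inversion_table(perm):
    """Entry j (0-based) counts the earlier positions holding a larger value."""
    return tuple(sum(1 for i in range(j) if perm[i] > perm[j]) for j in range(len(perm)))

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

n = 5
tables = [inversion_table(p) for p in permutations(range(1, n + 1))]
by_inversions = Counter(sum(t) for t in tables)
# With 0-based positions, the condition a_j = j - 1 reads table[j] == j.
by_lr_minima = Counter(sum(1 for j, a in enumerate(t) if a == j) for t in tables)

q_poly = [1]
for m in range(1, n + 1):
    q_poly = poly_mul(q_poly, [1] * m)   # factor 1 + q + ... + q^(m-1)
t_poly = [1]
for m in range(n):
    t_poly = poly_mul(t_poly, [m, 1])    # factor t + m
```

The two counters agree coefficient by coefficient with `q_poly` and `t_poly`, and the entries of `t_poly` are the unsigned Stirling numbers $c(5,k)$.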


4. Free monoids


Free monoids provide a useful way of organizing many simple applications of 
generating functions. Let A be a set of “letters”. The free monoid A* is the set of 
all finite sequences (including the empty sequence) of elements of A, usually called 
words, with the operation of concatenation. We can construct an algebra from A* 
by taking formal sums of elements of A* with coefficients in some ring. We write 
1 for the empty sequence, which is the unit of this algebra. These formal sums 
are then formal power series in noncommuting variables. The generating function 
G(S) for a subset S of A* is the sum of its elements. 


1028 1M. Gessel and R.P. Stanley 


If $S$ and $T$ are subsets of $A^*$, we write $ST$ for the set $\{st \mid s \in S \text{ and } t \in T\}$.
We say that the product $ST$ is unique if every element of $ST$ has only one such
factorization. The fundamental fact about generating functions is that if the product
$ST$ is unique, then $G(ST) = G(S)G(T)$.

More generally, we may define a free monoid to be a set together with an 
associative binary operation which is isomorphic to a free monoid as defined above. 
Let $A^+ = A^* \setminus \{1\}$ and suppose that $S$ is a subset of $A^+$ such that for each $k$, every
element of $S^k$ has a unique factorization $s_1 s_2 \cdots s_k$ with each $s_i$ in $S$. Such a set $S$ is
sometimes called a uniquely decodable code, or simply a code. Then $S^* = \bigcup_{k=0}^{\infty} S^k$
is a free monoid. We call the elements of $S$ the primes of the free monoid $S^*$. In
this case

$$G(S^*) = \sum_{i=0}^{\infty} G(S)^i = (1 - G(S))^{-1}.$$

In particular, $G(A^*) = (1 - G(A))^{-1}$.

Among the simplest free monoid problems are those dealing with compositions 
of integers, as we saw in the previous section. A composition of an integer is simply 
an element of the free monoid P*, where P is the set of positive integers. 

As a more interesting example, let $A = \{X, Y\}$, let $S$ be the subset of $A^*$
consisting of words with equal numbers of $X$’s and $Y$’s, and let $T$ be the subset
of $A^*$ of words with no nonempty initial segment in $S$. Then $A^* = ST$
uniquely, so $G(A^*) = (1 - X - Y)^{-1} = G(S)G(T)$. Moreover, $S$ is a free monoid
$U^*$, where $U$ is the set of words in $S$ which cannot be factored nontrivially
in $S$. The sets $S$, $T$, and $U$ have simple interpretations in terms of walks in
the plane, starting at the origin. If $X$ and $Y$ are represented by unit steps
in the $x$ and $y$ directions, then $S$ corresponds to walks which end on the
main diagonal, $T$ corresponds to walks that never return to the main diagonal,
and $U$ corresponds to walks that return to the main diagonal only at the
end.

It is often useful to replace the noncommuting variables by commuting variables. 
If we replace the letter X by the variable x, we are assigning X the weight x. (More 
formally, we are applying a homomorphism in which the image of X is x.) 

In our example, if we weight X and Y by commuting variables x and y, then 
$G(A^*)$ becomes $1/(1 - x - y)$ and $G(S)$ becomes $\sum_{n=0}^{\infty} \binom{2n}{n} x^n y^n = (1 - 4xy)^{-1/2}$
since there are $\binom{2n}{n}$ ways of arranging $n$ $X$’s and $n$ $Y$’s. Thus $G(T)$ becomes
$\sqrt{1 - 4xy}/(1 - x - y)$. It can be shown that this is equal to


$$\sum_{m,n} \frac{|m-n|}{m+n} \binom{m+n}{m} x^m y^n, \qquad (4.1)$$


where the constant term is 1. The coefficients in (4.1) are called ballot numbers 
and we shall see them again in section 6. 
If we replace $x$ and $y$ by the same variable $z$, the generating function for $T$
becomes

$$\frac{\sqrt{1 - 4z^2}}{1 - 2z} = \frac{1 + 2z}{\sqrt{1 - 4z^2}}.$$

Thus the number of words in $T$ of length $2n$ is $\binom{2n}{n}$ and the number of words in $T$
of length $2n+1$ is $2\binom{2n}{n}$.
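These counts are small enough to confirm by listing words. The sketch below (helper names are illustrative) tests membership in $T$ directly from its definition and also counts words of $T$ with a prescribed number of $X$’s and $Y$’s, which by the classical ballot theorem should be $\frac{|m-n|}{m+n}\binom{m+n}{m}$.

```python
from itertools import product
from math import comb

def in_T(word):
    """True if no nonempty initial segment has equally many X's and Y's."""
    balance = 0
    for letter in word:
        balance += 1 if letter == "X" else -1
        if balance == 0:
            return False
    return True

def count_T(length):
    """Number of words of the given length in T."""
    return sum(in_T(w) for w in product("XY", repeat=length))

def count_T_with(m, n):
    """Number of words in T with exactly m X's and n Y's."""
    return sum(in_T(w) for w in product("XY", repeat=m + n) if w.count("X") == m)
```

The empty word belongs to $T$, which accounts for the constant term 1.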

Although we usually work with formal power series, it is sometimes useful for 
variables to take on real values. We derive an inequality called McMillan’s in- 
equality which is useful in information theory. (See McMillan 1956.) Let A be an 
alphabet (set of letters) of size r, and let S be a code in A*, so that S* is a free 
monoid. Let us weight each letter of $A$ by $t$, and let $G(S) = p(t)$.

We know that $G(S^*) = (1 - p(t))^{-1}$ as formal power series in $t$. Since there
are $r^k$ words in $A^*$ of length $k$, the coefficient of $t^k$ in $(1 - p(t))^{-1}$ is at most
$r^k$. If $0 < \alpha < 1/r$ then the series $\sum_{k=0}^{\infty} r^k \alpha^k$ converges absolutely to $(1 - r\alpha)^{-1}$,
and thus $(1 - p(\alpha))^{-1} \le (1 - r\alpha)^{-1}$, which implies $p(\alpha) \le r\alpha$. Taking the limit as
$\alpha$ approaches $1/r$ from below, we obtain $p(1/r) \le 1$.

Thus we have proved the following. 


Theorem 4.2. Let S be a uniquely decodable code in an alphabet of size r, and for 
each $k$ let $p_k$ be the number of words in $S$ of length $k$. Then $\sum_k p_k r^{-k} \le 1$.
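A small experiment makes the inequality concrete. The sketch below computes the sum $\sum_k p_k r^{-k}$ exactly and pairs it with a bounded brute-force test for repeated factorizations; that test is only a necessary check up to a chosen number of factors, not a decision procedure for unique decodability, and both function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mcmillan_sum(code, r):
    """Exact value of the sum over code words w of r^(-length of w)."""
    return sum(Fraction(1, r ** len(w)) for w in code)

def no_double_factorization(code, max_factors):
    """False if two different sequences of at most max_factors code words spell the same word."""
    seen = {}
    for k in range(max_factors + 1):
        for factors in product(code, repeat=k):
            word = "".join(factors)
            if seen.setdefault(word, factors) != factors:
                return False
    return True
```

For the code {ab, b} over a two-letter alphabet the sum is 3/4. The set {a, b, ab} has sum 5/4, so by Theorem 4.2 it cannot be a code, and indeed ab factors in two ways.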


In some applications of free monoids, the “letters” have some internal structure.
For example, consider the set of permutations $\pi$ of $[n] = \{1,2,\ldots,n\}$ satisfying
$|\pi(i) - i| \le 1$. We can represent a permutation of $[n]$ as a digraph with node
set $[n]$ with an arc from $i$ to $\pi(i)$ for each $i$. If we draw the digraph with the
nodes in increasing order, we get a picture like this one, which corresponds to the
permutation 2143576:


[Figure: digraph of the permutation 2143576]


It is clear that these permutations form a free monoid with the two “letters”, or
primes

[Figure: the two primes]

Thus the generating function (by length) for these permutations is $(1 - x - x^2)^{-1}$,
and there is an obvious bijection between these permutations and compositions
with parts 1 and 2.




Sometimes it is easier to count all the elements of a free monoid than just the 
primes. If we represent arbitrary permutations as in the previous example, then 
we have a free monoid in which the primes, called indecomposable permutations, 
are those permutations $\pi$ of $[n]$ (for some $n$) such that for $1 \le i < n$, $\pi$ restricted
to $[i]$ is not a permutation of $[i]$. For example, 4213 is indecomposable, but 21534
is not. Thus if $g(x)$ is the generating function for indecomposable permutations, we have

$$\sum_{n=0}^{\infty} n!\, x^n = (1 - g(x))^{-1},$$

so

$$g(x) = 1 - \left( \sum_{n=0}^{\infty} n!\, x^n \right)^{-1},$$
as shown by Comtet (1972). 
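The relation between $g(x)$ and $\sum n!\,x^n$ is equivalent to the recurrence $n! = \sum_{m=1}^{n} g_m (n-m)!$ for $n \ge 1$, which gives a quick way to tabulate the coefficients. The sketch below (names are illustrative) compares that recurrence with a direct test of indecomposability.

```python
from itertools import permutations
from math import factorial

def is_indecomposable(perm):
    """True if no proper prefix perm[:i], 1 <= i < n, is a permutation of {1..i}."""
    running_max = 0
    for i, value in enumerate(perm[:-1], start=1):
        running_max = max(running_max, value)
        if running_max == i:
            return False
    return True

def indecomposable_counts(max_n):
    """Coefficients g_0..g_max_n of g(x), using n! = sum_{m=1}^{n} g_m (n-m)!."""
    g = [0] * (max_n + 1)
    for n in range(1, max_n + 1):
        g[n] = factorial(n) - sum(g[m] * factorial(n - m) for m in range(1, n))
    return g

def brute_force_count(n):
    return sum(is_indecomposable(p) for p in permutations(range(1, n + 1)))
```

A prefix of length i is a permutation of {1..i} exactly when its maximum equals i, which is the test used in `is_indecomposable`.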


5. Circular words 


We now study some properties of words in which the letters are thought of as 
arranged in a circle, so that the last letter is considered to be followed by the first. 
(This should not be confused with the problem of counting equivalence classes of 
words under cyclic permutation, which we discuss in section 14.) 

We define the cyclic shift operator C on words by 


$$C a_1 a_2 \cdots a_n = a_2 a_3 \cdots a_n a_1.$$

A conjugate or cyclic permutation of a word $w$ is a word of the form $C^m w$ for some
$m$. If $S$ is a set of words, then we define $S^\circ$ to be the set of all conjugates of words
in S. 
Suppose that $S^*$ is a free submonoid of the free monoid $A^*$, and let $w = s_1 s_2 \cdots s_k$
be an element of $S^*$, where each $s_i$ is in $S$. It is clear that $C^i w \in S^*$ whenever $i$
takes on any of the $k$ values $0$, $l(s_1)$, $l(s_1 s_2)$, ..., $l(s_1 s_2 \cdots s_{k-1})$, where $l(v)$ denotes
the length of the word $v$. If these are the only values of $i$, with $0 \le i < l(w)$, for
which $C^i w \in S^*$, then we call $S^*$ cyclically free.¹ For example, $\{ab, b\}^*$ is cyclically
free, but $\{aa\}^*$ is not.

If $S^*$ is cyclically free, then it is clear that for $w \in (S^k)^\circ$ there are exactly $k$ values
of $i$, with $0 \le i < l(w)$, for which $C^i w \in S^*$.


Theorem 5.1. Suppose that $S^*$ is cyclically free and let $Q = S^k \cap A^n$. Then
$k\,|Q^\circ| = n\,|Q|$.


Proof. We count pairs $(i, w)$, where $C^i w \in Q$ and $0 \le i < n$. First we may choose
$C^i w$ in $|Q|$ ways. Then $w$ is determined by $i$, which may be chosen arbitrarily in
$\{0, 1, \ldots, n-1\}$. Thus there are $n|Q|$ pairs. On the other hand, we may choose $w$
first as an arbitrary element of $Q^\circ$ and by the remark above, there are $k$ choices
for $i$. □


In the next section we shall use a weighted version of Theorem 5.1 which is 
proved exactly the same way. 


From Theorem 5.1 we can easily derive a generating function for $(S^*)^\circ$:


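Corollary 5.2. Suppose that $S^*$ is cyclically free and let $g(z) = \sum_{w \in S} z^{l(w)}$. Then

$$\sum_{n=1}^{\infty} |(S^*)^\circ \cap A^n| \frac{z^n}{n} = \log \frac{1}{1 - g(z)}.$$

Equivalently,

$$\sum_{n=1}^{\infty} |(S^*)^\circ \cap A^n|\, z^n = \frac{z g'(z)}{1 - g(z)}.$$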

We can use Theorem 5.1 to count the number of k-subsets of [n] with no two 
consecutive elements, where 1 and n are considered consecutive. We take S = 
$\{ab, b\}$. The subsets we want correspond to words in $(S^{n-k} \cap \{a, b\}^n)^\circ$. These words
contain $n$ letters, of which $n - k$ are $b$’s, and hence $k$ are $a$’s. The positions of the
$a$’s in one of these words determines the subset. $S^*$ is clearly cyclically free, so by
Theorem 5.1, the number of such subsets is $(n/(n-k))\binom{n-k}{k}$.
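The formula for circular subsets, and Theorem 5.1 itself in the case $S = \{ab, b\}$, can be checked by enumeration. This is a brute-force sketch whose function names are not from the text; words are Python strings over the letters a and b.

```python
from itertools import combinations, product
from math import comb

def circular_sparse_subsets(n, k):
    """k-subsets of {1..n} with no two cyclically consecutive elements."""
    def ok(subset):
        chosen = set(subset)
        return all((i % n) + 1 not in chosen for i in subset)
    return [s for s in combinations(range(1, n + 1), k) if ok(s)]

def words_in_power(S, k, n):
    """The set Q of words of length n that are products of k words of S."""
    return {"".join(f) for f in product(S, repeat=k) if sum(map(len, f)) == n}

def conjugate_closure(words):
    """All cyclic shifts of the given words."""
    return {w[i:] + w[:i] for w in words for i in range(len(w))}

def theorem_5_1_holds(S, k, n):
    Q = words_in_power(S, k, n)
    return k * len(conjugate_closure(Q)) == n * len(Q)
```

The identity fails for $S = \{aa\}$ with $k = 1$ and $n = 2$, in line with the remark that $\{aa\}^*$ is not cyclically free.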

Our next example will be useful in proving the Lagrange inversion formula in 
the next section. Let $\phi$ be any function from $A$ to the real numbers. Extend $\phi$ to
all of $A^*$ by defining $\phi(a_1 a_2 \cdots a_n) = \phi(a_1) + \cdots + \phi(a_n)$. Define $R$ by

$$R = \{w \mid \text{if } w = uv \text{ with } u \ne 1 \text{ then } \phi(u) < 0\}. \qquad (5.3)$$


¹ In the theory of codes, $S$ is called a circular code and $S^*$ is called a very pure free monoid. See, e.g.,
Berstel and Perrin (1983).
It is easily verified that $R$ is a cyclically free submonoid of $A^*$.
The following description of $R^\circ$ is the key step in our proof of the Lagrange
inversion formula in the next section.


Lemma 5.4. Let $R$ and $\phi$ be as above. Then $R^\circ = \{1\} \cup \{w \mid \phi(w) < 0\}$.


Proof. We need only show that if $\phi(w) < 0$, then for some $i$, $C^i w \in R$. Of the
heads (initial segments) $h$ of $w$ which maximize $\phi(h)$, let $u$ be the longest, and let
$w = uv$. Then $vu$ is easily verified to be in $R$. □
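The rotation used in this proof is simple to test. In the sketch below a word is a tuple of letters, `phi` is a dictionary of letter weights chosen only for illustration, and the empty head (of weight 0) is included among the candidate heads.

```python
from itertools import product

def in_R(word, phi):
    """True if every nonempty head of word has negative total weight."""
    total = 0
    for letter in word:
        total += phi[letter]
        if total >= 0:
            return False
    return True

def rotate_into_R(word, phi):
    """Write word = uv with u the longest head of maximal weight, and return vu."""
    best, best_len, total = 0, 0, 0      # start from the empty head
    for length, letter in enumerate(word, start=1):
        total += phi[letter]
        if total >= best:                # ties go to the longer head
            best, best_len = total, length
    return word[best_len:] + word[:best_len]

def has_conjugate_in_R(word, phi):
    return any(in_R(word[i:] + word[:i], phi) for i in range(len(word)))
```

With weights such as a = 2, b = -1, c = 0, every word of negative total weight is rotated into $R$, while no word of nonnegative weight has a conjugate in $R$.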


6. Lagrange inversion 


In the last example, let $A = \{x_{-1}, x_0, x_1, x_2, \ldots\}$ and define $\phi : A^* \to \mathbb{Z}$ by
$\phi(x_{i_1} \cdots x_{i_m}) = i_1 + \cdots + i_m$.


Let $R$ be as in (5.3) and let $S = \{w \mid w \in R \text{ and } \phi(w) = -1\}$. We claim that
$R = S^*$. Since we know that $R$ is a free monoid, we need only show that if $w$ is a
prime of $R$, then $\phi(w) = -1$.

To see this, let $w$ be a prime of $R$. Since $\phi(w) < 0$ and $\phi(x_i) \ge -1$ for each
$x_i \in A$, $w$ must have a head $h$ with $\phi(h) = -1$. Let $w = uv$, where $u$ is the longest
head of $w$ for which $\phi(u) = -1$. Then $v$ must be in $R$, since otherwise $w$ would
have a longer head $h$ with $\phi(h) \ge -1$. Since $w$ is a prime of $R$, this means $w = u$.
It follows that if $v$ is any word in $R$ with $\phi(v) = -k$ then $v \in S^k$.

Now let $v$ be any word in $S$ and suppose $v = u x_i$. Then $u$ is in $R$ with
$\phi(u) + i = -1$, so $\phi(u) = -1 - i$, and thus $u \in S^{i+1}$. It follows that

$$S = \bigcup_{i=-1}^{\infty} S^{i+1} x_i, \qquad (6.1)$$

where the union is disjoint. We are now ready to prove the Lagrange inversion
formula. We use the notation $[x^n] F(x)$ to denote the coefficient of $x^n$ in $F(x)$.


Theorem 6.2. Let $g(u) = \sum_{n=0}^{\infty} g_n u^n$, where the $g_n$ are indeterminates. Then there is
a unique formal power series $f$ in the $g_n$ satisfying $f = g(f)$, and for $k > 0$,

$$f^k = \sum_{n=1}^{\infty} \frac{k}{n} [u^{n-k}]\, g(u)^n. \qquad (6.3)$$


Proof. It is easily seen that the equation $f = g(f)$ has a unique solution. Let us
assign to the letter $x_i$ the weight $g_{i+1}$ and let $f$ be the image of $G(S)$ under this
assignment. Then from (6.1) we have

$$f = \sum_{i=-1}^{\infty} f^{i+1} g_{i+1} = g(f).$$

By the weighted version of Theorem 5.1, the sum of the weights of the words of
length $n$ in $S^k$ is $k/n$ times the sum of the weights of the words in $(S^k \cap A^n)^\circ$. But
by Lemma 5.4, the sum of the weights of the words in $(S^k \cap A^n)^\circ$ is

$$[u^{-k}] \left( \sum_{i=-1}^{\infty} g_{i+1} u^i \right)^n = [u^{n-k}]\, g(u)^n. \qquad \Box$$


The proof we have just given is essentially that of Raney (1960). It is clear that if 
the $g_i$ are assigned values that are not necessarily indeterminates, then the theorem
still holds as long as the sum in (6.3) converges as a formal power series and $f$ is
uniquely determined as a formal power series by $f = g(f)$. The usual formulation
of Lagrange inversion is obtained by taking $g(u) = z \sum_{n=0}^{\infty} r_n u^n$, where $z$ is an
indeterminate and the $r_n$ are arbitrary.

One of the most important applications of Lagrange inversion is to the enumer- 
ation of ordered trees. (An ordered tree is a rooted unlabeled tree in which the 
children of any node are linearly ordered.) Let us weight a node with i children 
in an ordered tree by g;, and weight the tree by the product of the weights of its 
nodes. If f is the sum of the weights of all ordered trees, then, since an ordered 
tree consists of a root together with some number (possibly zero) of children, each 
of which may be an arbitrary ordered tree, we have 


$$f = \sum_{i=0}^{\infty} g_i f^i = g(f),$$

where $g(u) = \sum_{i=0}^{\infty} g_i u^i$. The Lagrange inversion formula then yields the following.


Theorem 6.4. The number of $k$-tuples of ordered trees in which a total of $n_i$ nodes
have $i$ children is

$$\frac{k}{n} \binom{n}{n_0, n_1, n_2, \ldots}, \quad \text{where } n = \sum_i n_i,$$

if $n_1 + 2n_2 + 3n_3 + \cdots = n - k$, and 0 otherwise.
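Theorem 6.4 can be tested through the word model of this section: a $k$-tuple of ordered trees corresponds to a word in $S^k$, that is, a word all of whose nonempty heads have negative weight and whose total weight is $-k$, where a node with $i$ children contributes the letter $x_{i-1}$. The sketch below counts such words by brute force; the input format (a list giving the number of children of each node) and the function names are choices made for this example.

```python
from itertools import permutations
from math import factorial

def count_forest_words(child_counts, k):
    """Count words in S^k that use one letter of weight (children - 1) per node."""
    letters = [c - 1 for c in child_counts]
    if sum(letters) != -k:
        return 0
    good = 0
    for word in set(permutations(letters)):
        total = 0
        for step in word:
            total += step
            if total >= 0:
                break
        else:
            good += 1                      # every nonempty head was negative
    return good

def theorem_6_4(child_counts, k):
    """The value (k/n) * multinomial(n; n_0, n_1, ...) from the theorem, or 0."""
    n = len(child_counts)
    if sum(child_counts) != n - k:
        return 0
    multinomial = factorial(n)
    for i in set(child_counts):
        multinomial //= factorial(child_counts.count(i))
    return k * multinomial // n
```

For example, two nodes with 2 children and three leaves give the two binary-branching trees on five nodes.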


It is not hard to derive Theorem 6.2 from Theorem 6.4, so any other proof of
Theorem 6.4 (for example, by induction), yields a proof of the Lagrange inversion 
formula. Our approach can also be used to give a purely combinatorial proof of 
Theorem 6.4 without the use of generating functions. 

A few special cases of Theorems 6.2 and 6.4 are especially important. If there 
are a nodes with 2 children, b nodes with no children, and no other nodes, then 
with b =a+k the number of k-tuples of such trees is 


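$$\frac{k}{n} \binom{n}{a} = \frac{k}{2a+k} \binom{2a+k}{a}.$$

These numbers are called ballot numbers. The special case $k = 1$ gives the Catalan
numbers

$$\frac{1}{2a+1} \binom{2a+1}{a} = \frac{1}{a+1} \binom{2a}{a}.$$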
To apply Theorem 6.2 directly to this case, we may take $g_0 = 1$, $g_2 = x$, and $g_i = 0$
for $i \ne 0, 2$. Then $f$ satisfies $f = 1 + x f^2$, so $f = (1 - \sqrt{1 - 4x})/2x$, and we obtain

$$\left( \frac{1 - \sqrt{1 - 4x}}{2x} \right)^k = \sum_{a=0}^{\infty} \frac{k}{2a+k} \binom{2a+k}{a} x^a.$$
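This expansion can be verified with truncated power series. The sketch below solves $f = 1 + x f^2$ coefficient by coefficient (the equation forces $f_0 = 1$ and $f_a = \sum_{i=0}^{a-1} f_i f_{a-1-i}$), raises the result to the $k$-th power, and compares with the ballot numbers; the helper names are hypothetical.

```python
from math import comb

def catalan_series(terms):
    """First `terms` coefficients of the power series solution of f = 1 + x f^2."""
    f = [1]
    for a in range(1, terms):
        f.append(sum(f[i] * f[a - 1 - i] for i in range(a)))
    return f

def series_power(f, k):
    """Coefficients of f^k, truncated to the same number of terms as f."""
    terms = len(f)
    result = [1] + [0] * (terms - 1)
    for _ in range(k):
        result = [sum(result[i] * f[d - i] for i in range(d + 1)) for d in range(terms)]
    return result

def ballot_check(k, terms=8):
    power = series_power(catalan_series(terms), k)
    return all(power[a] * (2 * a + k) == k * comb(2 * a + k, a) for a in range(terms))
```

The comparison is done in integers by clearing the denominator $2a + k$.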


To count all ordered trees we set $g_i = x$ for all $i$, to obtain the equation
$f(x) = x/(1 - f(x))$, with the solution

$$f(x) = \frac{1 - \sqrt{1 - 4x}}{2},$$

$$f(x)^k = \left( \frac{1 - \sqrt{1 - 4x}}{2} \right)^k = \sum_{n=k}^{\infty} \frac{k}{n} \binom{2n-k-1}{n-k} x^n = \sum_{n=k}^{\infty} \frac{k}{2n-k} \binom{2n-k}{n} x^n,$$

so we again obtain the Catalan and ballot numbers. It is an instructive exercise
to find a bijection between these classes of trees, and to relate these results to
formula (4.1).


Our analysis gives a well-known bijection between ordered trees and words in 
$S$. The code $c(T)$ for a tree $T$ may be defined as follows: If the root of $T$ has no
children, then $c(T) = x_{-1}$. Otherwise, if the children of the root of $T$ are (in order)
the roots of trees $T_1$, $T_2$, ..., $T_j$, then

$$c(T) = c(T_1) \cdots c(T_j)\, x_{j-1}.$$


For another example, we define a binary tree to be a rooted tree in which every 
node has a left child, a right child, neither, or both. Thus 


[Figure: two different binary trees]


are different binary trees. Let us weight a binary tree with $n$ nodes, $i$ left children,
and $j$ right children by $x^n L^i R^j$. Then if $f$ is the generating function for these trees,
we have

$$f = x(1 + Lf)(1 + Rf),$$

and thus by Lagrange inversion we have

$$f^k = \sum_{n=1}^{\infty} \frac{k}{n} x^n \sum_{i+j=n-k} \binom{n}{i} \binom{n}{i+k} L^i R^j.$$

For $k = 1$, the numbers $(1/n)\binom{n}{i}\binom{n}{i+1}$ are called Runyon numbers or Narayana
numbers.
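The case $k = 1$ can be confirmed by counting binary trees directly. A nonempty binary tree is a root with an optional left subtree and an optional right subtree, so the sketch below builds a table indexed by the number of nodes and the number of left children; the table layout and function name are assumptions of this example.

```python
from math import comb

def binary_trees_by_left_children(max_n):
    """counts[n][i] = number of binary trees with n nodes, i of which are left children."""
    counts = {0: {0: 1}}   # size 0 stands for a missing child
    for n in range(1, max_n + 1):
        current = {}
        for a in range(n):                 # a nodes in the left subtree
            b = n - 1 - a                  # b nodes in the right subtree
            for i, ways_left in counts[a].items():
                for j, ways_right in counts[b].items():
                    left_children = i + j + (1 if a > 0 else 0)
                    current[left_children] = current.get(left_children, 0) + ways_left * ways_right
        counts[n] = current
    return counts

tree_counts = binary_trees_by_left_children(8)
```

For three nodes the table reads 1, 3, 1, the third row of the Narayana triangle.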


7. The transfer matrix method


Many enumeration problems can be transformed into problems of counting walks 
in digraphs, which can be solved by the transfer matrix method. Suppose D is a 
finite digraph. To every arc of $D$ we associate a weight. Let $M$ be the matrix in
which rows and columns are indexed by the nodes of $D$ and the $(i,j)$ entry of $M$
is the sum of the weights of the arcs from $i$ to $j$. Then by the definition of matrix
multiplication, the $(i,j)$ entry in $M^k$ is the sum of the weights of all walks of $k$ arcs
from $i$ to $j$. It follows that (as long as the infinite sum exists) $\sum_{k=0}^{\infty} M^k = (I - M)^{-1}$
counts all walks, where $I$ is the identity matrix, and trace $(I - M)^{-1}$ counts walks
that end where they begin.

For example, consider the following problem: Given integers $n$ and $i$, what is
the number $t(n,i)$ of sequences $a_1 a_2 \cdots a_n$ of 0’s, 1’s, and $-1$’s with $a_1 + \cdots + a_n \equiv i$
(mod 6)? Here we take $D$ to be the digraph with node set $\{0,1,2,3,4,5\}$ and an
arc from each $j$ to $j-1$, $j$, and $j+1$, reduced modulo 6. We weight each arc by $x$.
So $M$ is

$$M = \begin{pmatrix}
x & x & 0 & 0 & 0 & x \\
x & x & x & 0 & 0 & 0 \\
0 & x & x & x & 0 & 0 \\
0 & 0 & x & x & x & 0 \\
0 & 0 & 0 & x & x & x \\
x & 0 & 0 & 0 & x & x
\end{pmatrix}.$$


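We find that $(I - M)^{-1}$ is the circulant matrix with first column

$$\frac{1}{6} \begin{pmatrix}
2 + \frac{1}{1-3x} + \frac{2}{1-2x} + \frac{1}{1+x} \\
-1 + \frac{1}{1-3x} + \frac{1}{1-2x} - \frac{1}{1+x} \\
-1 + \frac{1}{1-3x} - \frac{1}{1-2x} + \frac{1}{1+x} \\
2 + \frac{1}{1-3x} - \frac{2}{1-2x} - \frac{1}{1+x} \\
-1 + \frac{1}{1-3x} - \frac{1}{1-2x} + \frac{1}{1+x} \\
-1 + \frac{1}{1-3x} + \frac{1}{1-2x} - \frac{1}{1+x}
\end{pmatrix}.$$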


Thus for $n > 0$,

$$\begin{aligned}
t(n,0) &= (3^n + 2^{n+1} + (-1)^n)/6, \\
t(n,1) = t(n,5) &= (3^n + 2^n - (-1)^n)/6, \\
t(n,2) = t(n,4) &= (3^n - 2^n + (-1)^n)/6, \\
t(n,3) &= (3^n - 2^{n+1} - (-1)^n)/6.
\end{aligned}$$
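The transfer matrix computation for this example is short enough to reproduce with plain integer matrices. In the sketch below every arc weight $x$ is replaced by 1, so that row 0 of $M^n$ lists $t(n,0), \ldots, t(n,5)$; the result is compared with direct enumeration and with the closed formulas. The function names are illustrative.

```python
from itertools import product

def mat_mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def transfer_counts(n, modulus=6):
    """Row 0 of M^n, where M has an arc from each j to j-1, j and j+1 modulo `modulus`."""
    M = [[1 if (i - j) % modulus in (0, 1, modulus - 1) else 0 for j in range(modulus)]
         for i in range(modulus)]
    power = [[1 if i == j else 0 for j in range(modulus)] for i in range(modulus)]
    for _ in range(n):
        power = mat_mul(power, M)
    return power[0]

def brute_force_counts(n, modulus=6):
    counts = [0] * modulus
    for seq in product((-1, 0, 1), repeat=n):
        counts[sum(seq) % modulus] += 1
    return counts

def closed_form(n):
    """The formulas for t(n, 0), ..., t(n, 5), valid for n > 0."""
    s = (-1) ** n
    t0 = (3 ** n + 2 ** (n + 1) + s) // 6
    t1 = (3 ** n + 2 ** n - s) // 6
    t2 = (3 ** n - 2 ** n + s) // 6
    t3 = (3 ** n - 2 ** (n + 1) - s) // 6
    return [t0, t1, t2, t3, t2, t1]
```

The restriction to $n > 0$ matters: for $n = 0$ the matrix power is the identity, which the closed formulas do not reproduce.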
As another example, how many 0-1 sequences are there with specified numbers
of occurrences of 00, 01, 10, and 11? Here we take $D$ to be the weighted digraph

[Figure: digraph on the nodes 0 and 1 with arcs weighted $x_{00}$, $x_{01}$, $x_{10}$, and $x_{11}$]


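Then

$$(I - M)^{-1} = \begin{pmatrix} 1 - x_{00} & -x_{01} \\ -x_{10} & 1 - x_{11} \end{pmatrix}^{-1}
= \frac{1}{(1 - x_{00})(1 - x_{11}) - x_{01} x_{10}} \begin{pmatrix} 1 - x_{11} & x_{01} \\ x_{10} & 1 - x_{00} \end{pmatrix}.$$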


Thus, for example, the generating function for 0-1 sequences beginning with 0 and 
ending with 1 is 


Xo1 = s x1 X10 
(1 — x00)(1 = x41) ~xorK10 Se (1 ~ x00) = ae) 
ati J ok LEtt\ fitk 
= So xti Hott ( F k } 
7 J 
ijk 


sO or) is the number of 0-1 sequences beginning with 0 and ending with 1, 
with / occurrences of 10 (and thus i+1 occurrences of 01), j occurrences of 00, 
and k occurrences of 11 (and thus i+ j +1 zeros and i +k +1 ones). 
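
The count $\binom{i+j}{j}\binom{i+k}{k}$ can be confirmed by enumeration. The sketch below is illustrative only (the helper names are not from the text); it tallies all 0-1 sequences of a given length that begin with 0 and end with 1 by their numbers of occurrences of 10, 00, and 11.

```python
from collections import Counter
from itertools import product
from math import comb

def count_by_pairs(length):
    """Tally sequences starting with 0 and ending with 1 by (#10, #00, #11)."""
    tally = Counter()
    for seq in product((0, 1), repeat=length):
        if seq[0] != 0 or seq[-1] != 1:
            continue
        pairs = Counter(zip(seq, seq[1:]))  # adjacent pairs (a, b)
        tally[(pairs[(1, 0)], pairs[(0, 0)], pairs[(1, 1)])] += 1
    return tally

def predicted(i, j, k):
    """The coefficient read off from the generating function in the text."""
    return comb(i + j, j) * comb(i + k, k)
```

A sequence with parameters $(i,j,k)$ has length $2i+j+k+2$, so for a fixed length the tally can be compared with `predicted` over all admissible triples.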

The transfer matrix method can often be used to show that a generating function
is rational. For example, consider the problem of counting the number of ways of
covering an $m \times n$ rectangle with a fixed finite set of polyominos. It is not hard to
show, using the transfer matrix method, that for fixed $m$ the generating function on
$n$ is rational, although it is difficult to give an explicit formula. We will see another
example of this type in section 10.


8. Multisets and partitions 


We have so far considered problems involving linear arrangements. In this and the
next section we turn to unordered collections. We first consider the problem of
counting multisets, which are sets with repeated elements allowed. More formally,
a multiset on a set $S$ is a function from $S$ to the nonnegative integers; if $\nu$ is a
multiset then $\nu(s)$ represents the multiplicity of $s$. If each element $s$ in $S$ has a
weight $w(s)$, then we define the weight of the multiset $\nu$ to be
$\prod_{s\in S} w(s)^{\nu(s)}$.

For each $s$ in $S$, let $M_s$ be a set of nonnegative integers. Then the sum of the
weights of all multisets $\nu$ on $S$ such that $\nu(s)$ is in $M_s$ for each $s$ in $S$ is
easily seen to be

$$\prod_{s\in S}\,\sum_{i\in M_s} w(s)^{i}.$$


We give a few examples. Let us take $w(s) = x$ for all $s$ in $S$, and assume $|S| = n$.
If $M_s = \{0,1\}$ for each $s$, we are counting subsets, and the generating function is

$$(1+x)^n = \sum_{k=0}^{n}\binom{n}{k}x^k.$$

If $M_s$ is the set of all nonnegative integers for each $s$, we are counting unrestricted
multisets, and the generating function is

$$(1+x+x^2+\cdots)^n = (1-x)^{-n} = \sum_{k=0}^{\infty}\binom{n+k-1}{k}x^k.$$

If $M_s = \{0,1,\ldots,m\}$ for each $s$, the generating function is

$$(1+x+\cdots+x^m)^n = \left(\frac{1-x^{m+1}}{1-x}\right)^{n}
= \sum_{k} x^k \sum_{i} (-1)^i\binom{n}{i}\binom{n+k-(m+1)i-1}{k-(m+1)i}.$$
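
As a sanity check on the last expansion, the inner sum can be compared with a direct count of multisets with bounded multiplicities. This is a small sketch with illustrative function names, assuming $n \ge 1$.

```python
from itertools import product
from math import comb

def bounded_multisets_bruteforce(n, m, k):
    """Number of multisets of size k on an n-set with every multiplicity at most m."""
    return sum(1 for mult in product(range(m + 1), repeat=n) if sum(mult) == k)

def bounded_multisets_formula(n, m, k):
    """Coefficient of x^k in ((1 - x^(m+1)) / (1 - x))^n, for n >= 1."""
    total = 0
    for i in range(k // (m + 1) + 1):
        rest = k - (m + 1) * i  # exponent left over for the factor (1 - x)^(-n)
        total += (-1) ** i * comb(n, i) * comb(n + rest - 1, rest)
    return total
```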

A multiset of positive integers with sum $k$ is called a partition of $k$. The elements
of a partition are called its parts. It is customary to list the parts of a partition
in decreasing order, so a partition of $k$ is often defined as a (weakly) decreasing
sequence of positive integers with sum $k$. To count partitions, we weight $i$ by
$q^i$, where $q$ is an indeterminate. Then the generating function for all partitions
is $\prod_{i=1}^{\infty}(1-q^i)^{-1}$ and the generating function for partitions with distinct
parts is $\prod_{i=1}^{\infty}(1+q^i)$.

Many theorems in the theory of partitions assert that one set of partitions is
equinumerous with another. The simplest of these, due to Euler, is that the number
of partitions of $n$ with odd parts is equal to the number of partitions of $n$ with
distinct parts. To prove this, we note that the generating function for partitions
with odd parts is

$$\prod_{i\ \mathrm{odd}} (1-q^i)^{-1} = \prod_{i=1}^{\infty}(1-q^i)^{-1}\prod_{j=1}^{\infty}(1-q^{2j})
= \prod_{i=1}^{\infty}\frac{1-q^{2i}}{1-q^{i}} = \prod_{i=1}^{\infty}(1+q^i),$$

which is the generating function for partitions with distinct parts.

It is not difficult to give a combinatorial proof of this result: Suppose $\pi$ is
a partition with odd parts. If $\pi$ contains the odd part $i$ with multiplicity $k$, let
$k = 2^{e_1} + 2^{e_2} + \cdots + 2^{e_s}$, where $0 \le e_1 < e_2 < \cdots < e_s$. We now replace
the $k$ copies of part $i$ by the distinct parts $2^{e_1}i, 2^{e_2}i, \ldots, 2^{e_s}i$. Doing this to
every part of $\pi$ we obtain a partition $\pi'$ with distinct parts. The correspondence is
easily seen to be a bijection. For example, if $\pi = \{9,9,7,7,7,1,1,1,1\}$, then
$\pi' = \{18,14,7,4\}$.
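
The correspondence just described is easy to carry out mechanically. The sketch below uses illustrative function names and represents partitions as lists of parts; it implements the map from odd parts to distinct parts and its inverse.

```python
from collections import Counter

def odd_to_distinct(parts):
    """Replace k copies of an odd part i by the parts 2^e * i, e running over the binary digits of k."""
    result = []
    for part, mult in Counter(parts).items():
        e = 0
        while mult:
            if mult & 1:
                result.append((2 ** e) * part)
            mult >>= 1
            e += 1
    return sorted(result, reverse=True)

def distinct_to_odd(parts):
    """Inverse map: split each part 2^e * i (i odd) into 2^e copies of i."""
    result = []
    for part in parts:
        copies = 1
        while part % 2 == 0:
            part //= 2
            copies *= 2
        result.extend([part] * copies)
    return sorted(result, reverse=True)
```

Applied to the example in the text, `odd_to_distinct([9, 9, 7, 7, 7, 1, 1, 1, 1])` gives `[18, 14, 7, 4]`, and `distinct_to_odd` recovers the original partition.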

One of the most famous results in the theory of partitions is the following. 


Theorem 8.1. The number of partitions of $n$ with distinct parts in which any two
parts differ by at least 2 is equal to the number of partitions of $n$ with parts congruent
to 1 or 4 (mod 5).

This result follows easily from the Rogers-Ramanujan identity

$$\sum_{j=0}^{\infty}\frac{q^{j^2}}{(1-q)(1-q^2)\cdots(1-q^j)}
= \prod_{i=0}^{\infty}\frac{1}{(1-q^{5i+1})(1-q^{5i+4})}.$$


No simple bijective proof of Theorem 8.1 is known. A complicated bijective proof 
was found by Garsia and Milne (1981). 

We now prove an identity called the $q$-binomial theorem, which has many
applications to partitions. We introduce the notation $(a)_n$ for
$(1-a)(1-aq)\cdots(1-aq^{n-1})$, where $q$ is understood. In particular,
$(q)_n = (1-q)(1-q^2)\cdots(1-q^n)$. We also write $(a)_\infty$ for
$\prod_{i=0}^{\infty}(1-aq^i)$.


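Theorem 8.2 (The $q$-binomial theorem).

$$\sum_{n=0}^{\infty}\frac{(a)_n}{(q)_n}\,t^n = \frac{(at)_\infty}{(t)_\infty}.$$

Proof. Let

$$\frac{(at)_\infty}{(t)_\infty} = \sum_{n=0}^{\infty} f_n t^n.$$

Then

$$\frac{(at)_\infty}{(tq)_\infty} = (1-t)\,\frac{(at)_\infty}{(t)_\infty} = (1-t)\sum_{n=0}^{\infty} f_n t^n.$$

But also,

$$\frac{(at)_\infty}{(tq)_\infty} = (1-at)\,\frac{(atq)_\infty}{(tq)_\infty} = (1-at)\sum_{n=0}^{\infty} f_n q^n t^n.$$

Equating coefficients of $t^n$, we have

$$f_n - f_{n-1} = q^n f_n - aq^{n-1} f_{n-1},$$

and thus $f_n(1-q^n) = f_{n-1}(1-aq^{n-1})$. Since $f_0 = 1$, this gives

$$f_n = \prod_{i=1}^{n}\frac{1-aq^{i-1}}{1-q^{i}} = \frac{(a)_n}{(q)_n}. \qquad\Box$$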

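Two cases are particularly worth noting. If $a = q^m$, where $m$ is a positive integer,
then we have

$$\sum_{n=0}^{\infty}\frac{(q^m)_n}{(q)_n}\,t^n = \frac{1}{(1-t)(1-tq)\cdots(1-tq^{m-1})}. \tag{8.3}$$

The $q$-binomial coefficient is defined to be

$$\begin{bmatrix} n \\ k \end{bmatrix} = \frac{(q)_n}{(q)_k\,(q)_{n-k}}.$$

Since $(q^m)_n = (q)_{m+n-1}/(q)_{m-1}$ we may rewrite (8.3) as

$$\sum_{n=0}^{\infty}\begin{bmatrix} m+n-1 \\ n \end{bmatrix} t^n = \frac{1}{(1-t)(1-tq)\cdots(1-tq^{m-1})}. \tag{8.4}$$

It follows from (8.4) that $\begin{bmatrix} n \\ k \end{bmatrix}$ is a polynomial in $q$ that reduces to the
binomial coefficient $\binom{n}{k}$ for $q = 1$.
We can use (8.4) to count partitions with at most $n$ parts, each part at most $m$.
It is clear that the desired generating function is the coefficient of $t^n$ in

$$\frac{1}{(1-t)(1-tq)\cdots(1-tq^{m})} = \frac{1}{(t)_{m+1}}$$

and by (8.4) this is $\begin{bmatrix} m+n \\ n \end{bmatrix}$.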
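
The interpretation of the $q$-binomial coefficient as a generating function for partitions in a box can be tested on small cases. The sketch below is an illustration rather than the text's own method: it builds the coefficient list with the $q$-Pascal recurrence $\begin{bmatrix} n \\ k \end{bmatrix} = \begin{bmatrix} n-1 \\ k-1 \end{bmatrix} + q^k\begin{bmatrix} n-1 \\ k \end{bmatrix}$, a standard identity that is assumed here and not derived above, and compares it with a direct enumeration.

```python
from functools import lru_cache
from itertools import combinations_with_replacement

@lru_cache(maxsize=None)
def q_binomial(n, k):
    """Coefficients (lowest degree first) of the q-binomial coefficient [n choose k]."""
    if k < 0 or k > n:
        return (0,)
    if k == 0 or k == n:
        return (1,)
    a = q_binomial(n - 1, k - 1)
    b = q_binomial(n - 1, k)  # this term is multiplied by q^k
    out = [0] * max(len(a), len(b) + k)
    for d, c in enumerate(a):
        out[d] += c
    for d, c in enumerate(b):
        out[d + k] += c
    return tuple(out)

def box_partition_counts(n, m):
    """Entry s counts partitions of s with at most n parts, each part at most m."""
    counts = [0] * (n * m + 1)
    # n parts taken from {0, ..., m}; zeros stand for absent parts
    for parts in combinations_with_replacement(range(m + 1), n):
        counts[sum(parts)] += 1
    return tuple(counts)
```

For instance, `q_binomial(4, 2)` is `(1, 1, 2, 1, 1)`, that is, $1 + q + 2q^2 + q^3 + q^4$, which matches `box_partition_counts(2, 2)`.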


The case $a = q^{-m}$ of the $q$-binomial theorem yields similarly (after changing $q$
to $q^{-1}$ and $t$ to $-t/q$)

$$\sum_{n=0}^{m} t^n q^{\binom{n}{2}}\begin{bmatrix} m \\ n \end{bmatrix} = (1+t)(1+tq)\cdots(1+tq^{m-1}), \tag{8.5}$$

which implies that the generating function for partitions with $n$ distinct parts, all
less than $m$, where 0 is allowed as a part, is $q^{\binom{n}{2}}\begin{bmatrix} m \\ n \end{bmatrix}$. This result may
be derived directly from our previous generating function for partitions with repeated
parts allowed, since every partition with distinct parts is obtained uniquely from an
unrestricted partition by adding 0 to the smallest part, 1 to the next smallest, and
so on.

There is an important interpretation for $q$-binomial coefficients in terms of vector
spaces over finite fields. (See, e.g., Stanley 1986, p. 28, for the proof.)


Theorem 8.6. Let $q$ be a prime power. Then the number of $k$-dimensional subspaces
of an $n$-dimensional vector space over a field with $q$ elements is
$\begin{bmatrix} n \\ k \end{bmatrix}$.


A comprehensive reference on the theory of partitions is Andrews (1976).

9. Exponential generating functions


If $a_0, a_1, \ldots$ is a sequence of numbers, the power series

$$\sum_{n=0}^{\infty} a_n\frac{x^n}{n!}$$

is called the exponential generating function for the sequence. Exponential
generating functions arise in counting "labeled objects". Their usefulness comes from
the fact that

$$\frac{x^m}{m!}\cdot\frac{x^n}{n!} = \binom{m+n}{m}\frac{x^{m+n}}{(m+n)!}.$$

If $A$ is an object with label set $[m]$ and $B$ is an object with label set $[n]$, we can
combine them in $\binom{m+n}{m}$ ways to get an object $(A',B')$ with label set $[m+n]$: We
first choose an $m$-element subset $S$ of $[m+n]$ and replace the labels of $A$ with the
elements of $S$ (preserving their order) to get $A'$, and in the same way we get $B'$
from $B$ and $[m+n]\setminus S$.

Thus if $f(x)$ and $g(x)$ are exponential generating functions for classes of labeled
objects, then their product $f(x)g(x)$ will be the exponential generating function for
ordered pairs of these objects. For example, the exponential generating function
for nonempty sets is

$$e^x - 1 = \sum_{n=1}^{\infty}\frac{x^n}{n!},$$

since the elements of $[n]$ can be arranged as a nonempty set in one way if $n > 0$
and in zero ways if $n = 0$. Thus

$$(e^x - 1)^2 = \sum_{n=2}^{\infty}(2^n - 2)\frac{x^n}{n!}$$

is the exponential generating function for ordered partitions of a set into two
nonempty blocks. More generally, $(e^x - 1)^k$ is the exponential generating function
for ordered partitions of a set into $k$ nonempty blocks, and

$$\sum_{k=0}^{\infty}(e^x - 1)^k = \frac{1}{2 - e^x}$$

is the exponential generating function for all ordered partitions of a set.

Now suppose that $f(x)$ is the exponential generating function for a class of
labeled objects and that $f(0) = 0$. As we have seen, $f(x)^k$ is the exponential
generating function for $k$-tuples of these objects. Every $k$-set can be arranged into a
$k$-tuple in $k!$ ways, so $f(x)^k/k!$ is the exponential generating function for $k$-sets of
these objects.

Thus, for example, $(e^x - 1)^k/k!$ is the exponential generating function for
partitions of a set into $k$ blocks. The numbers $S(n,k)$ defined by

$$\frac{(e^x - 1)^k}{k!} = \sum_{n=0}^{\infty} S(n,k)\frac{x^n}{n!} \tag{9.1}$$

are called Stirling numbers of the second kind. If we sum on $k$ we obtain the
exponential generating function $\exp(e^x - 1)$ for all partitions of a set. The coefficients
$B_n$ defined by

$$\exp(e^x - 1) = \sum_{n=0}^{\infty} B_n\frac{x^n}{n!}$$

are called Bell numbers.
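
To make (9.1) concrete, one can expand $(e^x - 1)^k$ by the binomial theorem and read off the coefficient of $x^n/n!$, which gives $S(n,k) = \frac{1}{k!}\sum_{j}(-1)^{k-j}\binom{k}{j}j^n$. The sketch below, with illustrative names, compares this with a direct enumeration of set partitions.

```python
from math import comb, factorial

def stirling2(n, k):
    """S(n, k) read off from (9.1) after expanding (e^x - 1)^k binomially."""
    total = sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))
    return total // factorial(k)

def set_partitions(elements):
    """Yield every partition of the list `elements` into nonempty blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for idx in range(len(partition)):
            yield partition[:idx] + [[first] + partition[idx]] + partition[idx + 1:]
        # ... or into a new block of its own
        yield [[first]] + partition
```

Summing `stirling2(n, k)` over $k$ gives the Bell number $B_n$.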

In general $e^{f(x)}$ counts sets of labeled objects each counted by $f(x)$. Another
important application of this principle (often called the "exponential formula")
is to the enumeration of permutations by cycle structure. A permutation may be
considered as a set of cycles. If we weight a cycle of length $i$ by $u_i$ and weight
a permutation by the product of the weights of its cycles, then the exponential
generating function for cycles is

$$\sum_{n=1}^{\infty}(n-1)!\,u_n\frac{x^n}{n!} = \sum_{n=1}^{\infty} u_n\frac{x^n}{n}$$

and thus the exponential generating function for permutations by cycle structure
is

$$\exp\Bigl(\sum_{n=1}^{\infty} u_n x^n/n\Bigr).$$

If we set $u_n = u$ for all $n$, then we are counting permutations by the number of
cycles, and we obtain the generating function for the (unsigned) Stirling numbers
of the first kind,

$$(1-x)^{-u} = \sum_{n=0}^{\infty} u(u+1)\cdots(u+n-1)\frac{x^n}{n!}
= \sum_{n=0}^{\infty}\frac{x^n}{n!}\sum_{k=0}^{n} c(n,k)u^k,$$

which we derived in a different way in section 3.

In some cases, there is a simpler expression for $e^{f(x)}$ than for $f(x)$. For example,
any labeled graph is a set of connected labeled graphs. Thus if $g(x)$ is the
exponential generating function for connected labeled graphs, then $e^{g(x)}$ is the
exponential generating function for all labeled graphs. But there are $2^{\binom{n}{2}}$ labeled
graphs on $[n]$, so

$$g(x) = \log\Bigl(\sum_{n=0}^{\infty} 2^{\binom{n}{2}}\frac{x^n}{n!}\Bigr).$$


1042 LM. Gessel and R.P. Stanley 


Exponential generating functions often satisfy simple differential equations
which can be explained combinatorially. If

$$f(x) = \sum_{n=0}^{\infty} f_n\frac{x^n}{n!},$$

then

$$f'(x) = \sum_{n=0}^{\infty} f_{n+1}\frac{x^n}{n!},$$

so an object counted by $f'(x)$ with label set $[n]$ is the same as an object counted
by $f(x)$ with label set $[n+1]$. For example, let

$$f(x) = \sum_{n=0}^{\infty} n!\,\frac{x^n}{n!} = \frac{1}{1-x}$$

be the exponential generating function for permutations (considered as linear
arrangements of numbers). Then $f'(x)$ counts permutations of $[n+1]$ in which only
the numbers in $[n]$ are considered to be labels. We can consider $n+1$ to be a
"marker" that separates the original permutation into a pair of permutations on
$[n]$, and we obtain the differential equation $f'(x) = f(x)^2$. This decomposition can
be used to obtain more information about permutations, as we shall see next.

A descent of the permutation $a_1 a_2 \cdots a_n$ is an $i$ for which $a_i > a_{i+1}$. It is
convenient to count $n$ as a descent also, if $n > 0$. Let

$$A(x) = \sum_{n=0}^{\infty} A_n(t)\frac{x^n}{n!}$$

be the exponential generating function for permutations by descents, where a
permutation with $k$ descents is weighted $t^k$. If we take a permutation
$\pi = a_1 a_2 \cdots a_{n+1}$ on $[n+1]$ and remove the element $n+1$, we are left with two
permutations, $\pi_1 = a_1 a_2 \cdots a_{j-1}$ and $\pi_2 = a_{j+1} \cdots a_{n+1}$, where $a_j = n+1$.
The number of descents of $\pi$ is the sum of the number of descents of $\pi_1$ and $\pi_2$,
unless $\pi_1$ is empty, when $\pi$ has an additional descent. Thus we obtain the
differential equation

$$A'(x) = (A(x) - 1)A(x) + tA(x),$$

together with the initial condition $A(0) = 1$. The differential equation is easily
solved by separation of variables, yielding

$$A(x) = \frac{1-t}{1 - te^{(1-t)x}}.$$

The polynomials $A_n(t)$ are called Eulerian polynomials and their coefficients are
called Eulerian numbers.
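
Comparing coefficients of $x^n/n!$ in the differential equation gives the recurrence $A_{n+1}(t) = \sum_{k=0}^{n}\binom{n}{k}A_k(t)A_{n-k}(t) + (t-1)A_n(t)$ with $A_0(t) = 1$. The following sketch (illustrative names, polynomials stored as coefficient lists with the constant term first) computes the Eulerian polynomials this way and checks them against a direct count that uses the convention of counting position $n$ as a descent.

```python
from itertools import permutations
from math import comb

def poly_add(p, q):
    size = max(len(p), len(q))
    return [(p[d] if d < len(p) else 0) + (q[d] if d < len(q) else 0) for d in range(size)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def eulerian_polys(n_max):
    """A_0, ..., A_{n_max} from A' = (A - 1)A + tA, i.e. A' = A^2 + (t - 1)A."""
    polys = [[1]]
    for n in range(n_max):
        nxt = poly_mul([-1, 1], polys[n])  # (t - 1) A_n(t)
        for k in range(n + 1):
            term = poly_mul(polys[k], polys[n - k])
            nxt = poly_add(nxt, [comb(n, k) * c for c in term])
        polys.append(nxt)
    return polys

def eulerian_bruteforce(n):
    """Count permutations of [n] by descents, with position n counted as a descent when n > 0."""
    coeffs = [0] * (n + 1)
    for perm in permutations(range(1, n + 1)):
        des = sum(1 for a, b in zip(perm, perm[1:]) if a > b)
        coeffs[des + (1 if n > 0 else 0)] += 1
    return coeffs
```

With this convention $A_1(t) = t$ and $A_2(t) = t + t^2$.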
As another example, let us define an up-down permutation to be a permutation
$a_1 a_2 \cdots a_n$ satisfying $a_1 < a_2 > a_3 < a_4 > \cdots a_n$. Let $D_n$ be the number of
up-down permutations of $[n]$ and let

$$T(x) = \sum_{n=0}^{\infty} D_{2n+1}\frac{x^{2n+1}}{(2n+1)!}.$$

Removing $2n+1$ from an up-down permutation of $[2n+1]$ for $n \ge 1$ leaves a pair of
up-down permutations of odd length. Taking into account the exceptional case $n = 0$,
we obtain the differential equation $T'(x) = T(x)^2 + 1$, with the initial condition
$T(0) = 0$. Solving the differential equation yields $T(x) = \tan x$. The numbers $D_{2n+1}$
are called tangent numbers.

For the generating function

$$S(x) = \sum_{n=0}^{\infty} D_{2n}\frac{x^{2n}}{(2n)!}$$

a similar analysis yields the differential equation $S'(x) = T(x)S(x)$, with the initial
condition $S(0) = 1$, which has the solution $S(x) = \sec x$. The numbers $D_{2n}$ are called
secant numbers. We will show that $S(x) = \sec x$ by a different method in section
11.
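
The two differential equations translate into a recurrence for the numbers $D_n$: comparing coefficients of $x^m/m!$ in $T' = T^2 + 1$ (for even $m \ge 2$) and in $S' = TS$ (for odd $m$) expresses $D_{m+1}$ through earlier values. The sketch below uses illustrative names and checks the recurrence against a direct count of up-down permutations.

```python
from itertools import permutations
from math import comb

def updown_numbers(n_max):
    """D_0, ..., D_{n_max} (n_max >= 1) from T' = T^2 + 1 and S' = T S."""
    D = [1, 1]  # D_0 = 1 from S(0) = 1, D_1 = 1 from T'(0) = 1
    for m in range(1, n_max):
        if m % 2 == 0:
            # T' = T^2 + 1: both factors contribute odd indices
            total = sum(comb(m, a) * D[a] * D[m - a] for a in range(1, m, 2))
        else:
            # S' = T S: odd index from T, even index from S
            total = sum(comb(m, a) * D[a] * D[m - a] for a in range(1, m + 1, 2))
        D.append(total)
    return D

def is_up_down(perm):
    """True if perm[0] < perm[1] > perm[2] < ... (alternating, starting with a rise)."""
    return all((perm[i] < perm[i + 1]) == (i % 2 == 0) for i in range(len(perm) - 1))

def updown_bruteforce(n):
    return sum(1 for perm in permutations(range(1, n + 1)) if is_up_down(perm))
```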


Another application of exponential generating functions is to the enumeration
of labeled rooted trees. Since a rooted tree can be represented as a root together
with a set of subtrees, the exponential generating function $t(x)$ for rooted trees
satisfies

$$t(x) = xe^{t(x)}.$$

We can solve this equation by the Lagrange inversion formula, and we obtain

$$\frac{t(x)^k}{k!} = \sum_{n=k}^{\infty}\binom{n-1}{k-1}n^{n-k}\frac{x^n}{n!},$$

which for $k = 1$ gives a formula equivalent to Cayley's.


10. Permutations with restricted position 


In this and the next several sections we discuss methods for dealing with formulas
that involve subtraction. One way to deal with such formulas is to replace them
with equivalent formulas having only positive terms. The example we give here is
based on the fact that the formula $\sum_k A_k t^k = \sum_k B_k (t-1)^k$ is equivalent to the
formula $\sum_k A_k (t+1)^k = \sum_k B_k t^k$.


Theorem 10.1. Let $R$ be a subset of $[n] \times [n]$. For any permutation $\pi$ of $[n]$, let
$r(\pi)$ be the number of values of $i \in [n]$ for which $(i,\pi(i)) \in R$. Let

$$a(t) = \sum_{k=0}^{n} a_k t^k = \sum_{\pi\in\mathfrak{S}_n} t^{r(\pi)},$$

where $\mathfrak{S}_n$ is the set of permutations of $[n]$. Let $b_k$ be the number of $k$-subsets
of $R$ in which no two pairs agree in either coordinate. Then

$$a(t) = \sum_{k=0}^{n} b_k (n-k)!\,(t-1)^k.$$

In particular,

$$a_0 = a(0) = \sum_{k=0}^{n} b_k (n-k)!\,(-1)^k.$$

(Note that $b_k$ is just the number of $k$-element matchings in the bipartite graph
determined by $R$; see also chapters 3 and 31.)


Proof. We prove that

$$a(t+1) = \sum_{k=0}^{n} b_k (n-k)!\,t^k$$

by counting in two ways pairs $(\pi, Q)$, in which $\pi \in \mathfrak{S}_n$ and
$Q \subseteq G(\pi) \cap R$, where $G(\pi) = \{\,(i,\pi(i)) \mid i \in [n]\,\}$. We weight such a
pair by $t^{|Q|}$.
First, we have

$$\sum_{(\pi,Q)} t^{|Q|} = \sum_{\pi}\ \sum_{Q\subseteq G(\pi)\cap R} t^{|Q|}
= \sum_{\pi} (t+1)^{|G(\pi)\cap R|} = a(t+1).$$

Second, we have

$$\sum_{(\pi,Q)} t^{|Q|} = \sum_{Q\subseteq R} \bigl|\{\,\pi \mid G(\pi) \supseteq Q\,\}\bigr|\; t^{|Q|}.$$

If $G(\pi) \supseteq Q$ then $Q$ does not contain two ordered pairs which agree in either
coordinate, and if this condition is satisfied, $Q$ can be expanded to the graph of a
permutation in $(n - |Q|)!$ ways. Thus the sum is equal to
$\sum_{k=0}^{n} b_k (n-k)!\,t^k$. $\Box$


Theorem 10.1 is often proved by inclusion-exclusion, which we discuss in section
12. See, e.g., Riordan (1958, chapters 7 and 8).


For our first example, let $R = \{\,(i,i) \mid i \in [n]\,\}$. Then $a(t)$ counts permutations
by fixed points. Here $b_k = \binom{n}{k}$, so

$$a(t) = n!\sum_{k=0}^{n}\frac{(t-1)^k}{k!},$$

and in particular, $a_0 = n!\sum_{k=0}^{n}(-1)^k/k!$ is the number of derangements
(permutations without fixed points) of $[n]$, denoted $d_n$.
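
Theorem 10.1 can be tested on small boards by computing both sides directly. The sketch below is illustrative (the function names are not from the text): it counts permutations by the number of hits on $R$, counts the sets of pairwise non-attacking cells, and expands $\sum_k b_k (n-k)!\,(t-1)^k$ in powers of $t$.

```python
from itertools import combinations, permutations
from math import comb, factorial

def hit_polynomial(n, R):
    """a[k] = number of permutations pi of [n] with exactly k values i having (i, pi(i)) in R."""
    a = [0] * (n + 1)
    for perm in permutations(range(1, n + 1)):
        a[sum(1 for i, image in enumerate(perm, start=1) if (i, image) in R)] += 1
    return a

def rook_numbers(n, R):
    """b[k] = number of k-subsets of R in which no two pairs agree in either coordinate."""
    b = [0] * (n + 1)
    cells = sorted(R)
    for k in range(n + 1):
        for subset in combinations(cells, k):
            rows = {i for i, _ in subset}
            cols = {j for _, j in subset}
            if len(rows) == k and len(cols) == k:
                b[k] += 1
    return b

def hits_from_rooks(n, b):
    """Coefficients in t of sum_k b[k] * (n - k)! * (t - 1)^k."""
    a = [0] * (n + 1)
    for k in range(n + 1):
        for j in range(k + 1):
            a[j] += b[k] * factorial(n - k) * comb(k, j) * (-1) ** (k - j)
    return a
```

For the diagonal board with $n = 4$ the rook numbers are the binomial coefficients and the constant term of the hit polynomial is the derangement number $d_4 = 9$.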

Next we consider the case $R = \{\,(i,j) \mid i - j \equiv 0 \text{ or } 1 \pmod n\,\}$, which is the
classical problème des ménages. Here we can evaluate $b_k$ by a simple trick, but the
generalizations in which $i - j \equiv 0,1,\ldots,s \pmod n$ could be solved by the transfer
matrix method.

Let us set $p_{2i-1} = (i,i)$ for $1 \le i \le n$, $p_{2i} = (i,i+1)$ for $1 \le i \le n-1$ and
$p_{2n} = (n,1)$. Then $b_k$ is the number of $k$-subsets of $\{p_1,\ldots,p_{2n}\}$ containing no
$p_i$ and $p_{i+1}$ (or $p_{2n}$ and $p_1$). Then as we saw in section 5,

$$b_k = \frac{2n}{2n-k}\binom{2n-k}{k},$$

and thus

$$a(t) = \sum_{k=0}^{n}(t-1)^k\,\frac{2n}{2n-k}\binom{2n-k}{k}(n-k)!.$$
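
The ménage formula can be compared with a direct count. In the sketch below (illustrative names, rows and columns indexed from 0) the forbidden cells are $(i,i)$ and $(i,i+1)$ taken modulo $n$, matching the pairs $p_1,\ldots,p_{2n}$ above; the comparison is meant for $n \ge 2$.

```python
from itertools import permutations
from math import comb, factorial

def menage_hits_formula(n):
    """Coefficients in t of sum_k (t-1)^k * 2n/(2n-k) * C(2n-k, k) * (n-k)!."""
    a = [0] * (n + 1)
    for k in range(n + 1):
        b_k = 2 * n * comb(2 * n - k, k) // (2 * n - k)  # exact division
        for j in range(k + 1):
            a[j] += (-1) ** (k - j) * comb(k, j) * b_k * factorial(n - k)
    return a

def menage_hits_bruteforce(n):
    """Count permutations by the number of i with perm[i] - i congruent to 0 or 1 mod n."""
    a = [0] * (n + 1)
    for perm in permutations(range(n)):
        hits = sum(1 for i in range(n) if (perm[i] - i) % n in (0, 1))
        a[hits] += 1
    return a
```

The constant terms for $n = 3, 4, 5, 6$ are the ménage numbers 1, 2, 13, 80.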


Finally, let us take $R = \{\,(i,j) \mid i > j\,\}$. Then $b_k$ is the Stirling number $S(n,n-k)$.
We prove this by giving a bijection between $k$-subsets of $R$ counted by $b_k$ and
partitions of $[n]$ with $n-k$ blocks: to the subset $\{(i_1,j_1),(i_2,j_2),\ldots,(i_k,j_k)\}$ of $R$
counted by $b_k$ we associate the finest partition in which $i_s$ and $j_s$ are in the same
block for each $s$. Thus

$$a(t) = \sum_{k=0}^{n} S(n,n-k)\,(n-k)!\,(t-1)^k.$$

If we call this polynomial $a_n(t)$, then a straightforward computation using (9.1)
shows that

$$\sum_{n=0}^{\infty} a_n(t)\frac{x^n}{n!} = \frac{1-t}{e^{(t-1)x} - t}
= 1 + t^{-1}\Bigl(\frac{1-t}{1 - te^{(1-t)x}} - 1\Bigr)
= 1 + t^{-1}\sum_{n=1}^{\infty} A_n(t)\frac{x^n}{n!},$$

where $A_n(t)$ is the Eulerian polynomial.

Similarly, if we had taken $R = \{\,(i,j) \mid i \ge j\,\}$, we would have found $a(t) = A_n(t)$
for all $n$.


Thus for $n \ge 1$ the three polynomials

$$\sum_{\pi\in\mathfrak{S}_n} t^{1+|\{i \,\mid\, \pi(i) > \pi(i+1)\}|},\qquad
\sum_{\pi\in\mathfrak{S}_n} t^{|\{i \,\mid\, i \ge \pi(i)\}|},\qquad\text{and}\qquad
\sum_{\pi\in\mathfrak{S}_n} t^{1+|\{i \,\mid\, i > \pi(i)\}|}$$

are all equal. A combinatorial proof is easily found through Foata's transformation:
For example, if

$$\pi = 5\ 7{\cdot}2{\cdot}1\ 6{\cdot}3\ 8{\cdot}4{\cdot}$$

(where the dots represent descents) then Foata's transformation takes $\pi$ to

$$\pi_1 = (5\ 7{\cdot})(2{\cdot})(1\ 6{\cdot}3\ 8{\cdot}4{\cdot}),$$

in which occurrences of $\pi(i) > \pi(i+1)$ together with the extra descent at the end
have been transformed into occurrences of $i \ge \pi_1(i)$.

The variant of Foata's transformation with left-right maxima instead of minima
transforms $\pi$ to

$$\pi_2 = (5)(7{\cdot}2{\cdot}1\ 6{\cdot}3)(8{\cdot}4),$$

in which occurrences of $\pi(i) > \pi(i+1)$ have been transformed into occurrences
of $i > \pi_2(i)$.

11. Cancellation


In this section we consider a technique for simplifying sums of positive and negative
terms by cancellation. We have two sets $A^+$ and $A^-$, which we think of as "positive
objects" with sign $+1$ and "negative objects" with sign $-1$. We want to find a
combinatorial interpretation to $|A^+| - |A^-|$. We do this by finding a partial pairing
of positive objects with negative objects; then $|A^+| - |A^-|$ will be equal to the
contribution from the unpaired objects.


Theorem 11.1. Let $A = A^+ \cup A^-$ and suppose that there is a subset $B$ of $A$ and an
involution $\omega$ defined on $A \setminus B$ which is sign reversing: if $\omega(x)$ is defined, then
$x \in A^+$ if and only if $\omega(x) \in A^-$. Then $|A^+| - |A^-| = |A^+ \cap B| - |A^- \cap B|$.


In most (but not all) applications, $B$ is a subset of either $A^+$ or $A^-$.
As an example, we give a combinatorial proof of the identity

$$\sum_{k=0}^{n}(-1)^k\binom{n}{k}\binom{r+k}{m} = (-1)^n\binom{r}{m-n}.$$

Let us first consider the special case $m = 0$, which we may write as

$$\sum_{k=0}^{n}(-1)^k\binom{n}{k} = \begin{cases} 1, & n = 0,\\ 0, & n > 0.\end{cases}$$

It is clear that we should take $A$ to be the set of subsets of $[n]$, with $A^+$ the
subsets of even cardinality and $A^-$ the subsets of odd cardinality. For $n > 0$ we
want to find a sign-reversing involution on all of $A$, so that $B = \emptyset$. Clearly the map
given by $\omega(K) = K \,\triangle\, \{1\}$ has the right properties, where $\triangle$ denotes the symmetric
difference.

Now we consider the general case. Let $R$ be an $r$-element set disjoint from
$[n]$. We may take $A$ to be the set of all pairs $(K,M)$, where $K$ is a subset of $[n]$
and $M$ is an $m$-subset of $R \cup K$. Then the number of such pairs with $|K| = k$ is
$\binom{n}{k}\binom{r+k}{m}$. We take the sign of $(K,M)$ to be $(-1)^{|K|}$. It is not immediately
obvious what $B$ should be, but we may try to construct a sign-reversing involution on as
large a subset of $A$ as possible, and $B$ will be whatever is left over. Given a pair
$(K,M) \in A$, let $j$ be the least element of $[n] \setminus M$ if $[n] \setminus M$ is nonempty. Then we
set $\omega((K,M)) = (K \,\triangle\, \{j\}, M)$. This is clearly a sign-reversing involution. The only
pairs $(K,M)$ for which it is not defined are those for which $[n] \subseteq M$. But if $[n] \subseteq M$
then since $M \cap [n] \subseteq K$, we must have $K = [n]$ and $M$ must consist of $[n]$ together
with an $(m-n)$-subset of $R$. There are $\binom{r}{m-n}$ of these and they all have sign $(-1)^n$.
Thus the identity is proved.
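
The involution in this proof can be run on small cases. The following sketch makes one concrete choice that is an assumption of the example, not of the text: $R$ is taken to be $\{n+1,\ldots,n+r\}$. The function names are illustrative.

```python
from itertools import combinations

def pairs(n, r, m):
    """All (K, M) with K a subset of [n] and M an m-subset of R ∪ K, where R = {n+1, ..., n+r}."""
    R = set(range(n + 1, n + r + 1))
    for size in range(n + 1):
        for K in combinations(range(1, n + 1), size):
            for M in combinations(sorted(R | set(K)), m):
                yield frozenset(K), frozenset(M)

def omega(n, K, M):
    """Toggle in K the least element of [n] not in M; return None if [n] is contained in M."""
    missing = [j for j in range(1, n + 1) if j not in M]
    if not missing:
        return None
    return K ^ {missing[0]}, M
```

On every pair where `omega` is defined it should return another valid pair of opposite sign, and applying it twice should give back the original pair; the pairs where it is undefined are exactly those counted by $\binom{r}{m-n}$.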

For our next example, let $D_n$ be the number of up-down permutations of $[n]$,
as defined in section 9. We give a completely different proof that

$$\sum_{n=0}^{\infty} D_{2n}\frac{x^{2n}}{(2n)!} = \sec x. \tag{11.2}$$

If we multiply both sides of (11.2) by $\cos x$ and equate coefficients of $x^{2n}/(2n)!$,
we see that (11.2) is equivalent to the recurrence

$$\sum_{k=0}^{n}(-1)^{n-k}\binom{2n}{2k}D_{2k} = \begin{cases} 1, & \text{if } n = 0,\\ 0 & \text{otherwise.}\end{cases} \tag{11.3}$$

The case $n = 0$ of (11.3) is clear. To interpret (11.3) for $n > 0$, let $A$ be the set of all
ordered pairs $(\alpha,\beta)$ such that for some subset $S \subseteq [2n]$ of even cardinality, $\alpha$ is an
up-down permutation of $S$ and $\beta$ is the increasing permutation of $[2n] \setminus S$. If $S$ has
cardinality $2k$, then we give $(\alpha,\beta)$ the sign $(-1)^k$. Thus for $n = 4$, a typical element
of $A$ is $(1427, 3568)$. Now let $\gamma = (a_1 a_2 \cdots a_{2k},\, b_1 b_2 \cdots b_{2n-2k})$ be an element of $A$.
If $a_{2k} > b_1$ or $k = 0$, we define $\omega(\gamma)$ to be $(a_1 a_2 \cdots a_{2k} b_1 b_2,\, b_3 \cdots b_{2n-2k})$ and if
$a_{2k} < b_1$ or $k = n$ we define $\omega(\gamma)$ to be $(a_1 a_2 \cdots a_{2k-2},\, a_{2k-1} a_{2k} b_1 b_2 b_3 \cdots b_{2n-2k})$.
It is clear that $\omega$ is a sign-reversing involution defined on all of $A$, and thus (11.3) is
proved. The formula

$$\sum_{n=0}^{\infty} D_{2n+1}\frac{x^{2n+1}}{(2n+1)!} = \tan x$$

can be proved similarly.
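
The involution used to prove (11.3) can be checked by enumeration for small $n$. The sketch below uses illustrative names and represents $\alpha$ and $\beta$ as tuples; `omega` moves the first two entries of $\beta$ to the end of $\alpha$, or the last two entries of $\alpha$ to the front of $\beta$, according to the rule described above.

```python
from itertools import combinations, permutations

def is_up_down(seq):
    return all((seq[i] < seq[i + 1]) == (i % 2 == 0) for i in range(len(seq) - 1))

def signed_pairs(n):
    """All (alpha, beta): alpha up-down on an even-size subset S of [2n], beta increasing on the rest."""
    universe = range(1, 2 * n + 1)
    for size in range(0, 2 * n + 1, 2):
        for S in combinations(universe, size):
            beta = tuple(x for x in universe if x not in S)
            for alpha in permutations(S):
                if is_up_down(alpha):
                    yield alpha, beta

def omega(alpha, beta):
    """Sign-reversing involution on the pairs (alpha, beta), for n > 0."""
    if not alpha or (beta and alpha[-1] > beta[0]):
        return alpha + beta[:2], beta[2:]   # case a_{2k} > b_1 or k = 0
    return alpha[:-2], alpha[-2:] + beta    # case a_{2k} < b_1 or k = n
```

Since every object is paired with one of opposite sign, the signed count of all pairs is zero, which is (11.3) for $n > 0$.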

Following Zeilberger (1985), we now use a sign-reversing involution to prove 
the “matrix tree theorem”, which gives a determinantal formula for the number of 
spanning arborescences of a digraph, rooted at a given node. Similar proofs have 
been found by several people, of whom the first seems to be Temperley (1981). 

For each $i$, $j$, with $1 \le i, j \le n$, let $w_{ij}$ be an arbitrary weight. We define the weight
of a digraph on $[n]$ to be the product $\prod w_{ij}$ over all arcs $(i,j)$ of the digraph. We
shall find a formula for the sum of the weights of all arborescences on $[n]$, rooted
at $n$. (Then given any digraph $D$ on $[n]$, the number of spanning arborescences of
$D$ is obtained by setting $w_{ij}$ equal to 1 for arcs $(i,j)$ in $D$ and to 0 for arcs not in
$D$.)

First we observe that a determinant can be interpreted as a sum of signed weights
of digraphs. A permutation digraph is a digraph in which every vertex has indegree
and outdegree 1, or equivalently, in which every weakly connected component is a
directed cycle. Any permutation $\pi$ of a set corresponds to the permutation digraph
in which the arcs are $(i,\pi(i))$, and conversely, every permutation digraph is of this
form. Now let $M$ be the matrix $(-w_{ij})_{i,j=1,\ldots,n-1}$. Then the determinant of $M$ is
equal to the sum over all permutations $\pi$ of $[n-1]$ of

$$(\operatorname{sgn}\pi)\prod_{i=1}^{n-1}(-w_{i\pi(i)}). \tag{11.4}$$

This product is clearly, up to sign, the weight of the permutation digraph
corresponding to $\pi$. Now suppose that $\pi$ has $r$ cycles, of lengths $l_1, l_2, \ldots, l_r$. Then
$\operatorname{sgn}\pi = \prod_{j=1}^{r}(-1)^{l_j+1}$ and $(-1)^{n-1} = \prod_{j=1}^{r}(-1)^{l_j}$, so (11.4) is $(-1)^r$ times
the weight of the permutation digraph corresponding to $\pi$.
Now consider the determinant

$$W = \begin{vmatrix}
w_{21} + \cdots + w_{n1} & -w_{21} & \cdots & -w_{n-1,1} \\
-w_{12} & w_{12} + w_{32} + \cdots + w_{n2} & \cdots & -w_{n-1,2} \\
\vdots & \vdots & \ddots & \vdots \\
-w_{1,n-1} & -w_{2,n-1} & \cdots & w_{1,n-1} + \cdots + w_{n-2,n-1} + w_{n,n-1}
\end{vmatrix}.$$

This is the determinant of (the transpose of) $M$ above, with $w_{ii}$ replaced by
$-\sum_{j \ne i} w_{ji}$, where $j$ ranges over $[n]$.

The digraphs that $W$ counts will be obtained from permutation digraphs by
replacing each loop $(i,i)$ with an arc $(j,i)$ for some $j \ne i$ (with $j = n$ allowed), and
the sign of such a digraph is $(-1)^r$, where $r$ is the number of cycles of length at
least 2 in the original permutation digraph. More precisely, $W$ is the sum of the
signed weights of all pairs $(P,T)$ of digraphs on $[n]$ such that:

(1) $P$ is a permutation digraph, with every cycle of length at least 2, on a set of
nodes $N_P \subseteq [n-1]$.

(2) $T$ is a digraph without loops on $[n]$ in which every node in $[n-1] \setminus N_P$ has
indegree 1 and every node in $N_P \cup \{n\}$ has indegree 0.

The signed weight of the pair $(P,T)$ is $(-1)^r$ times the product of the weights
of $P$ and $T$, where $r$ is the number of cycles of $P$. Here is a typical pair $(P,T)$:


We now define the sign-reversing involution $\omega$ on all pairs $(P,T)$ such that
either $P$ or $T$ contains a cycle: take the cycle containing the least vertex and
transfer it from $P$ to $T$ or from $T$ to $P$. Then $\omega$ is a weight-preserving
sign-reversing involution that cancels all pairs except those in which $P$ is empty and $T$
is an arborescence rooted at $n$.
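
The resulting determinantal formula can be tested numerically on small weighted digraphs. The sketch below is an illustration under stated conventions, none of which come from the text: nodes are numbered $0,\ldots,n-1$ with root $n-1$, `w[j][i]` is the weight of the arc $(j,i)$, arcs of an arborescence point away from the root, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det

def tree_matrix(n, w):
    """Row i: diagonal entry sum_{j != i} w[j][i], off-diagonal entries -w[k][i]."""
    return [[sum(w[j][i] for j in range(n) if j != i) if i == k else -w[k][i]
             for k in range(n - 1)] for i in range(n - 1)]

def arborescence_weight_sum(n, w):
    """Sum of weights of arborescences rooted at n-1, by trying every choice of entering arcs."""
    root, total = n - 1, 0
    for parents in product(range(n), repeat=n - 1):  # parents[i] = tail of the arc entering i
        if any(parents[i] == i for i in range(n - 1)):
            continue
        acyclic = True
        for start in range(n - 1):
            node, steps = start, 0
            while node != root and steps < n:
                node, steps = parents[node], steps + 1
            if node != root:  # walked n steps without reaching the root: a cycle
                acyclic = False
                break
        if acyclic:
            weight = 1
            for i in range(n - 1):
                weight *= w[parents[i]][i]
            total += weight
    return total
```

With all weights equal to 1 and $n = 4$, both computations give 16, in agreement with Cayley's formula $n^{n-2}$.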


Further examples of cancellation can be found in Stanton and White (1986). 


12. Inclusion-exclusion


The inclusion-exclusion principle is probably the most well-known technique for 
dealing with subtraction. 


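Theorem 12.1. Let $f$ and $g$ be two functions defined on the subsets of a finite set $S$
such that $f(A) = \sum_{B\subseteq A} g(B)$. Then $g(A) = \sum_{B\subseteq A}(-1)^{|A-B|} f(B)$.


Proof. We have

$$\sum_{B\subseteq A}(-1)^{|A-B|} f(B) = \sum_{B\subseteq A}(-1)^{|A-B|}\sum_{C\subseteq B} g(C)
= \sum_{C\subseteq A} g(C)\sum_{C\subseteq B\subseteq A}(-1)^{|A-B|} = g(A). \qquad\Box$$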


A dual form of inclusion-exclusion may be proved the same way as Theorem
12.1:

$$f(A) = \sum_{S\supseteq B\supseteq A} g(B) \quad\text{if and only if}\quad
g(A) = \sum_{S\supseteq B\supseteq A}(-1)^{|B-A|} f(B). \tag{12.2}$$

An important special case of inclusion-exclusion occurs when $f(A)$ and $g(A)$
depend only on $|A|$, so we may write $f(A) = f_{|A|}$ and $g(A) = g_{|A|}$. Then the relation
between $f$ and $g$ may be written

$$f_n = \sum_{k=0}^{n}\binom{n}{k} g_k \quad\text{and}\quad g_n = \sum_{k=0}^{n}(-1)^{n-k}\binom{n}{k} f_k.$$

These relations may be expressed in terms of exponential generating functions:
if $F(x) = \sum_{n=0}^{\infty} f_n x^n/n!$ and $G(x) = \sum_{n=0}^{\infty} g_n x^n/n!$ then $F(x) = e^x G(x)$ and
$G(x) = e^{-x} F(x)$.

Another form of inclusion—exclusion is often used: Suppose we have a finite set 
X of elements, each of which has certain “properties,” and let S be the set of all 
such properties. For each subset A of S let f(A) be the number of elements of X 
having all the properties in A (and possibly others). 


Theorem 12.3. Let $M_i = \sum_{|A|=i} f(A)$ and let $N_i$ be the number of elements of $X$
having exactly $i$ properties. Then

$$N_i = \sum_{j}(-1)^{j-i}\binom{j}{i} M_j,$$

and in particular,

$$N_0 = M_0 - M_1 + M_2 - \cdots.$$

Proof. For $A \subseteq S$, let $g(A)$ be the number of elements of $X$ having the
properties in $A$ and no others. Then $f(A) = \sum_{B\supseteq A} g(B)$, so by inclusion-exclusion,
$g(A) = \sum_{B\supseteq A}(-1)^{|B-A|} f(B)$. Thus $N_i = \sum_{|A|=i} g(A)$ and the result follows by a
straightforward calculation. $\Box$


Our first example of inclusion-exclusion is to permutation enumeration. The
descent set $D(\pi)$ of a permutation $\pi$ of $[n]$ is $\{\,i \mid \pi(i) > \pi(i+1)\,\}$. Fix $n$, and for
$A \subseteq [n-1]$, let $g(A)$ be the number of permutations with descent set $A$. We shall find a
simple formula for $f(A) = \sum_{B\subseteq A} g(B)$. Let $A = \{a_1 < a_2 < \cdots < a_k\}$. Then
$D(\pi) \subseteq A$ if and only if $\pi(1) < \pi(2) < \cdots < \pi(a_1)$, $\pi(a_1+1) < \cdots < \pi(a_2)$, ...,
$\pi(a_k+1) < \cdots < \pi(n)$. To construct such a permutation $\pi$, we choose $a_1$ elements of
$[n]$ to be $\{\pi(1),\ldots,\pi(a_1)\}$ and arrange them in increasing order, then choose
$a_2 - a_1$ of the remaining elements to be $\{\pi(a_1+1),\ldots,\pi(a_2)\}$, and so on. Thus $f(A)$
is the multinomial coefficient

$$\binom{n}{a_1,\ a_2 - a_1,\ \ldots,\ a_k - a_{k-1},\ n - a_k}$$

so $g(A)$ is given explicitly by $g(A) = \sum_{B\subseteq A}(-1)^{|A-B|} f(B)$.
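
The formula for $g(A)$ is easy to evaluate and to compare with a direct count. In the sketch below (illustrative names) `f` is the multinomial coefficient above and `g` is the alternating sum over subsets of $A$.

```python
from itertools import combinations, permutations
from math import factorial

def f(n, A):
    """Number of permutations of [n] whose descent set is contained in A."""
    cuts = [0] + sorted(A) + [n]
    value = factorial(n)
    for lo, hi in zip(cuts, cuts[1:]):
        value //= factorial(hi - lo)  # each intermediate quotient is an integer
    return value

def g(n, A):
    """Number of permutations of [n] with descent set exactly A, by inclusion-exclusion."""
    A = sorted(A)
    return sum((-1) ** (len(A) - size) * f(n, B)
               for size in range(len(A) + 1) for B in combinations(A, size))

def descent_set(perm):
    return frozenset(i for i in range(1, len(perm)) if perm[i - 1] > perm[i])
```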

If we set $a_0 = 0$ and $a_{k+1} = n$, then $g(A)$ can be expressed compactly as the
determinant

$$n!\,\left|\frac{1}{(a_j - a_{i-1})!}\right|_{i,j=1,\ldots,k+1}, \tag{12.4}$$

where we interpret $1/r!$ as 0 for $r < 0$. To see this, suppose that $(m_{ij})$ is an $r \times r$
matrix for which $m_{ij} = 0$ if $j < i-1$. Then, if $\prod_{i=1}^{r} m_{i\pi(i)} \ne 0$, every cycle of $\pi$ must
be of the form $(t\ \ t-1\ \cdots\ s+1\ \ s)$. If in addition $m_{i,i-1} = 1$ for $2 \le i \le r$, then the
contribution to the determinant $|m_{ij}|$ from the permutation
$(t_1\ \ t_1-1\ \cdots\ 2\ \ 1)(t_2\ \cdots\ t_1+1)\cdots(t_j\ \cdots\ t_{j-1}+1)$, where
$t_1 < t_2 < \cdots < t_j = r$, is $(-1)^{r-j}\, m_{1,t_1} m_{t_1+1,t_2} \cdots m_{t_{j-1}+1,t_j}$.
We obtain (12.4) by taking $r = k+1$, $m_{ij} = 1/(a_j - a_{i-1})!$.

As another example, we find a formula for the number $c_n$ of cyclic permutations
$\pi$ of $[n]$ satisfying $\pi(i) \not\equiv i+1 \pmod n$. For any subset $A$ of $[n]$ let $f(A)$ be the
number of cyclic permutations $\pi$ with $\pi(i) \equiv i+1 \pmod n$ for all $i$ in $A$ and let $g(A)$
be the number of cyclic permutations $\pi$ with $\pi(i) \equiv i+1 \pmod n$ for all $i$ in $A$ but for
no other $i$. Thus $c_n = g(\emptyset)$. Then it is clear that $f(A) = \sum_{B\supseteq A} g(B)$, so by (12.2),
$g(A) = \sum_{B\supseteq A}(-1)^{|B-A|} f(B)$. It is easily seen that $f(A) = (n-1-|A|)!$ for $|A| < n$,
with $f([n]) = 1$. Thus

$$c_n = (-1)^n + \sum_{k=0}^{n-1}(-1)^k\binom{n}{k}(n-1-k)!.$$

If instead of considering only cyclic permutations, we counted all permutations $\pi$
satisfying $\pi(i) \not\equiv i+1 \pmod n$, we would have obtained the derangement number
$d_n$. The numbers $c_n$ are closely related to the derangement numbers; it can be
shown that $d_n = c_n + c_{n+1}$ and $c_{n+1} = (-1)^{n+1} + \sum_{k=0}^{n}(-1)^{n-k} d_k$.
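
Both the formula for $c_n$ and its relation to the derangement numbers can be verified for small $n$. The sketch below uses illustrative names and indexes the elements from 0, so the condition reads $\pi(i) \not\equiv i+1 \pmod n$ with residues in $\{0,\ldots,n-1\}$.

```python
from itertools import permutations
from math import comb, factorial

def c_formula(n):
    """c_n from the inclusion-exclusion formula in the text."""
    return (-1) ** n + sum((-1) ** k * comb(n, k) * factorial(n - 1 - k) for k in range(n))

def derangements(n):
    """d_n = n! * sum_k (-1)^k / k!, computed with exact integer division."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

def is_cyclic(perm):
    """True if the permutation (given by its images on 0..n-1) is a single cycle."""
    node, steps = 0, 0
    while True:
        node, steps = perm[node], steps + 1
        if node == 0:
            return steps == len(perm)

def c_bruteforce(n):
    return sum(1 for perm in permutations(range(n))
               if is_cyclic(perm) and all((perm[i] - i - 1) % n != 0 for i in range(n)))
```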
13. Möbius inversion


Consider the following problem: out of 100 students who are taking Algebra,
Biology, and Chemistry, 23 have Algebra and Biology at the same time, 40 have
Algebra and Chemistry at the same time, 42 have Biology and Chemistry at the
same time, and 15 have all three courses at the same time. How many students
have no schedule conflict?

We can solve this problem by inclusion-exclusion. Let $U$ be the set of all 100
students. Let $S_1$ be the subset of students with an Algebra-Biology conflict, and
similarly for $S_2$ and $S_3$. Then the answer is

$$|U| - \sum_{i}|S_i| + \sum_{i<j}|S_i \cap S_j| - |S_1 \cap S_2 \cap S_3|.$$

But in this case

$$|S_1 \cap S_2| = |S_1 \cap S_3| = |S_2 \cap S_3| = |S_1 \cap S_2 \cap S_3|,$$

so the formula reduces to

$$|U| - |S_1| - |S_2| - |S_3| + 2|S_1 \cap S_2 \cap S_3| = 25. \tag{13.1}$$


The theory of Möbius inversion explains formulas like (13.1), and in particular
explains the significance of the coefficient 2. In this problem there are 5
possibilities for a student's schedule conflict: no conflict, A-B conflict, A-C conflict, B-C
conflict, and A-B-C conflict. These conflicts are partially ordered in a natural way
as follows:

[Figure: Hasse diagram with "no conflict" at the bottom, the three conflicts A-B, A-C,
and B-C above it, and the A-B-C conflict at the top.]

Then if we let $g(x)$ be the number of students with conflict of type $x$ (but no
worse), then we want to determine $g(\text{no conflict})$ given $f(x)$ for all $x$, where
$f(x) = \sum_{y \ge x} g(y)$.

In the general situation, we have a finite poset $P$ and two functions $f$ and $g$ on
$P$ related by

$$f(x) = \sum_{y \ge x} g(y), \tag{13.2}$$

and we want to find the coefficients $m(x,y)$ which express $g$ in terms of $f$:

$$g(x) = \sum_{y \ge x} m(x,y) f(y). \tag{13.3}$$
It is convenient to consider the problem from a slightly different point of view.
First let $P$ be a finite poset. The incidence algebra $I(P)$ of $P$ is the set of all
complex-valued functions $f$ on $P \times P$ such that $f(x,y) = 0$ unless $x \le y$. Addition
of these functions is pointwise and multiplication is defined by the formula

$$(fg)(x,y) = \sum_{x \le z \le y} f(x,z)\,g(z,y).$$

$I(P)$ is isomorphic to an algebra of matrices in which the rows and columns are
indexed by the elements of $P$; the function $f$ corresponds to the matrix in which
the $(x,y)$ entry is $f(x,y)$. If the rows and columns are ordered consistently with $P$,
then these matrices will all be upper triangular. In particular, if $f(x,x)$ is nonzero
for all $x$, then $f$ is invertible.

There are three particularly important elements of the incidence algebra. First
there is the identity element $\delta$ defined by

$$\delta(x,y) = \begin{cases} 1, & \text{if } x = y,\\ 0, & \text{if } x \ne y.\end{cases}$$

Next is the zeta function $\zeta$ defined by

$$\zeta(x,y) = \begin{cases} 1, & \text{if } x \le y,\\ 0, & \text{otherwise.}\end{cases}$$

The Möbius function $\mu$ of $P$ is the inverse of $\zeta$. By the remark above, $\mu$ must
exist. An easy way to compute $\mu$ is from the recurrence

$$\mu(x,y) = -\sum_{x \le z < y}\mu(x,z),$$

for $x < y$, with the initial condition $\mu(x,x) = 1$. This recurrence follows
immediately from the formula $\mu\zeta = \delta$.


It is easy to give a formula for $\mu(x,y)$. We have $\zeta^{-1} = (\delta + (\zeta - \delta))^{-1}$. It is clear
that $(\zeta - \delta)^k(x,y)$ is the number of chains $x = x_0 < x_1 < \cdots < x_k = y$ and thus is
zero for $k$ sufficiently large. So we have the explicit formula

$$\mu = (\delta + (\zeta - \delta))^{-1} = \sum_{k \ge 0}(-1)^k(\zeta - \delta)^k,$$

where only finitely many terms on the right are nonzero. If we define the length
of a chain to be one less than its cardinality, we have P. Hall's theorem:

Theorem 13.4. $\mu(x,y) = C_0 - C_1 + C_2 - \cdots$, where $C_i$ is the number of chains of
length $i$ from $x$ to $y$.


Hall's theorem implies that $\mu(x,y)$ depends only on the interval
$[x,y] = \{\,z \mid x \le z \le y\,\}$. An important, but less obvious, aspect of Hall's theorem is that it
provides an interpretation of the Möbius function of a poset $P$ as the reduced
Euler characteristic of a topological space associated with $P$, and thus allows the
machinery of algebraic topology to be applied to the study of posets. (See, e.g.,
Stanley 1986, pp. 120-124 and 137-138.)

Let us return to our original problem. We claim that in (13.3) we should take
$m(x,y) = \mu(x,y)$. To see that this works, set

$$\tilde{g}(x) = \sum_{y \ge x}\mu(x,y) f(y).$$

Then we have

$$\sum_{y \ge x}\tilde{g}(y) = \sum_{y \ge x}\ \sum_{z \ge y}\mu(y,z) f(z)
= \sum_{z \ge x} f(z)\sum_{x \le y \le z}\zeta(x,y)\mu(y,z) = f(x).$$

Since $g$ is uniquely determined by (13.2), we must have $g = \tilde{g}$.
There is a dual form of Möbius inversion in which $y \ge x$ is replaced by $y \le x$.
We state both forms in the following theorem.


Theorem 13.5. Let f, g, and h be complex-valued functions on the finite poset P.
Then

(a) $f(x) = \sum_{y \ge x} g(y)$ if and only if $g(x) = \sum_{y \ge x} \mu(x,y) f(y)$;
(b) $h(x) = \sum_{y \le x} g(y)$ if and only if $g(x) = \sum_{y \le x} h(y)\mu(y,x)$.


If P and Q are posets, then the product order on $P \times Q$ is given by $(p_1,q_1) \le
(p_2,q_2)$ if and only if $p_1 \le p_2$ and $q_1 \le q_2$. The Möbius function of $P \times Q$ is easily
expressed in terms of the Möbius functions of P and Q (the straightforward proof
is omitted).


Theorem 13.6. Let P and Q be finite posets. Then
$$\mu_{P \times Q}((p_1,q_1),(p_2,q_2)) = \mu_P(p_1,p_2)\,\mu_Q(q_1,q_2).$$


It is easily seen that if we consider the set [n] as a poset under the usual order,
so that it is a chain, then

$$\mu(i,j) = \begin{cases} 1, & \text{if } i = j, \\ -1, & \text{if } i+1 = j, \\ 0, & \text{otherwise.} \end{cases}$$

Since the poset of subsets of a set is a product of 2-element chains, we find that its
Möbius function is given by $\mu(A,B) = (-1)^{|B-A|}$ for $A \subseteq B$, which with Theorem
13.5 is the inclusion-exclusion formula.
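Form (b) of Möbius inversion on the poset of subsets can be tried out numerically. In the sketch below the function g is an arbitrary test function chosen for illustration (an assumption, not from the text). The code builds h by summing g over subsets and then recovers g with the signs $(-1)^{|B-A|}$.

```python
from itertools import combinations

def subsets(s):
    """All subsets of the iterable s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

ground = frozenset(range(4))
all_sets = subsets(ground)

# Arbitrary test function on subsets (hypothetical, for the demonstration only).
g = {A: 3 * len(A) + sum(A) ** 2 + 1 for A in all_sets}

# h(B) = sum of g(A) over all A contained in B.
h = {B: sum(g[A] for A in subsets(B)) for B in all_sets}

# Inversion with mu(A, B) = (-1)^{|B - A|}: g(B) = sum over A <= B of h(A) mu(A, B).
recovered = {B: sum((-1) ** len(B - A) * h[A] for A in subsets(B)) for B in all_sets}

print(recovered == g)   # True
```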
We now prove two theorems on Möbius functions of lattices. A poset P is a
lattice if any two elements $x, y \in P$ have a unique join, or least upper bound,
denoted $x \vee y$, and a unique meet, or greatest lower bound. We assume that all
posets are finite, so any set S of elements of a lattice has a join which we denote
by $\bigvee S$. We denote the unique minimal element of a lattice by $\hat{0}$, and the unique
maximal element by $\hat{1}$. An atom is an element that covers $\hat{0}$.

In our example we computed a Möbius function by using inclusion-exclusion.
The next theorem generalizes that example, though we give a different proof.


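Theorem 13.7. Let P be a lattice. Then $\mu(\hat{0},x) = \sum_S (-1)^{|S|}$, where S ranges over
all sets of atoms with join x.


Proof. For each x in P, let $g(x) = \sum_{\bigvee S = x} (-1)^{|S|}$, where S ranges over sets of atoms.
Define f(x) by $f(x) = \sum_{y \le x} g(y) = \sum_{\bigvee S \le x} (-1)^{|S|}$. Then, if A is the set of atoms
less than or equal to x, we have

$$f(x) = \sum_{S \subseteq A} (-1)^{|S|} = \begin{cases} 1, & \text{if } A = \emptyset, \\ 0, & \text{if } A \ne \emptyset \end{cases} = \begin{cases} 1, & \text{if } x = \hat{0}, \\ 0, & \text{if } x \ne \hat{0}. \end{cases}$$

Then by Möbius inversion, $g(x) = \sum_{y \le x} f(y)\mu(y,x) = \mu(\hat{0},x)$. □


Corollary 13.8. Under the above hypothesis, if x is not a join of atoms, then
$\mu(\hat{0},x) = 0$.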


Next we prove another basic result on Möbius functions of lattices, called
Weisner’s theorem.


Theorem 13.9. Let P be a lattice. Fix a and x in P, with $a > \hat{0}$. Then

$$\sum_{z \vee a = x} \mu(\hat{0},z) = 0.$$

Proof. For fixed a, let $g(x) = \sum_{z \vee a = x} \mu(\hat{0},z)$, and set

$$f(x) = \sum_{y \le x} g(y) = \sum_{z \vee a \le x} \mu(\hat{0},z).$$

We shall show that f(x) = 0 for all x, which implies that g(x) = 0. If $a \not\le x$ then
f(x) is clearly 0. If $a \le x$ then $x \ge a > \hat{0}$, so $f(x) = \sum_{z \le x} \mu(\hat{0},z) = 0$. □


Corollary 13.10. Let P be a lattice. Suppose that

(i) P has a rank function $\rho$ with the property that if a is an atom, then for all x
in P, $\rho(a \vee x) \le \rho(x) + 1$.

(ii) Every element of P is a join of atoms.

Then $(-1)^{\rho(\hat{1})} \mu(\hat{0},\hat{1}) > 0$.

Proof. The assertion is trivially true if $\hat{0} = \hat{1}$. Otherwise, in Theorem 13.9 let a
be an atom and take $x = \hat{1}$. Then if $z \vee a = \hat{1}$, z must be $\hat{1}$ or a coatom (of rank
$\rho(\hat{1}) - 1$). So $\mu(\hat{0},\hat{1}) = -\sum_z \mu(\hat{0},z)$, where the sum is over all coatoms z with
$z \vee a = \hat{1}$. The assertion will follow by induction if we can show that a may be
chosen so that there is at least one such coatom. But if the sum is empty for all a,
then every atom is less than or equal to every coatom, contradicting (ii). □


Lattices satisfying the conditions of Corollary 13.10 are called geometric lattices. 
(There are many other equivalent characterizations of geometric lattices.) 

We can use Theorem 13.9 to compute the Möbius function for the lattice $L_n$ of
subspaces of the vector space $V_n$ of dimension n over a finite field of q elements.
Since the interval [x, y] is isomorphic to $L_m$, where $m = \dim y - \dim x$, it is sufficient
to compute $\mu(\hat{0},\hat{1})$ in $L_n$, which we denote by $\mu_n$.

As in Corollary 13.10, let us take a to be an atom and take $x = \hat{1}$. Then, if z is a
coatom for which $z \vee a = \hat{1}$, z must be a subspace of $V_n$ of dimension n - 1 which
does not contain a, and the number of these is
$\frac{q^n - 1}{q - 1} - \frac{q^{n-1} - 1}{q - 1} = q^{n-1}$. Thus we have
the recurrence $\mu_n = -q^{n-1}\mu_{n-1}$. From this recurrence and the initial condition
$\mu_0 = 1$, we obtain

$$\mu_n = (-1)^n q^{\binom{n}{2}}. \qquad (13.11)$$


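As an application of (13.11), we compute the number g(x) of m-tuples of elements
of $V_n$ which span a given subspace x. Let $f(x) = \sum_{y \le x} g(y)$. Then, if dim x = d, we
have $f(x) = q^{dm}$, so by Möbius inversion we have

$$g(x) = \sum_{y \le x} f(y)\mu(y,x) = \sum_{k=0}^{d} (-1)^{d-k} q^{\binom{d-k}{2} + km} \begin{bmatrix} d \\ k \end{bmatrix}_q,$$

where $\begin{bmatrix} d \\ k \end{bmatrix}_q$ is the number of k-dimensional subspaces of x.
Using (8.5), we can simplify this to

$$g(x) = \prod_{k=0}^{d-1} (q^m - q^k),$$

which can also be found directly. Similarly, the number of m-subsets of $V_n$ with
span x (of dimension d) is

$$\sum_{k=0}^{d} (-1)^{d-k} q^{\binom{d-k}{2}} \begin{bmatrix} d \\ k \end{bmatrix}_q \binom{q^k}{m}.$$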
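The agreement between the Möbius-inversion sum and the product formula for spanning m-tuples can be checked numerically. The sketch below groups the subspaces y of x by dimension k, uses the value $(-1)^{d-k} q^{\binom{d-k}{2}}$ of the Möbius function from (13.11), and compares with the product. The function names are illustrative assumptions.

```python
from math import comb

def q_binomial(n, k, q):
    """Number of k-dimensional subspaces of an n-dimensional space over GF(q)."""
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def spanning_tuples_mobius(d, m, q):
    """Sum of f(y) mu(y, x) over subspaces y of x, grouped by k = dim y."""
    return sum((-1) ** (d - k) * q ** comb(d - k, 2) * q ** (k * m)
               * q_binomial(d, k, q)
               for k in range(d + 1))

def spanning_tuples_product(d, m, q):
    """Product of (q^m - q^k) for k = 0, ..., d-1."""
    result = 1
    for k in range(d):
        result *= q ** m - q ** k
    return result

for q in (2, 3, 4):
    for d in range(4):
        for m in range(6):
            assert spanning_tuples_mobius(d, m, q) == spanning_tuples_product(d, m, q)
print(spanning_tuples_product(2, 2, 2))   # 6
```

The printed value 6 is the number of ordered bases of a 2-dimensional space over the field with two elements.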


Rota (1964) initiated the systematic use of Möbius functions in combinatorics.
Further information about them may be found in chapter 3 of Stanley (1986), and 
in chapter 31 of this Handbook. 
14. Symmetric functions


A formal power series in the variables $x_1, x_2, \ldots, x_n$ is called symmetric if it is
invariant under any permutation of the variables. It is convenient to work with
infinitely many variables, allowing sums such as $x_1 + x_2 + \cdots$. These symmetric
formal power series are traditionally (but somewhat misleadingly) called symmetric
functions.

A symmetric function is homogeneous of degree k if every monomial in it has
total degree k. It is clear that every symmetric function can be expressed as a
(possibly infinite) sum of homogeneous symmetric functions. If we take our coefficients
to be complex numbers, then the homogeneous symmetric functions of degree k
form a vector space, denoted $\Lambda^k$. There are several important bases for $\Lambda^k$, which
are indexed by partitions of k. If $\lambda = (\lambda_1, \ldots, \lambda_n)$ is a partition of k (with the parts
listed in decreasing order), then the monomial symmetric function $m_\lambda$ is defined
to be the sum of all distinct monomials of the form $x_{i_1}^{\alpha_1} \cdots x_{i_n}^{\alpha_n}$ with $i_1 < \cdots < i_n$, for permutations
$(\alpha_1, \ldots, \alpha_n)$ of $\lambda$. It is clear that the $m_\lambda$, over all partitions $\lambda$ of k, form a basis for
$\Lambda^k$.

For each integer $r \ge 0$, the rth elementary symmetric function $e_r$ is the sum of
all products of r distinct variables, so $e_0 = 1$, and for r > 0,

$$e_r = \sum_{i_1 < i_2 < \cdots < i_r} x_{i_1} x_{i_2} \cdots x_{i_r}.$$

For any partition $\lambda = (\lambda_1, \lambda_2, \ldots)$ we define $e_\lambda = e_{\lambda_1} e_{\lambda_2} \cdots$. The “fundamental
theorem of symmetric functions” implies that the $e_\lambda$ over all partitions $\lambda$ of k form a
basis for $\Lambda^k$, or equivalently, that every element of $\Lambda^k$ can be expressed uniquely
as a polynomial in the $e_r$.


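The rth complete symmetric function $h_r$ is the sum of all monomials of degree
r, so $h_0 = 1$ and for r > 0,

$$h_r = \sum_{i_1 \le i_2 \le \cdots \le i_r} x_{i_1} x_{i_2} \cdots x_{i_r}.$$

The rth power sum symmetric function is

$$p_r = \sum_i x_i^r.$$

For any partition $\lambda = (\lambda_1, \lambda_2, \ldots)$, we define $h_\lambda = h_{\lambda_1} h_{\lambda_2} \cdots$ and $p_\lambda = p_{\lambda_1} p_{\lambda_2} \cdots$.
The generating functions

$$\sum_{r=0}^{\infty} h_r t^r = \prod_{i=1}^{\infty} \frac{1}{1 - x_i t} = \exp\left( \sum_{r=1}^{\infty} \frac{p_r}{r} t^r \right)$$

and

$$\sum_{r=0}^{\infty} e_r t^r = \prod_{i=1}^{\infty} (1 + x_i t) = \left( \sum_{r=0}^{\infty} (-1)^r h_r t^r \right)^{-1}$$
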
are easy to derive. They imply that $e_r$ can be expressed as a polynomial in the
$h_i$ and also in the $p_i$, and thus $\{h_\lambda\}_{\lambda \vdash k}$ and $\{p_\lambda\}_{\lambda \vdash k}$ are both bases for $\Lambda^k$. (Here
$\lambda \vdash k$ means that $\lambda$ is a partition of k.)
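Two standard consequences of these generating functions can be checked numerically in finitely many variables: comparing coefficients in the product of the two series gives $\sum_{i=0}^{r} (-1)^i e_i h_{r-i} = 0$ for $r \ge 1$, and differentiating the exponential form gives $r h_r = \sum_{i=1}^{r} p_i h_{r-i}$. The sketch below evaluates both at an arbitrary integer point. The sample values and function names are assumptions for illustration.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def e(r, xs):
    """Elementary symmetric function e_r evaluated at the numbers xs."""
    return sum(prod(c) for c in combinations(xs, r))

def h(r, xs):
    """Complete symmetric function h_r evaluated at the numbers xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, r))

def p(r, xs):
    """Power sum p_r evaluated at the numbers xs."""
    return sum(x ** r for x in xs)

xs = [2, 3, 5, 7]          # arbitrary sample point (hypothetical)
for r in range(1, 7):
    # coefficient of t^r in (sum e_i t^i) * (sum (-1)^j h_j t^j) = 1
    assert sum((-1) ** i * e(i, xs) * h(r - i, xs) for i in range(r + 1)) == 0
    # from differentiating sum h_r t^r = exp(sum p_r t^r / r)
    assert r * h(r, xs) == sum(p(i, xs) * h(r - i, xs) for i in range(1, r + 1))
print(e(2, xs), h(2, xs), p(2, xs))   # 101 188 87
```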

There is another important basis for $\Lambda^k$ which is less obvious. If $\lambda$ is a partition
with n parts, we define the Schur function (or S-function) $s_\lambda$ by

$$s_\lambda = \det(h_{\lambda_i - i + j})_{1 \le i,j \le n}, \qquad (14.1)$$

where we take $h_m = 0$ for m < 0.

The Schur functions (in a finite number of variables) arise very naturally from
irreducible representations of general linear groups. The irreducible polynomial
representations of the general linear group $GL_n$ (over the complex numbers) may
be indexed in a natural way by partitions with at most n parts. If $\varphi^\lambda$ is the
character of the representation associated with $\lambda$, then for any matrix M in $GL_n$ with
eigenvalues $x_1, x_2, \ldots, x_n$, we have $\varphi^\lambda(M) = s_\lambda(x_1, \ldots, x_n)$.

The expansions of the $s_\lambda$ in the other bases for $\Lambda^k$ are all interesting. The
expansion in elementary symmetric functions is a determinant similar to (14.1).

The expansions of Schur functions in power sum symmetric functions are related
to irreducible representations of symmetric groups. There is a natural way of
associating to each partition of k an irreducible representation of the symmetric group
$\mathfrak{S}_k$ (see also chapter 12). Let us denote by $\chi^\lambda$ the character of the representation
associated with $\lambda$, and by $\chi^\lambda_\rho$ its value at an element of $\mathfrak{S}_k$ of cycle type $\rho$. Then,
if $\lambda$ is a partition of k,

$$s_\lambda = \sum_{\rho \vdash k} \chi^\lambda_\rho \frac{p_\rho}{z_\rho},$$

where $z_\rho = \prod_i i^{m_i} m_i!$ if $\rho$ has $m_i$ parts equal to i.

The coefficients of $s_\lambda$ (which give its expansion into monomial symmetric
functions) have an interesting combinatorial interpretation. The Ferrers diagram of a
partition $\lambda$ is an arrangement of cells with $\lambda_i$ cells, left justified, in the ith row.
Thus the Ferrers diagram of the partition (4, 3, 1) is

    □ □ □ □
    □ □ □
    □

A column-strict plane partition of shape $\lambda$ is a filling of the Ferrers diagram of $\lambda$
with positive integers which decrease weakly from left to right and strictly from
top to bottom. For example,

is a column-strict plane partition of shape (4,3,1). Then the coefficient of
$x_1^{\gamma_1} x_2^{\gamma_2} \cdots$ in $s_\lambda$ is the number of column-strict plane partitions of shape $\lambda$
containing $\gamma_i$ entries equal to i.
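Setting N variables equal to 1 and the rest to 0 in this interpretation, $s_\lambda$ counts the column-strict plane partitions of shape $\lambda$ with entries at most N. The sketch below compares that count, found by brute force, with the determinant (14.1) evaluated using $h_r(1,\ldots,1) = \binom{N+r-1}{r}$. The function names and the small test shapes are assumptions for illustration.

```python
from itertools import permutations, product
from math import comb

def h_at_ones(r, N):
    """h_r evaluated at N variables equal to 1 (zero for negative r)."""
    return comb(N + r - 1, r) if r >= 0 else 0

def det(M):
    """Integer determinant by the permutation expansion (fine for small matrices)."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        term = sign
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total

def schur_at_ones(lam, N):
    """The determinant (14.1), with 0-based indices, at N variables equal to 1."""
    n = len(lam)
    return det([[h_at_ones(lam[i] - i + j, N) for j in range(n)] for i in range(n)])

def count_column_strict(lam, N):
    """Brute-force count of column-strict plane partitions with entries in 1..N."""
    cells = [(i, j) for i, part in enumerate(lam) for j in range(part)]
    count = 0
    for values in product(range(1, N + 1), repeat=len(cells)):
        T = dict(zip(cells, values))
        if all((j == 0 or T[i, j - 1] >= T[i, j]) and
               (i == 0 or T[i - 1, j] > T[i, j]) for (i, j) in cells):
            count += 1
    return count

for lam in [(2, 1), (2, 2), (3, 1), (3, 1, 1)]:
    assert schur_at_ones(lam, 3) == count_column_strict(lam, 3)
print(schur_at_ones((2, 1), 3))   # 8
```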
The weight of a plane partition is the sum of its entries. If we set $x_i = q^i$ in
$s_\lambda$, we get the generating function by weight for column-strict plane partitions of
shape $\lambda$. There is a very nice explicit formula for this generating function, which
can be stated most elegantly in terms of the hook lengths of $\lambda$. We define the hook
length of a cell in a Ferrers diagram to be the number of cells to its right plus the
number of cells below it plus one. Thus the hook lengths for the partition (4,3,1)
are

    6 4 3 1
    4 2 1
    1


Theorem 14.2. The generating function by weight for column-strict plane partitions
of shape $\lambda$ is

$$q^{N(\lambda)} \prod_c \frac{1}{1 - q^{h(c)}},$$

where the product is over all cells c of the Ferrers diagram of $\lambda$, h(c) is the hook
length of c, and $N(\lambda) = \sum_i i\lambda_i$.
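The hook-length formula can be tested on a small shape by expanding the product as a truncated power series and comparing with a brute-force enumeration by weight. This is a sketch under stated assumptions: the function names, the truncation bound W, and the test shape (2,1) are choices made here for illustration.

```python
from itertools import product

def hook_lengths(lam):
    """Hook lengths of the Ferrers diagram of lam, row by row."""
    # col[j] = number of rows that have a cell in column j
    col = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    return [[(lam[i] - j - 1) + (col[j] - i - 1) + 1 for j in range(lam[i])]
            for i in range(len(lam))]

def hook_series(lam, W):
    """Coefficients of q^0..q^W in q^{N(lam)} * prod over cells of 1/(1 - q^{h(c)})."""
    N = sum((i + 1) * part for i, part in enumerate(lam))
    coeffs = [0] * (W + 1)
    if N <= W:
        coeffs[N] = 1
    for row in hook_lengths(lam):
        for hk in row:
            for w in range(hk, W + 1):      # multiply the series by 1/(1 - q^hk)
                coeffs[w] += coeffs[w - hk]
    return coeffs

def brute_series(lam, W):
    """Count column-strict plane partitions of shape lam by weight, up to W."""
    cells = [(i, j) for i, part in enumerate(lam) for j in range(part)]
    coeffs = [0] * (W + 1)
    for values in product(range(1, W + 1), repeat=len(cells)):
        if sum(values) > W:
            continue
        T = dict(zip(cells, values))
        if all((j == 0 or T[i, j - 1] >= T[i, j]) and
               (i == 0 or T[i - 1, j] > T[i, j]) for (i, j) in cells):
            coeffs[sum(values)] += 1
    return coeffs

print(hook_lengths((4, 3, 1)))    # [[6, 4, 3, 1], [4, 2, 1], [1]]
assert hook_series((2, 1), 9) == brute_series((2, 1), 9)
```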


For the proof of this theorem, and other results on plane partitions, see Stanley 
(1971) or Macdonald (1979). 

One of the most famous theorems of enumerative combinatorics is the theorem
of Pólya (1937) on counting orbits under a group action. (See also Pólya and Read
1987.) Pólya’s theorem can be stated in several different ways, but one of the most
useful is in terms of symmetric functions.

Suppose that a finite group G acts on a finite set A. Then G also acts on functions
$f : A \to \mathbb{N}$, where $\mathbb{N}$ is the set of positive integers: for $g \in G$ and $f : A \to \mathbb{N}$, we
define $g \cdot f$ by $(g \cdot f)(a) = f(g^{-1} \cdot a)$. We define the weight of a function $f : A \to \mathbb{N}$
to be the monomial $\prod_{a \in A} x_{f(a)}$. It is clear that two functions in the same orbit of
G have the same weight, so we may define the weight of an orbit to be the weight
of any of its elements. Pólya’s theorem gives a formula for the sum of the weights
of all orbits of functions. We may think of a function $A \to \mathbb{N}$ as a “coloring” of the
elements of A, so Pólya’s theorem enables us to count colorings which are distinct
with respect to the action of G.


Pólya’s theorem is a consequence of an elementary result in group theory, often
called Burnside’s lemma:


Lemma 14.3. Suppose that a finite group G acts on a weighted set X, and that weights
are constant on orbits. Define the weight of an orbit to be the weight of any of its
elements. For each g in G let $\Phi(g)$ be the sum of the weights of the elements of X
fixed by g. Then the sum of the weights of the orbits is

$$\frac{1}{|G|} \sum_{g \in G} \Phi(g).$$


If G acts on a finite set A, then to each element g of G we may associate a
permutation $\pi_g$ of A by $\pi_g(a) = g \cdot a$ for a in A. We define the cycle index for the
action of G on A to be the symmetric function

$$Z(G) = \frac{1}{|G|} \sum_{g \in G} p_1^{j_1(g)} p_2^{j_2(g)} \cdots, \qquad (14.4)$$

where $j_k(g)$ is the number of k-cycles in the cycle decomposition of $\pi_g$. We may
now state Pólya’s theorem:


Theorem 14.5. The sum of the weights of the orbits of functions on A under the 
action of G is Z(G). 


Proof. It is not hard to see that a function $f : A \to \mathbb{N}$ is fixed by $g \in G$ if and only
if f is constant on each cycle of $\pi_g$. Thus the sum of the weights of the functions
fixed by g is $p_1^{j_1(g)} p_2^{j_2(g)} \cdots$. Then the theorem follows by applying Lemma 14.3 to
the action of G on the set X of functions from A to $\mathbb{N}$. □


One of the simplest applications of Pólya’s theorem is to counting equivalence
classes of words under the relation of conjugacy introduced in section 5. If we
take A to be the set [n], then the functions $A \to \mathbb{N}$ may be identified with words
of length n in $\mathbb{N}^*$. Let G be the cyclic group $C_n$, acting in the usual way on [n].
Then two words are in the same orbit under the action of $C_n$ if and only if they
are conjugates. To evaluate the cycle index of G, let g be a generator for $C_n$. Then
$\pi_{g^m}$ has d cycles, each of length n/d, where d is the greatest common divisor of m
and n. There are $\phi(n/d)$ values of m corresponding to each divisor d of n, where
$\phi$ is Euler’s totient function, and thus

$$Z(C_n) = \frac{1}{n} \sum_{d \mid n} \phi(n/d)\, p_{n/d}^{d}. \qquad (14.6)$$

In particular, the number of equivalence classes under conjugation of words in
$[k]^n$ is obtained by setting $x_1 = x_2 = \cdots = x_k = 1$, $x_i = 0$ for i > k, in (14.6), so that
$p_i = k$, and (14.6) becomes $n^{-1} \sum_{d \mid n} \phi(n/d) k^d$.
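The count $n^{-1} \sum_{d \mid n} \phi(n/d) k^d$ is easy to compare with a direct enumeration of words up to cyclic rotation. The following sketch does this for small n and k. The function names are illustrative assumptions.

```python
from itertools import product
from math import gcd

def totient(n):
    """Euler's totient function, computed directly from the definition."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def conjugacy_classes(n, k):
    """Number of conjugacy classes of words of length n over k letters."""
    total = sum(totient(n // d) * k ** d for d in range(1, n + 1) if n % d == 0)
    return total // n

def conjugacy_classes_brute(n, k):
    """Same count, by picking the least rotation of every word as representative."""
    representatives = set()
    for w in product(range(k), repeat=n):
        representatives.add(min(w[i:] + w[:i] for i in range(n)))
    return len(representatives)

for n in range(1, 8):
    for k in range(1, 4):
        assert conjugacy_classes(n, k) == conjugacy_classes_brute(n, k)
print(conjugacy_classes(6, 2))   # 14
```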

For a more complicated example, we count isomorphism classes of graphs on n
vertices. We start with the action of the symmetric group $\mathfrak{S}_n$ on [n]. This action
yields in a natural way an action on the set A of unordered pairs of distinct elements
of [n], which are the edges of the complete graph $K_n$ on [n]. Then a function from
A to $\mathbb{N}$ may be thought of as a coloring of the edges of $K_n$. There is a bijection
between 2-colorings of edges of $K_n$ and all graphs on [n]: given a graph G on [n],
we assign an edge of $K_n$ color 1 if it is in G and color 2 if it is not in G. Two
graphs are isomorphic if and only if their corresponding 2-colorings of $K_n$ are in
the same orbit. Thus to count isomorphism classes of graphs we need only find
the cycle index for this action of $\mathfrak{S}_n$, then substitute $x_1 = x_2 = 1$, $x_i = 0$ for i > 2,
which gives $p_i = 2$ for all i.

We shall show that the cycle index is

$$\sum \frac{1}{\prod_k k^{m_k} m_k!} \prod_k \left( p_k p_{2k}^{k-1} \right)^{m_{2k}} \prod_k p_{2k+1}^{k m_{2k+1}} \prod_k p_k^{k \binom{m_k}{2}} \prod_{i<j} p_{\mathrm{lcm}(i,j)}^{\gcd(i,j)\, m_i m_j}, \qquad (14.7)$$

where the sum is over all $m_1, m_2, \ldots$ satisfying $m_1 + 2m_2 + \cdots = n$, and lcm and gcd
denote the least common multiple and greatest common divisor. To see this, we
first observe that the cycle type of $\pi_g$ for g in $\mathfrak{S}_n$ depends only on the cycle type of
g. The number of permutations in $\mathfrak{S}_n$ with $m_i$ cycles of length i for each i, where
$\sum_i i m_i = n$, is

$$\frac{n!}{1^{m_1} m_1!\, 2^{m_2} m_2! \cdots}.$$

For such a permutation g we must determine the cycle type of $\pi_g$, the permutation
on pairs induced by g.

First we consider pairs in which both elements lie in the same cycle of g. It turns
out that we must consider separately cycles of even length and of odd length. In a
cycle of g of even length 2k, the pairs $\{\alpha, g^k(\alpha)\}$ constitute a single cycle of length
k; all the other pairs lie in cycles of length 2k, and there are k - 1 of these cycles.
Thus this cycle of g contributes a factor $p_k p_{2k}^{k-1}$ to the product in (14.7); since there
are altogether $m_{2k}$ cycles of this length, their contribution is $(p_k p_{2k}^{k-1})^{m_{2k}}$. For cycles
of g of odd length 2k + 1, every pair is in a cycle of $\pi_g$ of length 2k + 1, and there
are k of these, yielding the second product in (14.7).

Next we consider pairs in which the two elements lie in different cycles of g.
First suppose that $\alpha$ and $\beta$ lie in two different cycles of g of the same length k.
Then $\{\alpha, \beta\}$ is in a cycle of length k of $\pi_g$. The pairs obtained from these two
cycles of g will constitute k cycles of $\pi_g$, and there are $\binom{m_k}{2}$ ways to choose two
cycles of g of length k. This explains the third product in (14.7). Finally, the last
product in (14.7) corresponds to the case of a pair of elements from two cycles of
g of lengths i and j, with i < j. Each pair will lie in a cycle of $\pi_g$ of length lcm(i, j).
The pairs obtained from these two cycles of g will constitute gcd(i, j) cycles of $\pi_g$,
and there are $m_i m_j$ ways to choose two cycles of g of these lengths.

Although (14.7) looks complicated, it is actually useful in computing the number 
of unlabeled graphs on n vertices, as long as n is not too large. For a comprehensive 
account of applications of Pólya’s theorem to graphical enumeration, see Harary
and Palmer (1973). 
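The cycle-counting argument behind (14.7) translates directly into a short computation of the number of unlabeled graphs. In the sketch below, each cycle type of a permutation of [n] contributes 2 raised to the number of cycles of the induced permutation on pairs (floor(a/2) for each cycle of length a, plus gcd(a, b) for each pair of cycles), weighted by $1/\prod_k k^{m_k} m_k!$. The function names are assumptions for illustration.

```python
from fractions import Fraction
from math import factorial, gcd

def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing lists of parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield [k] + rest

def pair_cycles(parts):
    """Number of cycles of the induced permutation on unordered pairs."""
    cycles = 0
    for idx, a in enumerate(parts):
        cycles += a // 2                 # pairs inside one cycle of length a
        for b in parts[idx + 1:]:
            cycles += gcd(a, b)          # pairs taken from two different cycles
    return cycles

def count_graphs(n):
    """Number of isomorphism classes of graphs on n vertices (p_i = 2 in (14.7))."""
    total = Fraction(0)
    for parts in partitions(n):
        weight = Fraction(1)
        for k in set(parts):
            m = parts.count(k)
            weight /= k ** m * factorial(m)
        total += weight * 2 ** pair_cycles(parts)
    return int(total)

print([count_graphs(n) for n in range(1, 7)])   # [1, 2, 4, 11, 34, 156]
```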
1. Introduction


Asymptotic enumeration methods provide quantitative information about the rate
of growth of functions that count combinatorial objects. Typical questions that
these methods answer are: (1) How does the number of partitions of a set of
n elements grow with n? (2) How does this number compare to the number of
permutations of that set?

There do exist enumeration results that leave nothing to be desired. For example,
if $a_n$ denotes the number of subsets of a set with n elements, then we trivially
have $a_n = 2^n$. This answer is compact and explicit, and yields information about all
aspects of this function. For example, congruence properties of $a_n$ reduce to
well-studied number-theory questions. (This is not to say that all such questions have
been answered, though!) The formula $a_n = 2^n$ also provides complete quantitative
information about $a_n$. It is easy to compute for any value of n, its behavior is about
as simple as possible, and it holds uniformly for all n. However, such examples are
extremely rare. Usually, even when there is a formula for the function we are
interested in, it is a complicated one, involving summations or recurrences. The
purpose of asymptotic methods is to provide simple explicit formulas that describe
the behavior of a sequence for large values of indices. There is no satisfactory
definition of what is meant by “simple” or by “explicit”. However, we can illustrate
this concept by some examples. The number of permutations of n letters is given
by $b_n = n!$. This is a compact notation, but only in the sense that factorials are so
widely used that they have a special symbol. The symbol n! stands for
$n \cdot (n-1) \cdot (n-2) \cdots 2 \cdot 1$, and it is the latter formula that has to be used to answer
questions about the number of permutations. If one is after arithmetic information,
such as the highest power of 7, say, that divides n!, one can obtain it from the
product formula, but even then some work has to be done. For most quantitative
purposes, however, $n! = n \cdot (n-1) \cdots 2 \cdot 1$ is inadequate. Since this formula is a
product of n terms, most of them large, it is clear that n! grows rapidly, but it is not
obvious just how rapidly. Since all but the last term are $\ge 2$, we have $n! \ge 2^{n-1}$,
and since all but the last two terms are $\ge 3$, we have $n! \ge 3^{n-2}$, and so on. On the
other hand, each term is $\le n$, so $n! \le n^n$. Better bounds can clearly be obtained
with greater care. The question such estimates raise is: just how far can one go?
Can one obtain an estimate for n! that is easy to understand, compute, and
manipulate? One answer provided by asymptotic methods is Stirling’s formula: n! is
asymptotic to $(2\pi n)^{1/2}(n/e)^n$ as $n \to \infty$, which means that the limit as $n \to \infty$ of
$n!\,(2\pi n)^{-1/2}(n/e)^{-n}$ exists and equals 1. This formula is concise and gives a useful
representation of the growth rate of n!. It shows, for example, that for n large,
the number of permutations on n letters is considerably larger than the number of
subsets of a set with $\lfloor \frac{1}{2} n \log n \rfloor$ elements.
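A quick numerical look at these statements: the sketch below checks the crude bounds $2^{n-1} \le n! \le n^n$ and prints the ratio of n! to Stirling's approximation $(2\pi n)^{1/2}(n/e)^n$ for a few values of n. The function names are assumptions for illustration, and floating-point arithmetic limits the accuracy for large n.

```python
import math

def stirling(n):
    """Stirling's approximation (2*pi*n)^(1/2) * (n/e)^n as a float."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

def stirling_ratio(n):
    """Ratio n! / Stirling(n); it tends to 1 as n grows."""
    return math.factorial(n) / stirling(n)

for n in range(1, 30):
    # the crude bounds from the text
    assert 2 ** (n - 1) <= math.factorial(n) <= n ** n

for n in (1, 5, 10, 50, 100):
    print(n, stirling_ratio(n))
```

The printed ratios decrease toward 1. Already at n = 10 the approximation is within one percent of the exact value.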

Another simple example of an asymptotic estimate occurs in the “problème des
rencontres” (Comtet 1974). The number $d_n$ of derangements of n letters, which is
the number of ways of handing back hats to n people so that no person receives
his or her own hat, is given by

$$d_n = \sum_{k=0}^{n} (-1)^k \frac{n!}{k!}. \qquad (1.1)$$

This is a nice formula, yet to compute $d_n$ exactly with it requires substantial effort,
since the summands are large, and at first glance it is not obvious how large $d_n$ is.
However, we can obtain from (1.1) the asymptotic estimate

$$\frac{d_n}{n!} \to e^{-1} \quad \text{as } n \to \infty. \qquad (1.2)$$

To prove (1.2), we factor out n! from the sum in (1.1). We are then left with a sum
of rapidly decreasing terms that make up the initial segment of the series

$$e^{-1} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!},$$

and (1.2) follows easily. It can even be shown that $d_n$ is the nearest integer to $e^{-1} n!$
for all $n \ge 1$, see (Comtet 1974). The estimate (1.2) does not allow us to compute
$d_n$, but combined with the estimate for n! cited above it shows that $d_n$ grows like
$(2\pi n)^{1/2} n^n e^{-n-1}$. Further, (1.2) shows that the fraction of all ways of handing out
hats that results in every person receiving somebody else’s hat is approximately
1/e. Results of this type are often exactly what is desired.
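Both the exact formula (1.1) and the nearest-integer statement are easy to try out. The sketch below evaluates (1.1) with exact integer arithmetic and compares it with the rounded value of n!/e. The function name is an assumption, and the comparison is limited to small n, where floating-point rounding of n!/e is reliable.

```python
from math import e, factorial

def derangements(n):
    """d_n from formula (1.1), using exact integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

print([derangements(n) for n in range(1, 9)])   # [0, 1, 2, 9, 44, 265, 1854, 14833]

for n in range(1, 13):
    # nearest-integer property; only checked where float precision is ample
    assert derangements(n) == round(factorial(n) / e)
```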

Asymptotic estimates usually provide information only about the behavior of a 
function as the arguments get large. For example, the estimate for n! cited above 
says only that the ratio of n! to $(2\pi n)^{1/2}(n/e)^n$ tends to 1 as n gets large, and
says nothing about the behavior of this ratio for any specific value of n. There 
are much sharper and more precise bounds for n!, and they will be presented in 
section 3. However, it is generally true that the simpler the estimate, the weaker 
and less precise it is. There seems to be an unavoidable tradeoff between conciseness
and precision. Just about the simplest formula that exactly expresses n! is
$n \cdot (n-1) \cdots 2 \cdot 1$. (We have to be careful, since there is no generally accepted
definition of simplicity, and in many situations it is better to use other exact formulas
for n!, such as the integral formula $n! = \int_0^\infty t^n e^{-t}\,dt$ for the $\Gamma$-function. There
are also methods for evaluating n! that are somewhat more efficient than the 
straightforward evaluation of the product.) Any other formula is likely to involve 
some loss of accuracy as a penalty for simplicity. 

Sometimes, the tradeoffs are clear. Let p(n) denote the number of partitions of
an integer n. The Rademacher convergent series representation (Andrews 1976,
Ayoub 1963) for p(n) is valid for any $n \ge 1$:

$$p(n) = \pi^{-1} 2^{-1/2} \sum_{m=1}^{\infty} A_m(n)\, m^{1/2} \left. \frac{d}{dv} \left( \lambda_v^{-1} \sinh(C m^{-1} \lambda_v) \right) \right|_{v=n}, \qquad (1.3)$$

where

$$C = \pi (2/3)^{1/2}, \qquad \lambda_v = (v - 1/24)^{1/2}, \qquad (1.4)$$

and the $A_m(n)$ satisfy

$$A_1(n) = 1, \quad A_2(n) = (-1)^n \quad \text{for all } n \ge 1,$$
$$|A_m(n)| \le m \quad \text{for all } m, n \ge 1,$$

and are easy to compute. Remarkably enough, the series (1.3) does yield the exact 
integer value of p(n) for every n, and it converges rapidly. (Although this is not 
directly relevant, we note that using this series to compute p(n) gives an algorithm 
for calculating p(n) that is close to optimal, since the number of bit operations is 
not much larger than the number of bits of p(n).) By taking more and more terms, 
we obtain better and better approximations. The first term in (1.3) shows that 


$$p(n) = \pi^{-1} 2^{-1/2} \left. \frac{d}{dv} \left( \lambda_v^{-1} \sinh(C \lambda_v) \right) \right|_{v=n} + O(n^{-1} \exp(C n^{1/2}/2)), \qquad (1.5)$$

and if we do not like working with hyperbolic sines, we can derive from (1.5) the
simpler (but less precise) estimate

$$p(n) = \frac{1 + O(n^{-1/2})}{4 \cdot 3^{1/2}\, n}\, e^{C n^{1/2}}, \qquad (1.6)$$

valid for all $n \ge 1$. Unfortunately, exact and rapidly convergent series such as (1.3)
occur infrequently in enumeration, and in general we have to be content with
poorer approximations.
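To get a feeling for the quality of the simple estimate (1.6), one can compare its main term $e^{C n^{1/2}} / (4 \cdot 3^{1/2} n)$ with exact values of p(n). The sketch below computes p(n) by a standard dynamic program over part sizes (not the Rademacher series) and prints the ratio. The function names are assumptions for illustration.

```python
import math

def partition_numbers(N):
    """Return [p(0), p(1), ..., p(N)] by adding one allowed part size at a time."""
    p = [1] + [0] * N
    for part in range(1, N + 1):
        for total in range(part, N + 1):
            p[total] += p[total - part]
    return p

def main_term(n):
    """Main term of (1.6): exp(C * sqrt(n)) / (4 * sqrt(3) * n), C = pi * sqrt(2/3)."""
    C = math.pi * math.sqrt(2.0 / 3.0)
    return math.exp(C * math.sqrt(n)) / (4 * math.sqrt(3) * n)

p = partition_numbers(1000)
for n in (10, 100, 1000):
    print(n, p[n], p[n] / main_term(n))
```

The ratio approaches 1 slowly, in keeping with the $O(n^{-1/2})$ relative error in (1.6).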

The advantage of allowing parameters to grow large is that in surprisingly many 
cases, even when there do exist explicit expressions for the functions we are in- 
terested in, this procedure does yield simple asymptotic approximations, when the 
influence of less important factors falls off. The resulting estimates can then be 
used to compare numbers of different kinds of objects, decide what are the most 
common objects in some category, and so on. Even in situations where bounds valid 
for all parameter values are needed, asymptotic estimates can be used to suggest 
what form those bounds should take. Usually the error terms in asymptotic esti- 
mates can be made explicit (although good bounds often require substantial work), 
and can be used together with computations of small values to obtain universal 
estimates. It is common that already for n not much larger than 10 (where n is the 
basic parameter) the asymptotic estimate is accurate to within a few percentage 
points, and for n > 100 it is accurate to within a fraction of a percentage point, 
even though known proofs do not guarantee results as good as this. Therefore the 
value of asymptotic estimates is much greater than if they just provided a picture 
of what happens at infinity. 

Under some conditions, asymptotic results can be used to prove completely
uniform results. For example, if there were any planar maps that were not
four-colorable, then almost every large planar map would not be four-colorable, as it
would contain one of those small pathological maps. Therefore if it could be proved
that most large planar maps are four-colorable, we would obtain a new proof of
the four-color theorem that would be more satisfactory to many people than the
original one of Haken and Appel. Unfortunately, while this is an attractive idea,
no proof of the required asymptotic estimate for the normal chromatic number of
planar maps has been found so far.

Asymptotic estimates are often useful in deciding whether an identity is true. If 
the growth rates of the two functions that are supposed to be equal are different, 
then the coincidence of initial values must be an accident. There are also more 
ingenious ways, such as that of Example 13.1, for deducing nonexistence of iden- 
tities in a wide class from asymptotic information. Sometimes asymptotics is used 
in a positive way, to suggest what identities might hold. 

Simplicity is an important advantage of asymptotic estimates. They are even
more useful when no explicit formulas for the function being studied are available,
and one has to deal with indirect relations. For example, let $T_n$ be the number
of rooted unlabeled trees with n vertices, so that $T_0 = 0$, $T_1 = T_2 = 1$, $T_3 = 2$,
$T_4 = 4, \ldots$. No explicit formula for the $T_n$ is known. However, if

$$T(z) = \sum_{n=1}^{\infty} T_n z^n \qquad (1.7)$$

is the ordinary generating function of $T_n$, then Cayley and Pólya showed that

$$T(z) = z \exp\left( \sum_{k=1}^{\infty} T(z^k)/k \right). \qquad (1.8)$$

This functional equation can be derived using the general Pólya-Redfield
enumeration method, an approach that is sketched in section 15. Example 15.1 shows how
analytic methods can be used to prove, starting with eq. (1.8), that

$$T_n \sim C r^{-n} n^{-3/2} \quad \text{as } n \to \infty, \qquad (1.9)$$

where

$$C = 0.4399237\ldots, \qquad r = 0.3383219\ldots, \qquad (1.10)$$

are constants that can be computed efficiently to high precision. For n = 20,
$T_n = 12{,}826{,}228$, whereas $C r^{-20} 20^{-3/2} = 1.274\ldots \times 10^7$, so asymptotic formula (1.9) is
accurate to better than 1%. Thus this approximation is good enough for many
applications. It can also be improved easily by adding lower-order terms.
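The numbers $T_n$ can be computed from (1.8) without an explicit formula. Taking the logarithmic derivative of (1.8) and comparing coefficients gives the standard recurrence $n T_{n+1} = \sum_{k=1}^{n} \bigl( \sum_{d \mid k} d\, T_d \bigr) T_{n-k+1}$, which the sketch below uses. The recurrence is a known consequence of (1.8) rather than something stated in the text, and the function name is an assumption.

```python
def rooted_trees(N):
    """Return [T_0, ..., T_N], the numbers of rooted unlabeled trees (N >= 1)."""
    T = [0] * (N + 1)
    T[1] = 1
    s = [0] * (N + 1)                     # s[k] = sum over divisors d of k of d * T[d]
    for n in range(1, N):
        s[n] = sum(d * T[d] for d in range(1, n + 1) if n % d == 0)
        T[n + 1] = sum(s[k] * T[n - k + 1] for k in range(1, n + 1)) // n
    return T

T = rooted_trees(20)
print(T[:6], T[20])                       # [0, 1, 1, 2, 4, 9] 12826228

# Compare with the asymptotic formula (1.9), using the constants in (1.10).
C, r = 0.4399237, 0.3383219
approx = C * r ** (-20) * 20 ** (-1.5)
print(approx / T[20])
```

The ratio printed on the last line is within one percent of 1, in line with the accuracy claimed for n = 20.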

Asymptotic enumeration methods are a subfield of the huge area of general 
asymptotic analysis. The functions that occur in enumeration tend to be of re- 
stricted form (often nonnegative and of regular growth, for example) and therefore 
the repertoire of tools that are commonly used is much smaller than in general 
asymptotics. This makes it possible to attempt a concise survey of the most impor- 
tant techniques in asymptotic enumeration. The task is not easy, though, as there 
has been tremendous growth in recent years in combinatorial enumeration and the 
closely related field of asymptotic analysis of algorithms, and the sophistication of 
the tools that are commonly used has been increasing rapidly. 

In spite of its importance and growth, asymptotic enumeration has seldom been 
presented in combinatorial literature at a level other than that of a research paper. 
There are several books that treat it (Bender and Williamson 1991, Comtet 1974,
Graham et al. 1989, Greene and Knuth 1982, Knuth 1973a,b, 1981, Wilf 1990), but 
usually only briefly. The only comprehensive survey that is available is the excellent 
and widely quoted paper of Bender (1974). Unfortunately it is somewhat dated. 
Furthermore, the last two decades have also witnessed a flowering of asymptotic
analysis of algorithms, which was pioneered and popularized by Knuth. Combi- 
natorial enumeration and analysis of algorithms are closely related, in that both 
deal with counting of particular structures. The methods used in the two fields are 
almost the same, and there has been extensive cross-fertilization between them. 
The literature on theoretical computer science, especially on average-case analysis 
of algorithms, can therefore be used fruitfully in asymptotic enumeration. One no- 
table survey paper in that area is that of Vitter and Flajolet (1990). There are also 
presentations of relevant methods in the books (Greene and Knuth 1982, Hofri 
1987, Knuth 1973a,b, 1981, Kemp 1984). Section 18 is a guide to the literature on 
these topics. 

The aim of this chapter is to survey the most important tools of asymptotic 
enumeration, point out references for the results and methods that are discussed, 
and to mention additional relevant papers that have other techniques that might be 
useful. It is intended for a reader who has already used combinatorial, algebraic, or 
probabilistic methods to reduce a problem to that of estimating sums, coefficients 
of a generating function, integrals, or terms in a sequence satisfying some recursion. 
How such a reduction is to be accomplished will be dealt with sparingly, since it 
is a large subject that is already covered extensively in other chapters, especially 
chapter 21. We will usually assume that this task has been done, and will discuss 
only the derivation of asymptotic estimates. 

The emphasis in this chapter is on elementary and analytic approaches to asymptotic
problems, relying extensively on explicit generating functions. There are other
ways to solve some of the problems we will discuss, and probabilistic methods in
particular can often be used instead. We will only make some general remarks and
give references to this approach in section 16.

The only methods that will be discussed in detail are fully rigorous ones. There 
are also methods, mostly from classical applied mathematics (cf. Bender and 
Orszag 1978) that are powerful and often give estimates when other techniques 
fail. However, we do not treat them extensively (aside from some remarks in sec- 
tion 16.4) since many of them are not rigorous. 

Few proofs are included in this chapter. The stress is on presentation of basic 
methods, with discussions of their range of applicability, statements of general 
estimates derivable from them, and examples of their applications. There is some 
repetitiveness in that several functions, such as n!, are estimated several times. The
purpose of doing this is to show how different methods compare in their power and 
ease of use. No attempt is made to present derivations starting from first principles. 
Some of the examples are given with full details of the asymptotic analysis, to 
explain the basic methods. Other examples are barely more than statements of 
results with a brief explanation of the method of proof and a reference to where 
the proof can be found. The reader might go through this chapter, possibly in a
random order, looking for methods that might be applicable to a specific problem,
or can look for a category of methods that might fit the problem and start by 
looking at the corresponding sections. 

There are no prerequisites for reading most of this chapter, other than acquain- 
tance with advanced calculus and elementary asymptotic estimates. Many of the 
results are presented so that they can be used in a cookbook fashion. However, 
many of the applications require knowledge of complex variables. 

Section 2 presents the basic notation used throughout the chapter. It is largely 
the standard one used in the literature, but it seemed worthwhile summarizing it 
in one place. Section 3 is devoted to a brief discussion of identities and related 
topics. While asymptotic methods are useful and powerful, they can often be either 
augmented or entirely replaced by identities, and this section points out how to 
use them. 

Section 4 summarizes the most important and most useful estimates in combi- 
natorial enumeration, namely those related to factorials and binomial coefficients. 
Section 5 is the first one to feature an in-depth discussion of methods. It deals with 
estimates of sums in terms of integrals, summation formulas, and the inclusion-exclusion
principle. However, it does not present the most powerful tool for estimation
of sums, namely generating functions. These are introduced in section 6,
which presents some of the basic properties of, and tools for dealing with, generating
functions. While most generating functions that are used in combinatorial
enumeration converge at least in some neighborhood of the origin, there are also 
many nonconvergent ones. Section 7 discusses some estimates that apply to all 
formal series, but are especially useful for nonconvergent ones. 

Section 8 is devoted to estimates for convergent power series that do not use
complex variables. While not as powerful as the analytic methods presented later, 
these techniques are easy to use and suffice in many applications. 

Section 9 presents a variety of techniques for determining the asymptotics of 
recurrence relations. Many of these methods are based on generating functions, 
and some use analytic methods that are discussed later in the chapter. They are
presented at this point because they are basic to combinatorial enumeration, and 
they also provide an excellent illustration of the power of generating functions. 

Section 10 is an introduction to the analytic methods for estimating generating 
functions. Many of the results mentioned here are common to all introductory 
complex analysis courses. However, there are also many, especially those in sec- 
tions 10.4 and 10.5, which are not as well known, and are of special value in 
asymptotics. 

Sections 11 and 12 present the main methods used in estimation of coefficients 
of analytic functions in a single variable. The basic principle is that the singularities
of the generating function that are closest to the origin determine the growth rate 
of the coefficients. If the function does not grow too fast as it approaches those 
singularities, the methods of section 11 are usually applicable, while if the growth 
rate is high, methods of section 12 are more appropriate. 

Sections 13-15 discuss extensions of the basic methods of sections 10-12 to
multivariate generating functions, integral transforms, and problems that involve
a combination of methods.

Section 16 is a collection of miscellaneous methods and results that did not 
easily fit into any other section, yet are important in asymptotic enumeration. 
Section 17 discusses the extent to which computer algebra systems can be used to 
derive asymptotic information. Finally, section 18 is a guide to further reading on 
asymptotics, since this chapter does not provide complete coverage of the topic. 


2. Notation 


The symbols $O$, $o$, and $\sim$ will have the usual meaning throughout this paper:


$f(z) = O(g(z))$ as $z \to w$ means $f(z)/g(z)$ is bounded as $z \to w$;
$f(z) = o(g(z))$ as $z \to w$ means $f(z)/g(z) \to 0$ as $z \to w$;
$f(z) \sim g(z)$ as $z \to w$ means $f(z)/g(z) \to 1$ as $z \to w$.


When an asymptotic relation is stated for an integer variable $n$ instead of $z$, it
will implicitly be taken to apply only for integer values of $n \to w$, and then we
will always have $w = \infty$ or $w = -\infty$. An introduction to the use of this notation
can be found in Graham et al. (1989). Only a slight acquaintance with it is
assumed: enough to see that $(1 + O(n^{-1/3}))^n = \exp(O(n^{2/3}))$ and
$\log(n + n^{1/2}) = \log(n) + n^{-1/2} - (2n)^{-1} + O(n^{-3/2})$.

The notation $x \to w^-$ for real $w$ means that $x$ tends to $w$ only through values
$x < w$.

Some asymptotic estimates refer to uniform convergence. As an example, the 
statement that $f(z) \sim (1-z)^{-2}$ as $z \to 1$ uniformly in $|\mathrm{Arg}(1-z)| < 2\pi/3$ means
that for every $\epsilon > 0$, there is a $\delta > 0$ such that

$$|f(z)(1-z)^2 - 1| \le \epsilon$$

for all $z$ with $0 < |1-z| < \delta$, $|\mathrm{Arg}(1-z)| < 2\pi/3$. This is an important concept,
since lack of uniform convergence is responsible for many failures of asymptotic 
methods to yield useful results. 

Generating functions will usually be written in the form 


$$f(z) = \sum_{n=0}^{\infty} f_n z^n, \tag{2.1}$$

and we will use the notation $[z^n]f(z)$ for the coefficient of $z^n$ in $f(z)$, so that
if $f(z)$ is defined by (2.1), $[z^n]f(z) = f_n$. For multivariate generating functions,
$[x^m y^n]f(x,y)$ will denote the coefficient of $x^m y^n$, and so on. If $a_n$ denotes a sequence
whose asymptotic behavior is to be studied, then in combinatorial enumeration
one usually uses either the ordinary generating function $f(z)$ defined by (2.1) with
$f_n = a_n$, or else the exponential generating function $f(z)$ defined by (2.1) with
$f_n = a_n/n!$. In this chapter we will not be concerned with the question of which type
of generating function is best in a given context, but will assume that a generating
function is given, and will concentrate on methods of extracting information about
the coefficients from the form we have.

Asymptotic series, as defined by Poincaré, are written as 


$$f_n \sim \sum_{k=0}^{\infty} a_k n^{-k}, \tag{2.2}$$

and mean that for every $K \ge 0$,

$$f_n = \sum_{k=0}^{K} a_k n^{-k} + O(n^{-K-1}) \quad \text{as } n \to \infty. \tag{2.3}$$

The constant implied by the O-notation may depend on K. It is unfortunate that the 
same symbol is used to denote an asymptotic series and an asymptotic relation, as 
defined in the first paragraph of this section. Confusion should be minimal, though, 
since asymptotic relations will always be written with an explicit statement of the 
limit of the argument. 

The notation $f(z) \approx g(z)$ will be used to indicate that $f(z)$ and $g(z)$ are in some
vague sense close together. It is used in this chapter only in cases where a precise 
statement would be cumbersome and would not help in explaining the essence of 
the argument. 

All logarithms will be natural ones to base $e$ unless specified otherwise, so that
$\log 8 = 2.0794\ldots$, $\log_2 8 = 3$. The symbol $\lfloor x \rfloor$ denotes the greatest integer $\le x$.
The notation $x \to 1^-$ means that $x$ tends to 1, but only from the left, and similarly,
$x \to 0^+$ means that $x$ tends to 0 only from the right, through positive values.


3. Identities, indefinite summations, and related approaches 


Asymptotic estimates are useful, but often they can be avoided by using other 
methods. For example, the asymptotic methods presented later yield estimates for 
$\binom{n}{k} 2^k$ as $k$ and $n$ vary, which can be used to estimate accurately the sum of $\binom{n}{k} 2^k$
for $n$ fixed and $k$ running over the full range from 0 to $n$. That is a general and
effective process, but somewhat cumbersome. On the other hand, by the binomial
theorem,

$$\sum_{k=0}^{n} \binom{n}{k} 2^k = (1+2)^n = 3^n. \tag{3.1}$$

This is much more satisfactory and simpler to derive than what could be obtained
from applying asymptotic methods to estimate individual terms in the sum. However,
such identities are seldom available. There is nothing similar that can be
applied to

$$\sum_{k \le n/5} \binom{n}{k} 2^k, \tag{3.2}$$

and we are forced to use asymptotic methods to estimate this sum.

Recognizing when some combinatorial identity might apply is not easy. The 
literature on this subject is huge, and some of the references for it are Gould (1972),
Gradshteyn and Ryzhik (1965), Hansen (1975), Jolley (1961), and Riordan (1968).
Many of the books listed in the references are useful for this purpose. Generating
functions (see section 6) are one of the most common and powerful tools for
proving identities. Here we only mention two recent developments that are of
significance for both theoretical and practical reasons. One is Gosper’s algorithm
for indefinite hypergeometric summation (Gosper 1978, Graham et al. 1989). Given
a sequence $a_1, a_2, \ldots$, Gosper’s algorithm determines whether the sequence of
partial sums

$$b_n = \sum_{k=1}^{n} a_k, \qquad n = 1, 2, \ldots, \tag{3.3}$$

has the property that $b_n/b_{n-1}$ is a rational function of $n$, and if it is, it gives an
explicit form for $b_n$. We note that if $b_n/b_{n-1}$ is a rational function of $n$, then so is

$$\frac{a_n}{a_{n-1}} = \frac{b_n/b_{n-1} - 1}{1 - b_{n-2}/b_{n-1}}. \tag{3.4}$$

Therefore Gosper’s algorithm should be applied only when $a_n/a_{n-1}$ is rational.
The other recent development is the Wilf-Zeilberger method for proving combinatorial
identities (Wilf and Zeilberger 1990, 1992). Given a conjectured identity,
it provides an algorithmic procedure for verifying it. This method succeeds in a
surprisingly wide range of cases. Typically, to prove an identity of the form

$$\sum_k U(n,k) = S(n), \qquad n \ge 0, \tag{3.5}$$

where $S(n) \ne 0$, Wilf and Zeilberger define $F(n,k) = U(n,k)/S(n)$ and search for
a rational function $R(n,k)$ such that if $G(n,k) = R(n,k)F(n,k-1)$, then

$$F(n+1,k) - F(n,k) = G(n,k+1) - G(n,k) \tag{3.6}$$

holds for all integers $n, k$ with $n \ge 0$, and such that

(i) for each integer $k$, the limit

$$f_k = \lim_{n \to \infty} F(n,k) \tag{3.7}$$

exists and is finite;

(ii) for each integer $n \ge 0$, $\lim_{k \to \pm\infty} G(n,k) = 0$;

(iii) $\lim_{k \to -\infty} \sum_{n=0}^{\infty} G(n,k) = 0$.

If all these conditions are satisfied, and eq. (3.5) holds for $n = 0$, then it holds for
all $n \ge 0$.
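
As a small illustration of eq. (3.6), the following Python sketch checks the Wilf-Zeilberger equation numerically for the elementary identity $\sum_k \binom{n}{k} = 2^n$. The choice of identity, the constant certificate $R(n,k) = -1/2$, and all function names are assumptions made for this sketch and are not taken from the text; a finite check of this kind is a sanity test, not a proof.

```python
from fractions import Fraction
from math import comb


def binom(n, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def F(n, k):
    # F(n, k) = U(n, k) / S(n) with U(n, k) = C(n, k) and S(n) = 2^n.
    return Fraction(binom(n, k), 2 ** n)


def R(n, k):
    # Hypothetical certificate for this particular identity (a constant).
    return Fraction(-1, 2)


def G(n, k):
    # Convention of the text: G(n, k) = R(n, k) * F(n, k - 1).
    return R(n, k) * F(n, k - 1)


def wz_equation_holds(n_max, k_values):
    """Check F(n+1,k) - F(n,k) == G(n,k+1) - G(n,k) on a finite grid."""
    return all(
        F(n + 1, k) - F(n, k) == G(n, k + 1) - G(n, k)
        for n in range(n_max + 1)
        for k in k_values
    )


if __name__ == "__main__":
    print(wz_equation_holds(30, range(-5, 40)))
    print(all(sum(F(n, k) for k in range(n + 1)) == 1 for n in range(31)))
```

Exact rational arithmetic is used so that the comparison in (3.6) is an equality test rather than a floating-point tolerance check.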


Example 3.1 (Dixon’s binomial-sum identity). This identity states that

$$\sum_k (-1)^k \binom{n+b}{n+k} \binom{n+c}{c+k} \binom{b+c}{b+k} = \frac{(n+b+c)!}{n!\,b!\,c!}. \tag{3.8}$$

This can be proved by the Wilf-Zeilberger method by taking

$$R(n,k) = \frac{(b+1-k)(c+1-k)}{2(n+k)(n+b+c+1)} \tag{3.9}$$

and verifying that the conditions above hold. ∎


The Wilf-Zeilberger method requires finding a rational function R(n,k) that 
satisfies the properties listed above. This is often hard to do, especially by hand. 
Gosper’s algorithm leads to a systematic procedure for constructing such R(n, k). 

To conclude this section, we mention that useful resources when investigating
sequences arising in combinatorial settings are the books of Sloane (1973) and
Sloane and Plouffe (1995), which list several thousand sequences and give references
for them. Section 17 mentions some software systems that are useful in
asymptotics.


4. Basic estimates: factorials and binomial coefficients 


No functions in combinatorial enumeration are as ubiquitous and important as the 
factorials and the binomial coefficients. In this section we state some estimates for 
these quantities, which will be used throughout this chapter and are of widespread 
applicability. Several different proofs of some of these estimates will be sketched 
later. 

The basic estimate, from which many others follow, is that for the factorial. As 
was mentioned in the introduction, the basic form of Stirling’s formula is 


$$n! \sim (2\pi n)^{1/2} n^n e^{-n} \quad \text{as } n \to \infty. \tag{4.1}$$


This is sufficient for many enumeration problems. However, when necessary one 
can draw on much more accurate estimates. For example eq. 6.1.38 in NBS (1970) 
gives 

$$n! = (2\pi n)^{1/2} n^n \exp(-n + \theta/(12n)) \tag{4.2}$$

for all $n \ge 1$, where $\theta = \theta(n)$ satisfies $0 < \theta < 1$. More generally, there is Stirling’s
asymptotic expansion:

$$\log\{n!\,(2\pi n)^{-1/2} n^{-n} e^{n}\} \sim \frac{1}{12n} - \frac{1}{360n^3} + \cdots. \tag{4.3}$$

[This is an asymptotic series in the sense of eq. (2.2), and there is no convergent
expansion for $\log\{n!\,(2\pi n)^{-1/2} n^{-n} e^{n}\}$ as a power series in $n^{-1}$.] Further terms in
the expansion (4.3) can be obtained, and they involve Bernoulli numbers. In most
references, such as eq. 6.1.37 or 6.1.40 of NBS (1970), Stirling’s formula is presented
for $\Gamma(x)$, where $\Gamma$ is Euler’s gamma function. Expansions for $\Gamma(x)$ translate readily
into ones for $n!$ because $n! = \Gamma(n+1)$.
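
The estimates (4.1) and (4.2) are easy to check numerically. The sketch below (function names are illustrative assumptions) computes the ratio of $n!$ to the basic Stirling approximation and solves (4.2) for $\theta(n)$ using the log-gamma function, so that one can see $\theta$ staying strictly between 0 and 1.

```python
import math


def stirling_basic(n):
    """Basic Stirling approximation (2*pi*n)^(1/2) * n^n * e^(-n)."""
    return math.sqrt(2 * math.pi * n) * n ** n * math.exp(-n)


def theta(n):
    """Solve n! = (2*pi*n)^(1/2) n^n exp(-n + theta/(12 n)) for theta."""
    log_main = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    # lgamma(n + 1) is log(n!), since n! = Gamma(n + 1).
    return 12 * n * (math.lgamma(n + 1) - log_main)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 50):
        print(n, math.factorial(n) / stirling_basic(n), theta(n))
```

Working with logarithms in `theta` avoids overflow and keeps the small correction term $\theta/(12n)$ from being lost in rounding.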

Stirling’s approximation yields the expansion 


$$\binom{2n}{n} = \frac{4^n}{(\pi n)^{1/2}} \left\{ 1 - \frac{1}{8n} + \frac{1}{128n^2} + \frac{5}{1024n^3} + O(n^{-4}) \right\}. \tag{4.4}$$

A less precise but still useful estimate is

$$\binom{2n}{n} \sim \frac{4^n}{(\pi n)^{1/2}} \quad \text{as } n \to \infty. \tag{4.5}$$

This estimate is used frequently. The binomial coefficients are symmetric, so that
$\binom{n}{k} = \binom{n}{n-k}$, and unimodal, so that for a fixed $n$ and $k$ varying, the $\binom{n}{k}$ increase
monotonically up to a peak at $k = \lfloor n/2 \rfloor$ (which is unique for $n$ even and has two
equal high points at $k = (n \pm 1)/2$ for $n$ odd) and then decrease.

More important than eq. (4.5) are expansions for general binomial coefficients. 
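Equation (4.2) shows that for $1 \le k \le n-1$,

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}
= \left( \frac{n}{2\pi k(n-k)} \right)^{1/2} \frac{n^n}{k^k (n-k)^{n-k}} \exp\left( O\left( \frac{1}{k} + \frac{1}{n-k} \right) \right)$$

$$= \left( \frac{n}{2\pi k(n-k)} \right)^{1/2} \exp\left( nH(k/n) + O\left( \frac{1}{k} + \frac{1}{n-k} \right) \right), \tag{4.6}$$

where

$$H(x) = -x \log x - (1-x) \log(1-x) \tag{4.7}$$

is the entropy function. (We set $H(0) = H(1) = 0$ to make $H(x)$ continuous for
$0 \le x \le 1$.) Simplifying further, we obtain

$$\binom{n}{k} = \exp(nH(k/n) + O(\log n)), \tag{4.8}$$

an estimate that is valid for all $0 \le k \le n$. In many situations it suffices to use the
weaker but simpler bound

$$\binom{n}{k} \le \left( \frac{ne}{k} \right)^{k}, \qquad 0 < k \le n. \tag{4.9}$$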


Approximations of this form are used frequently in information theory and other 
fields. 
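
The entropy estimate (4.8) can be checked directly. The following sketch (names and the choice $n = 200$ are assumptions for illustration) measures the gap $nH(k/n) - \log\binom{n}{k}$ over all $k$ and compares it with $\log n$, which is the size of the error term that (4.8) allows.

```python
import math


def entropy(x):
    """H(x) = -x log x - (1 - x) log(1 - x), with H(0) = H(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)


def entropy_gap(n, k):
    """n * H(k/n) minus log of the binomial coefficient C(n, k)."""
    return n * entropy(k / n) - math.log(math.comb(n, k))


if __name__ == "__main__":
    n = 200
    gaps = [entropy_gap(n, k) for k in range(n + 1)]
    print(min(gaps), max(gaps), math.log(n))
```

For this range of $n$ the gap is nonnegative and grows only logarithmically, which is the behavior (4.8) describes.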


A general estimate that can be derived by totally elementary methods, without
recourse to Stirling’s formula, is

$$\binom{n}{k} \binom{n}{\lfloor n/2 \rfloor}^{-1} = \exp\left( -2(k - n/2)^2/n + O(|k - n/2|^3/n^2) \right), \tag{4.10}$$

valid for $|k - n/2| \le n/4$, say. It is most useful for $|k - n/2| = o(n^{2/3})$, since the
error term is small then. Similarly,

$$\binom{n}{k+r} \binom{n}{k}^{-1} \sim \left( \frac{n-k}{k} \right)^{r} \quad \text{as } n \to \infty, \tag{4.11}$$

uniformly in $k$ provided $r$ (which may be negative) satisfies $r^2 = o(k)$ and
$r^2 = o(n-k)$. Further, we have

$$(n+k)! \sim n^k \exp(k^2/(2n))\, n! \quad \text{as } n \to \infty, \tag{4.12}$$

again uniformly in $k$ provided $k = o(n^{2/3})$.


5. Estimates of sums and other basic techniques 


When encountering a combinatorial sum, the first reaction should always be to 
check whether it can be simplified by use of some identity. If no identity for the 
sum is found, the next step should be to try to transform the problem to eliminate 
the sum. Usually we are interested not in single isolated sums, but parametrized 
families of them, such as 


$$b_n = \sum_k a_n(k), \tag{5.1}$$


and it is the asymptotic behavior of the $b_n$ as $n \to \infty$ that is desired. A standard and
well-known technique (named the “snake-oil” method by Wilf 1990) for handling
such cases is to form a generating function $f(z)$ for the $b_n$, use the properties of
the $a_n(k)$ to obtain a simple form for $f(z)$, and then obtain the asymptotics of the
$b_n$ from the properties of $f(z)$. This method will be presented briefly in section 6.
In this section we discuss what to do if those two approaches fail. Sometimes the 
methods to be discussed can also be used in a preliminary phase to obtain a rough 
estimate for the sum. This estimate can then be used to decide which identities 
might be true, or what generating functions to form. 

There are general methods for dealing with sums (cf. Knopp 1971), many of 
which are used in asymptotic enumeration. A basic technique of this type is sum- 
mation by parts. Often sums to be evaluated can be expressed as 


$$\sum_{j=1}^{n} a_j b_j \quad \text{or} \quad \sum_{j=1}^{\infty} a_j b_j,$$

where the $b_j$, say, are known explicitly or behave smoothly, while the $a_j$ by themselves
might not be known well, but the asymptotics of

$$A(k) = \sum_{j=1}^{k} a_j \tag{5.2}$$

are known. Summation by parts relies on the identity

$$\sum_{j=1}^{n} a_j b_j = \sum_{k=1}^{n-1} A(k)(b_k - b_{k+1}) + A(n) b_n. \tag{5.3}$$


Example 5.1 (Sum of primes). Let
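$$S_n = \sum_{p \le n} p, \tag{5.4}$$

where $p$ runs over the primes $\le n$. The Prime Number Theorem (Ayoub 1963)
states that the function

$$\pi(x) = \sum_{p \le x} 1 \tag{5.5}$$

satisfies

$$\pi(x) \sim \frac{x}{\log x} \quad \text{as } x \to \infty. \tag{5.6}$$

(More precise estimates are available, but we will not use them.) We rewrite

$$S_n = \sum_{j=1}^{n} a_j b_j, \tag{5.7}$$

where

$$a_j = \begin{cases} 1 & \text{if } j \text{ is prime}, \\ 0 & \text{otherwise}, \end{cases} \tag{5.8}$$

and $b_j = j$ for all $j$. Then $A(k) = \pi(k)$ and summation by parts yields

$$S_n = -\sum_{k=1}^{n-1} \pi(k) + \pi(n)\,n. \tag{5.9}$$

Since

$$\sum_{k=1}^{n-1} \pi(k) \sim \sum_{k=2}^{n-1} \frac{k}{\log k} \sim \frac{n^2}{2\log n} \quad \text{as } n \to \infty, \tag{5.10}$$

we have

$$S_n \sim \frac{n^2}{2\log n} \quad \text{as } n \to \infty. \tag{5.11}$$

∎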


Summation by parts is used most commonly in situations like those of Exam- 
ple 5.1, to obtain an estimate for one sum from that of another. 
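
The identity (5.3) and the estimate of Example 5.1 can be tried out numerically. In the sketch below, the sieve, the 1-indexed list layout, and the function names are assumptions chosen for illustration; the code evaluates the sum of primes both directly and through summation by parts, and prints the ratio to $n^2/(2\log n)$.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def sum_by_parts(a, b):
    """Evaluate sum_{j=1}^n a_j b_j via (5.3); index 0 of a and b is unused."""
    n = len(a) - 1
    A = [0] * (n + 1)  # A[k] = a_1 + ... + a_k
    for k in range(1, n + 1):
        A[k] = A[k - 1] + a[k]
    return sum(A[k] * (b[k] - b[k + 1]) for k in range(1, n)) + A[n] * b[n]


def prime_sum_both_ways(n):
    """Return (direct sum of primes <= n, the same sum via summation by parts)."""
    a = [0] * (n + 1)
    for p in primes_up_to(n):
        a[p] = 1
    b = list(range(n + 1))  # b_j = j
    direct = sum(j for j in range(1, n + 1) if a[j])
    return direct, sum_by_parts(a, b)


if __name__ == "__main__":
    n = 100_000
    direct, via_parts = prime_sum_both_ways(n)
    print(direct == via_parts, direct / (n * n / (2 * math.log(n))))
```

The ratio approaches 1 only slowly, which reflects the fact that the Prime Number Theorem in the form (5.6) carries a relative error of order $1/\log x$.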

Summation by parts is often easiest to carry out, both conceptually and nota- 
tionally, by using integrals. If we let 


$$A(x) = \sum_{k \le x} a_k, \tag{5.12}$$

then $A(x) = A(n)$ for $n \le x < n+1$. Suppose that $b_k = b(k)$ for some continuously
differentiable function $b(x)$. Then

$$b_k - b_{k+1} = -\int_k^{k+1} b'(x)\,dx, \tag{5.13}$$

and we can rewrite eq. (5.3) as

$$\sum_{j=1}^{n} a_j b_j = A(n)b(n) - \int_1^n A(x) b'(x)\,dx. \tag{5.14}$$

[One can apply similar formulas even when the $b_j$ are not smooth, but this usually
requires Riemann-Stieltjes integrals, cf. Apostol (1957).] The approximation of
sums by integrals that appears in (5.14) is common, and will be treated at length
later.


5.1. Sums of positive terms


Sums of positive terms are extremely common. They can usually be handled with 
only a few basic tools. We devote substantial space to this topic because it is 
important and because the simplicity of the methods helps in illustrating some of 
the basic principles of asymptotic estimation, such as approximation by integrals, 
neglecting unimportant terms, and uniform convergence. For readers not familiar 
with asymptotic methods, working through the examples of this section is a good 
exercise that will make it easier to learn other techniques later. 
Typical sums are of the form 


$$b_n = \sum_k a_n(k), \qquad a_n(k) \ge 0, \tag{5.15}$$


where $k$ runs over some range of summation, often $0 \le k \le n$ or $0 \le k < \infty$, and
the $a_n(k)$ may be given either explicitly or only through an asymptotic approximation.
What is desired is the asymptotic behavior of $b_n$ as $n \to \infty$. Usually the
$a_n(k)$ for $n$ fixed are unimodal, so that either (i) $a_n(k) \le a_n(k+1)$ for all $k$ in
the range, or (ii) $a_n(k) \ge a_n(k+1)$ for all $k$, or (iii) $a_n(k) \le a_n(k+1)$ for $k < k_0$,
and $a_n(k) \ge a_n(k+1)$ for $k \ge k_0$. The single most important task in estimating $b_n$
is usually to find the maximal $a_n(k)$. This can be done either by combinatorial
means (involving knowledge of where the $a_n(k)$ come from), by asymptotic estimation
of the $a_n(k)$, or (most common when the $a_n(k)$ are expressed in terms
of factorials or binomial coefficients) by finding where the ratio $a_n(k+1)/a_n(k)$
is close to 1. If $a_n(k+1)/a_n(k) < 1$ for all $k$, then we are in case (ii) above, and
if $a_n(k+1)/a_n(k) > 1$ for all $k$, we are in case (i). If there is a $k_0$ in the range
of summation such that $a_n(k_0+1)$ is close to $a_n(k_0)$, then we are almost certainly
in case (iii) and the peak occurs at some $k$ close to $k_0$. The different cases are
illustrated in the examples presented later in this section.

Once $\max_k a_n(k) = a_n(k_0)$ has been found, the next task is to show that most of
the terms in the sum are insignificant. For example, if the sum in eq. (5.15) is over
$0 \le k \le n$, and if $a_n(0) = 1$ is the largest term, then

$$\sum_{\substack{k=0 \\ a_n(k) < n^{-3}}}^{n} a_n(k) < n^{-2},$$

which is negligible if we are only after a rough approximation to $b_n$, say of the
form $b_n \sim c_n$ as $n \to \infty$, or even $b_n = c_n(1 + O(n^{-1}))$ as $n \to \infty$. Once the small
terms have been discarded, we are usually left with a short range of summation. It
can happen that this range is extremely short, and the maximal term $a_n(k_0)$ is much
larger than any of its neighbors to the extent that $b_n \sim a_n(k_0)$ as $n \to \infty$. More
commonly, the number of terms that contribute significantly to $b_n$ does grow as
$n \to \infty$, but slowly. Their contribution, relative to that of the maximal term $a_n(k_0)$,
can usually be estimated by some simple function of $k - k_0$, and the sum of all
of them approximated by an explicit integral. This method is sometimes referred
to as Laplace’s method for sums (in analogy to Laplace’s method for estimating
integrals, mentioned in section 5.5, which proceeds in a similar spirit). There is
extensive discussion of this method in de Bruijn (1958).


Example 5.2 (Sums of the partition function). We estimate 


$$U_n = \sum_{k=1}^{n} p(k)^k, \tag{5.16}$$

where $p(k)$ is the number of partitions of $k$. Since any partition of $m-1$, say one
with $c_j$ parts of size $j$, can be transformed into a partition of $m$ with $c_1 + 1$ parts of
size 1, and $c_j$ of size $j$ for $j \ge 2$, we have $p(m) \ge p(m-1)$ for all $m \ge 2$. Therefore
the largest term in the sum in (5.16) is the one with $k = n$. If the only estimate for
$p(k)$ that we have is the one given by (1.6), then

$$p(n)^n = \exp(Cn^{3/2} - n\log(4 \cdot 3^{1/2} n) + O(n^{1/2})). \tag{5.17}$$

Since the constant implied by the $O$-symbol is not specified, this estimate is potentially
larger than $p(n)^n$ by a factor of $\exp(cn^{1/2})$, so we can only obtain asymptotics
of $\log p(n)^n$, not of $p(n)^n$ itself. This also means that rough estimates of $U_n$ follow
easily from (5.17). Since $p(k)^k \le p(n)^n$ for all $k \le n$, and there are $n$ terms in the
sum, we have $p(n)^n \le U_n \le n\,p(n)^n$, and because of the large error term in (5.17),
we obtain

$$U_n = \exp(Cn^{3/2} - n\log(4 \cdot 3^{1/2} n) + O(n^{1/2})). \tag{5.18}$$

Thus the use of the poor estimate (1.6) for $p(n)$ means that we can obtain only a
crude estimate for $U_n$, and there is no need for careful analysis.

Instead of (1.6) we can use the more refined estimate (1.5). Let $q_n$ denote the
first term on the right side of (1.5). Then we have

$$p(n) = q_n + O(n^{-1}\exp(Cn^{1/2}/2)) = q_n(1 + O(\exp(-Cn^{1/2}/2))), \tag{5.19}$$

$$p(n)^n = q_n^n(1 + O(n\exp(-Cn^{1/2}/2))) = q_n^n(1 + O(\exp(-Cn^{1/2}/3))), \tag{5.20}$$

say. Also, for some $\epsilon > 0$ we find from eq. (1.5) [or eq. (1.6)] that for large $n$

$$q_{n-1} \le (1 - \epsilon n^{-1/2})\, q_n.$$

Thus for large $n$,

$$q_{n-1}^{\,n-1} \le q_n^{\,n-1}(1 - \epsilon n^{-1/2})^{n-1} \le q_n^{\,n} \exp(-\epsilon n^{1/2}/2),$$

and therefore

$$U_{n-1} \le (n-1)\,p(n-1)^{n-1} \le q_n^{\,n} \exp(-\epsilon n^{1/2}/3).$$

Thus we obtain

$$U_n = q_n^{\,n}(1 + O(\exp(-\delta n^{1/2}))) \tag{5.21}$$

for some $\delta > 0$.

The estimates of U, presented above relied on the observation that the last term 
in the sum (5.16) defining U,, is much larger than the sum of all the other terms. 
This does not happen often. A more typical example is presented by 


$$T_n = \sum_{k=1}^{n} p(k). \tag{5.22}$$

As was noted before, $p(n)$ is larger than any of the other terms, but not by enough
to dominate the sum. We therefore try the other approaches that were listed at the
beginning of this section. We use only the estimate (1.6). Since $(1-x)^{1/2} \le 1 - x/2$
for $0 \le x \le 1$, we find that for large $n$,

$$\sum_{k < n - n^{2/3}} p(k) < n\,p(n - \lfloor n^{2/3} \rfloor)
< \exp(C(n - \lfloor n^{2/3} \rfloor)^{1/2})
< \exp(Cn^{1/2} - Cn^{1/6}/2)
= O(p(n)\exp(-Cn^{1/6}/3)).$$

Thus most of the values of $k$ contribute a negligible amount to the sum. For
$k = n - j$, $0 \le j \le n^{2/3}$, we find that

$$p(n-j)/p(n) = (1 + O(n^{-1/3})) \exp(C(n-j)^{1/2} - Cn^{1/2}). \tag{5.23}$$

Since

$$(n-j)^{1/2} = n^{1/2} - jn^{-1/2}/2 + O(j^2 n^{-3/2}),$$

we have

$$p(n-j)/p(n) = \exp(-Cjn^{-1/2}/2 + O(n^{-1/6})) = (1 + O(n^{-1/6})) \exp(-Cjn^{-1/2}/2). \tag{5.24}$$

Thus the ratios $p(n-j)/p(n)$ decrease geometrically, and so

$$p(n)^{-1} \sum_{0 \le j \le n^{2/3}} p(n-j) = \frac{1 + O(n^{-1/6})}{1 - \exp(-Cn^{-1/2}/2)} = 2C^{-1} n^{1/2}(1 + O(n^{-1/6})). \tag{5.25}$$

Therefore, combining all the estimates,

$$T_n = \sum_{k=1}^{n} p(k) = \frac{1 + O(n^{-1/6})}{2\pi (2n)^{1/2}}\, e^{Cn^{1/2}}. \tag{5.26}$$

The $O(n^{-1/6})$ error term above can easily be improved with a little more care to
$O(n^{-1/2})$, even if we continue to rely only on (1.6). ∎
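
A numerical check of the estimate for $T_n$ is straightforward. The sketch below computes $p(k)$ exactly with Euler’s pentagonal-number recurrence (a standard technique that the text itself does not use) and compares $T_n$ with the leading term $e^{Cn^{1/2}}/(2\pi(2n)^{1/2})$. The value $C = \pi(2/3)^{1/2}$ is the usual Hardy-Ramanujan constant and is assumed here to be the $C$ of eq. (1.6); function names are illustrative.

```python
import math


def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)] via Euler's pentagonal-number recurrence."""
    p = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        total, k = 0, 1
        while True:
            g1 = k * (3 * k - 1) // 2  # generalized pentagonal numbers
            g2 = k * (3 * k + 1) // 2
            if g1 > n:
                break
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - g1]
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p


# Assumed value of the constant C appearing in the partition asymptotics.
C = math.pi * math.sqrt(2.0 / 3.0)


def t_estimate(n):
    """Leading term exp(C n^(1/2)) / (2 pi (2n)^(1/2)) for T_n."""
    return math.exp(C * math.sqrt(n)) / (2 * math.pi * math.sqrt(2 * n))


if __name__ == "__main__":
    p = partition_numbers(2000)
    for n in (100, 500, 2000):
        T_n = sum(p[1 : n + 1])
        print(n, T_n / t_estimate(n))
```

The printed ratios drift toward 1 slowly, as one would expect from an error term that is only a negative power of $n$.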


Before presenting further examples, we discuss some of the problems that can 
arise even in the simple setting of estimating positive sums. We then introduce the 
basic technique of approximating sums by integrals. 

The lack of uniform convergence is a frequent cause of incorrect estimates. If 
$a_n(k) \sim c_n(k)$ for each $k$ as $n \to \infty$, it does not necessarily follow that

$$b_n = \sum_k a_n(k) \sim \sum_k c_n(k) \quad \text{as } n \to \infty. \tag{5.27}$$

A simple counterexample is given by $a_n(k) = \binom{n}{k}$ and $c_n(k) = \binom{n}{k}(1 + k/n)$. To conclude
that (5.27) holds, it is usually necessary to know that $a_n(k) \sim c_n(k)$ as $n \to \infty$
uniformly in $k$. Such uniform convergence does hold if we replace $c_n(k)$ in the
counterexample above by $c'_n(k) = \binom{n}{k}(1 + k/n^2)$, for example.
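
The counterexample can be verified in exact arithmetic. In this sketch (function names are assumptions), the ratio of $\sum_k c_n(k)$ to $\sum_k a_n(k) = 2^n$ is computed for the non-uniform perturbation $1 + k/n$ and for the uniform one $1 + k/n^2$.

```python
from fractions import Fraction
from math import comb


def non_uniform_factor(n, k):
    """c_n(k) / a_n(k) = 1 + k/n: tends to 1 for fixed k, but not uniformly in k."""
    return 1 + Fraction(k, n)


def uniform_factor(n, k):
    """c'_n(k) / a_n(k) = 1 + k/n^2: tends to 1 uniformly for 0 <= k <= n."""
    return 1 + Fraction(k, n * n)


def ratio_of_sums(n, factor):
    """(sum_k C(n,k) * factor(n,k)) / (sum_k C(n,k)), as an exact fraction."""
    total = sum(comb(n, k) * factor(n, k) for k in range(n + 1))
    return total / 2 ** n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, float(ratio_of_sums(n, non_uniform_factor)),
              float(ratio_of_sums(n, uniform_factor)))
```

The first ratio equals $3/2$ for every $n$, so (5.27) fails, while the second equals $1 + 1/(2n)$ and tends to 1.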

There is a general principle that sums of terms that vary smoothly with the index 
of summation should be replaced by integrals, so that for $\alpha > 0$, say,

$$\sum_{k=1}^{n} k^{\alpha} \sim \int_1^n u^{\alpha}\,du \quad \text{as } n \to \infty. \tag{5.28}$$


The advantage of replacing a sum by an integral is that integrals are usually much 
easier to handle. Many more closed-form expressions are available for definite 
and indefinite integrals than for sums. We will discuss extensions of this princi- 
ple of replacing sums by integrals further in section 5.3, when we present the 
Euler—Maclaurin summation formula. Usually, though, we do not need anything 
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sophisticated, and the application of the principle to situations like that of (5.28) 
is easy to justify. If $a_n = g(n)$ for some function $g(x)$ of a real argument $x$, then

$$\left| g(n) - \int_n^{n+1} g(u)\,du \right| \le \max_{n \le u \le n+1} |g(u) - g(n)|, \tag{5.29}$$

and so

$$\left| \sum_n g(n) - \int g(u)\,du \right| \le \sum_n \max_{n \le u \le n+1} |g(u) - g(n)|, \tag{5.30}$$

where the integral is over $[a, b+1]$ if the sum is over $a \le n \le b$, $a, b \in \mathbb{Z}$. If $g(u)$
is continuously differentiable, then $|g(u) - g(n)| \le \max_{n \le v \le n+1} |g'(v)|$ for
$n \le u \le n+1$. This gives the estimate

$$\left| \sum_{n=a}^{b} g(n) - \int_a^{b+1} g(u)\,du \right| \le \sum_{n=a}^{b} \max_{n \le v \le n+1} |g'(v)|. \tag{5.31}$$

Often one can find a simple explicit function $h(w)$ such that $|g'(v)| \le h(w)$ for any
$v$ and $w$ with $|v - w| \le 1$, in which case eq. (5.31) can be replaced by

$$\left| \sum_{n=a}^{b} g(n) - \int_a^{b+1} g(u)\,du \right| \le \int_a^{b+1} h(v)\,dv. \tag{5.32}$$

For good estimates to be obtained from integral approximations to sums, it is
usually necessary for individual terms to be small compared to the sum.


Example 5.3 (Sum of $\exp(-\alpha k^2)$). In the final stages of an asymptotic approximation
one often encounters sums of the form

$$h(\alpha) = \sum_{k=-\infty}^{\infty} \exp(-\alpha k^2), \qquad \alpha > 0. \tag{5.33}$$

There is no closed form for the indefinite integral of $\exp(-\alpha u^2)$ (it is expressible
in terms of the Gaussian error function only), but there is the famous evaluation
of the definite integral

$$\int_{-\infty}^{\infty} \exp(-\alpha u^2)\,du = (\pi/\alpha)^{1/2}. \tag{5.34}$$

Thus it is natural to approximate $h(\alpha)$ by $(\pi/\alpha)^{1/2}$. If $g(u) = \exp(-\alpha u^2)$, then
$g'(u) = -2\alpha u\,g(u)$, and so for $n \ge 0$,

$$\max_{n \le v \le n+1} |g'(v)| \le 2\alpha(n+1)\,g(n). \tag{5.35}$$

For the integral in eq. (5.30) to yield a good approximation to the sum we must
show that the error term is smaller than the integral. The largest term in the sum
occurs at $n = 0$ and equals 1. The error bound (5.35) that comes from approximating
$g(0) = 1$ by the integral of $g(u)$ over $0 \le u \le 1$ is $2\alpha$. Therefore we cannot
expect to obtain a good estimate unless $\alpha \to 0$. We find that

$$2\alpha(n+1)\,g(n) \le 4\alpha u\,g(u/2) \quad \text{for } n \ge 1,\ n \le u \le n+1,$$

so (integral approximation again!)

$$\sum_{n=1}^{\infty} 2\alpha(n+1)\,g(n) \le 4\alpha \int_1^{\infty} u\,g(u/2)\,du
\le 4\alpha \int_0^{\infty} u\,g(u/2)\,du = 4\alpha \cdot \frac{2}{\alpha} = 8. \tag{5.36}$$

Therefore, taking into account the error for $n = 0$ which was not included in the
bound (5.36), and treating the terms with $n < 0$ in the same way by symmetry, we have

$$h(\alpha) = \sum_{n=-\infty}^{\infty} \exp(-\alpha n^2) = \int_{-\infty}^{\infty} \exp(-\alpha u^2)\,du + O(1 + \alpha)
= (\pi/\alpha)^{1/2} + O(1) \quad \text{as } \alpha \to 0^+. \tag{5.37}$$


For this sum much more precise estimates are available, as will be shown in Ex- 
ample 5.9. For many purposes, though, (5.37) is sufficient. 


Example 5.3 showed how to use the basic tool of approximating a sum by an
integral. Moreover, the estimate (5.37) that it provides is ubiquitous in asymptotic 
enumeration, since many approximations reduce to it. This is illustrated by the 
following example. 


Example 5.4 (Bell numbers (cf. de Bruijn 1958)). The Bell number, $B(n)$, counts
the partitions of an $n$-element set. It is given by (Comtet 1974):

$$B(n) = e^{-1} \sum_{k=1}^{\infty} \frac{k^n}{k!}. \tag{5.38}$$

In this sum no single term dominates. The ratio of the $(k+1)$st to the $k$th term is

$$\frac{(k+1)^n}{(k+1)!} \cdot \frac{k!}{k^n} = \frac{1}{k+1} \left( 1 + \frac{1}{k} \right)^{n}. \tag{5.39}$$

As $k$ increases, this ratio strictly decreases. We search for the point where it is
about 1. For $k \ge 2$,

$$\left( 1 + \frac{1}{k} \right)^{n} = \exp\left( n \log\left( 1 + \frac{1}{k} \right) \right) = \exp(n/k + O(n/k^2)), \tag{5.40}$$


so the ratio is close to 1 for $n/k$ close to $\log(k+1)$. We choose $k_0$ to be the closest
integer to $w$, the solution to

$$n = w \log(w+1). \tag{5.41}$$

For $k = k_0 + j$, $1 \le j \le k_0/2$, we find, since $\log(1 + i/k_0) = i/k_0 - i^2/(2k_0^2) + O(i^3/k_0^3)$,

$$\frac{k^n}{k!} = \frac{k_0^n}{k_0!} \cdot \frac{(1 + j/k_0)^n}{k_0^{\,j} \prod_{i=1}^{j} (1 + i/k_0)}
= \frac{k_0^n}{k_0!} \exp\left( jn/k_0 - j\log k_0 - j^2(n + k_0)/(2k_0^2) + O(nj^3/k_0^3 + j/k_0) \right). \tag{5.42}$$

The same estimate applies for $-k_0/2 \le j \le 0$. The term $jn/k_0 - j\log k_0$ is small,
since $|k_0 - w| \le 1/2$ and $w$ satisfies (5.41). We find

$$n/k_0 - \log k_0 = n/w - \log(w+1) + O(n/w^2 + 1/w) = O(n/w^2 + 1/w). \tag{5.43}$$

By (5.41), $w \sim n/\log n$ as $n \to \infty$. We now further restrict $j$ to $|j| \le n^{1/2} \log n$. Then
(5.42) and (5.43) yield

$$\frac{k^n}{k!} = \frac{k_0^n}{k_0!} \exp\left( -j^2(n + k_0)/(2k_0^2) + O((\log n)^6 n^{-1/2}) \right). \tag{5.44}$$

Approximating the sum by an integral, as in Example 5.3, shows that

$$\sum_{|j| \le n^{1/2} \log n} \frac{k^n}{k!} = \frac{k_0^n}{k_0!}\, k_0 (2\pi)^{1/2} (n + k_0)^{-1/2} \left( 1 + O((\log n)^6 n^{-1/2}) \right). \tag{5.45}$$

(An easy way to obtain this is to apply the estimate of Example 5.3 to the sum
from $-\infty$ to $\infty$, and show that the range $|j| > n^{1/2} \log n$ contributes little.) To estimate
the contribution of the remaining summands, with $|j| > n^{1/2} \log n$, we observe
that for $k$ below $k_0$ the ratio of successive terms is $\ge 1$, so the range
$1 \le k \le k_0 - \lfloor n^{1/2} \log n \rfloor$ contributes at most $k_0$ (the number of terms) times
the largest term, which arises for $k = k_0 - \lfloor n^{1/2} \log n \rfloor$. By (5.44), this largest term is

$$O\left( k_0^n (k_0!)^{-1} \exp(-(\log n)^3) \right).$$

For $k \ge k_1 = k_0 + \lfloor n^{1/2} \log n \rfloor$, we find that the ratio of the $(k+1)$st to the $k$th term
is, for large $n$,

$$\le \frac{1}{k_1 + 1} \left( 1 + \frac{1}{k_1} \right)^{n} = \exp\left( n/k_1 - \log(k_1 + 1) - n/(2k_1^2) + O(n/k_1^3) \right)$$

$$\le \exp\left( -(k_1 - k_0)n/k_1^2 + O(n/k_1^2) \right) \tag{5.46}$$

$$\le \exp(-2n^{-1/2}) \le 1 - n^{-1/2},$$

and so the sum of these terms, for $k_1 \le k < \infty$, is bounded above by $n^{1/2}$ times the
term for $k = k_1$. Therefore the estimate on the right-hand side of (5.45) applies
even when we sum on all $k$, $1 \le k < \infty$.

To obtain an estimate for $B(n)$, it remains only to estimate $k_0^n/k_0!$. To do this,
we apply Stirling’s formula and use the property that $|k_0 - w| \le 1/2$ to deduce that

$$B(n) \sim e^{-1} (\log w)^{-1/2}\, w^{n-w} e^{w} \quad \text{as } n \to \infty, \tag{5.47}$$

where $w$ is given by (5.41), the factor $e^{-1}$ coming from (5.38) and the factor
$(\log w)^{-1/2}$ from $k_0^{1/2}(n + k_0)^{-1/2}$.

There is no explicit formula for $w$ in terms of $n$, and substituting various asymptotic
approximations to $w$, such as

$$w = \frac{n}{\log n} + O\left( \frac{n \log\log n}{(\log n)^2} \right) \tag{5.48}$$

(see Example 5.10) yields large error terms in (5.47), so for accuracy it is usually
better to use (5.47) as it is. There are other approximations to $B(n)$ in the literature
(see, for example, Bender 1974, de Bruijn 1958). They differ slightly from (5.47)
because they estimate $B(n)$ in terms of roots of equations other than (5.41).

Other methods of estimating $B(n)$ are presented in Examples 12.9 and 12.13.

∎
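
The analysis of Example 5.4 can be compared with exact values. The sketch below computes Bell numbers with the Bell triangle, checks the series (5.38) for a small $n$, solves (5.41) for $w$ by bisection, and evaluates a leading term for $B(n)$. The leading term is taken here to be $e^{-1}(\log w)^{-1/2} w^{n-w} e^{w}$, which is what (5.38), (5.45) and Stirling’s formula give; the exact form of that constant factor, like all function names, should be read as an assumption of this sketch.

```python
import math


def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)] computed exactly with the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


def dobinski(n, terms=200):
    """Floating-point evaluation of e^{-1} * sum_{k>=1} k^n / k!, truncated."""
    return math.exp(-1) * sum(k ** n / math.factorial(k) for k in range(1, terms))


def solve_w(n):
    """Solve n = w * log(w + 1) for w > 0 by bisection."""
    lo, hi = 0.0, float(n) + 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * math.log(mid + 1) < n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def log_bell_estimate(n):
    """log of the assumed leading term e^{-1} (log w)^{-1/2} w^{n-w} e^{w}."""
    w = solve_w(n)
    return -1.0 - 0.5 * math.log(math.log(w)) + (n - w) * math.log(w) + w


if __name__ == "__main__":
    bells = bell_numbers(200)
    print(bells[10], dobinski(10))
    for n in (10, 50, 100, 200):
        print(n, math.exp(log_bell_estimate(n) - math.log(bells[n])))
```

The ratio of the estimate to the exact value approaches 1 only slowly, since the relative correction is of order $1/\log w$.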


5.2. Alternating sums and the principle of inclusion—exclusion 


At the beginning of section 5, the reader was advised in general to search for 
identities and transformations when dealing with general sums. This advice is even 
more important when dealing with sums of terms that have alternating or irregu- 
larly changing coefficients. Finding the largest term is of little help when there is 
substantial cancellation among terms. Several general approaches for dealing with 
this difficulty will be presented later. Generating-function methods for dealing with 
complicated sums are discussed in section 6. Contour-integration methods for al- 
ternating sums are mentioned in section 10.3. The summation formulas of the next 
section can sometimes be used to estimate sums with regularly varying coefficients 
as well. In this section we present some basic elementary techniques that are often 
sufficient. 

Sometimes it is possible to obtain estimates of sums with positive and negative 
summands by approximating separately the sums of the positive and of the negative 
summands. Methods of the preceding section or of the next section are useful in 
such situations. However, this approach is to be avoided as much as possible, 
because it often requires extremely precise estimates of the two sums to obtain 
even rough bounds on the desired sums. One method that often works and is much 
simpler consists of a simple pairing of adjacent positive and negative terms. 


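Example 5.5 (Alternating sum of square roots). Let

$$S_n = \sum_{k=1}^{n} (-1)^k k^{1/2}. \tag{5.49}$$

We have

$$\begin{aligned}
(2m)^{1/2} - (2m-1)^{1/2} &= (2m)^{1/2}\left\{1 - \left(1 - \frac{1}{2m}\right)^{1/2}\right\} \\
&= (2m)^{1/2}\left\{1 - \left(1 - \frac{1}{4m} + O(m^{-2})\right)\right\} \\
&= (8m)^{-1/2} + O(m^{-3/2}),
\end{aligned} \tag{5.50}$$

so

$$\sum_{k=1}^{2\lfloor n/2\rfloor} (-1)^k k^{1/2} = \sum_{m=1}^{\lfloor n/2\rfloor} \left((8m)^{-1/2} + O(m^{-3/2})\right) = n^{1/2}/2 + O(1). \tag{5.51}$$

Hence

$$S_n = \begin{cases} n^{1/2}/2 + O(1) & \text{if } n \text{ is even}, \\ -n^{1/2}/2 + O(1) & \text{if } n \text{ is odd}. \end{cases} \tag{5.52}$$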


In Example 5.5, the sums of the positive terms and of the negative terms can 
easily be estimated accurately (for example, by using the Euler-Maclaurin formula 
of the next section) to obtain (5.52). In other cases, though, the cancellation is too 
extensive for such an approach to work. This is especially true for sums arising 
from the principle of inclusion—exclusion. 
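
The estimate (5.52) is easy to check numerically. The following is a minimal sketch, not part of the original text; the function names `alternating_sqrt_sum` and `leading_term` are chosen here purely for illustration.

```python
import math

def alternating_sqrt_sum(n):
    """S_n = sum_{k=1}^n (-1)^k sqrt(k), as in (5.49)."""
    return sum((-1) ** k * math.sqrt(k) for k in range(1, n + 1))

def leading_term(n):
    """Main term of (5.52): +sqrt(n)/2 for even n, -sqrt(n)/2 for odd n."""
    sign = 1 if n % 2 == 0 else -1
    return sign * math.sqrt(n) / 2

# The difference should stay bounded as n grows, in line with the O(1) term.
for n in (10, 11, 1000, 1001):
    s = alternating_sqrt_sum(n)
    print(n, s, leading_term(n), s - leading_term(n))
```

The printed differences stay bounded while the sums themselves grow like $n^{1/2}/2$, which is what the pairing argument predicts.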

Suppose that $X$ is some set of objects and $P$ is a set of properties. For $R \subseteq P$,
let $N_{=}(R)$ be the number of objects in $X$ that have exactly the properties in $R$
and none of the properties in $P \setminus R$. We let $N_{\ge}(R)$ denote the number of objects
in $X$ that have all the properties in $R$ and possibly some of those in $P \setminus R$. The
principle of inclusion-exclusion says that

$$N_{=}(R) = \sum_{R \subseteq Q \subseteq P} (-1)^{|Q \setminus R|} N_{\ge}(Q). \tag{5.53}$$

[This is a basic version of the principle. For more general results, proofs, and
references, see Comtet (1974), Goulden and Jackson (1983), and Stanley (1986).]


Example 5.6 (Derangements of n letters). Let $X$ be the set of permutations of $n$
letters, and suppose that $P_i$, $1 \le i \le n$, is the property that the $i$th letter is fixed by
a permutation, and $P = \{P_1, \ldots, P_n\}$. Then $d_n$, the number of derangements of $n$
letters, equals $N_{=}(\emptyset)$, where $\emptyset$ is the empty set, and so by (5.53)

$$d_n = \sum_{Q \subseteq P} (-1)^{|Q|} N_{\ge}(Q). \tag{5.54}$$

However, $N_{\ge}(Q)$ is just the number of permutations that leave all letters specified
by $Q$ fixed, and thus

$$d_n = \sum_{Q \subseteq P} (-1)^{|Q|} (n - |Q|)! = \sum_{k=0}^{n} (-1)^k (n-k)! \binom{n}{k} = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}, \tag{5.55}$$

which is eq. (1.1).
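
As an illustration of (5.55), the sketch below compares the inclusion-exclusion sum with a direct enumeration of fixed-point-free permutations for small $n$. The function names are hypothetical and the brute-force count is only feasible for very small $n$.

```python
from itertools import permutations
from math import comb, factorial

def derangements_inclusion_exclusion(n):
    """d_n from (5.55): sum over k of (-1)^k * C(n, k) * (n - k)!."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))

def derangements_brute_force(n):
    """Count permutations of range(n) that have no fixed point, by enumeration."""
    return sum(
        1 for p in permutations(range(n)) if all(p[i] != i for i in range(n))
    )

for n in range(8):
    print(n, derangements_inclusion_exclusion(n), derangements_brute_force(n))
```

The sum needs only $n+1$ terms because $N_{\ge}(Q)$ depends on $|Q|$ alone, whereas the enumeration touches all $n!$ permutations.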

The formula (1.1) for derangements is easy to use because the terms decrease
rapidly. Moreover, this formula is exceptionally simple, largely because $N_{\ge}(Q)$
depends only on $|Q|$. In general, the inclusion-exclusion principle produces complicated
sums that are hard to estimate. A frequently helpful tool is provided by the
Bonferroni inequalities (Comtet 1974, Stanley 1986). One form of these inequalities
is that for any integer $m \ge 0$,

$$N_{=}(R) \ge \sum_{\substack{R \subseteq Q \subseteq P \\ |Q \setminus R| < 2m}} (-1)^{|Q \setminus R|} N_{\ge}(Q) \tag{5.56}$$

and

$$N_{=}(R) \le \sum_{\substack{R \subseteq Q \subseteq P \\ |Q \setminus R| \le 2m}} (-1)^{|Q \setminus R|} N_{\ge}(Q). \tag{5.57}$$

Thus in general

$$\Bigl| N_{=}(R) - \sum_{\substack{R \subseteq Q \subseteq P \\ |Q \setminus R| \le k}} (-1)^{|Q \setminus R|} N_{\ge}(Q) \Bigr| \le \sum_{\substack{R \subseteq Q \subseteq P \\ |Q \setminus R| = k+1}} N_{\ge}(Q). \tag{5.58}$$
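
The bracketing behaviour expressed by the Bonferroni inequalities can be seen on the derangement problem with $R = \emptyset$, where the layer $|Q| = j$ contributes $\binom{n}{j}(n-j)!$. The sketch below is an illustration under that assumption; the helper names are invented here.

```python
from math import comb, factorial

def truncated_sum(n, k):
    """Partial inclusion-exclusion sum for derangements, keeping only |Q| <= k."""
    return sum((-1) ** j * comb(n, j) * factorial(n - j) for j in range(k + 1))

def next_layer(n, k):
    """Sum of N_>=(Q) over |Q| = k + 1: the bound on the error of truncated_sum(n, k)."""
    return comb(n, k + 1) * factorial(n - k - 1) if k + 1 <= n else 0

n = 8
exact = truncated_sum(n, n)          # the full sum is the exact count d_n
for k in range(n + 1):
    partial = truncated_sum(n, k)
    side = "upper" if k % 2 == 0 else "lower"
    print(k, partial, side, abs(exact - partial), next_layer(n, k))
```

Truncations ending on an even layer overshoot and those ending on an odd layer undershoot, and the error never exceeds the first omitted layer.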


These inequalities are frequently applied for $n = |X|$ increasing. Typically one
chooses $k$ that increases much more slowly than $n$, so that the individual terms
$N_{\ge}(Q)$ in (5.58) can be estimated asymptotically, as the interactions of the different
properties counted by $N_{\ge}(Q)$ are not too complicated to estimate. Bender (1974)
presents some useful general principles to be used in such estimates (especially
the asymptotically Poisson distribution that tends to occur when the method is
successful). We present an adaptation of an example from Bender (1974).


Example 5.7 (Balls and cells). Given $n$ labeled cells and $m$ labeled balls, let
$a_h(m,n)$ be the number of ways to place the balls into cells so that exactly $h$
of the cells are empty. We consider $h$ fixed. Let $X$ be the ways of placing the balls
into the cells ($n^m$ in total), and $P = \{P_1, \ldots, P_n\}$, where $P_i$ is the property that the
$i$th cell is empty. If $R = \{P_1, \ldots, P_h\}$, then $a_h(m,n) = \binom{n}{h} N_{=}(R)$. Now

$$N_{\ge}(Q) = (n - |Q|)^m, \tag{5.59}$$

so

$$\sum_{\substack{R \subseteq Q \subseteq P \\ |Q \setminus R| = t}} N_{\ge}(Q) = \binom{n-h}{t} (n-h-t)^m = n^m e^{-mh/n} (n e^{-m/n})^t (t!)^{-1} \left(1 + O\bigl((t^2+1) m n^{-2} + (t^2+1) n^{-1}\bigr)\right), \tag{5.60}$$

provided $t^2 \le n$ and $m t^2 n^{-2} \le 1$, say. In the range $0 \le t \le \log n$, $n \log n \le m \le
n^2 (\log n)^{-3}$, we find that the right-hand side of (5.60) is

$$n^m e^{-mh/n} (n e^{-m/n})^t (t!)^{-1} \left(1 + O\bigl(m n^{-2} (\log n)^2\bigr)\right).$$

We now apply (5.58) with $k = \lfloor \log n \rfloor$, and obtain

$$a_h(m,n) = \binom{n}{h} N_{=}(R) \sim \binom{n}{h} n^m \exp(-mh/n - n e^{-m/n}) \sim n^m (h!)^{-1} (n e^{-m/n})^h \exp(-n e^{-m/n}) \tag{5.61}$$

as $m, n \to \infty$, provided $n \log n \le m \le n^2 (\log n)^{-3}$. Since $a_h(m,n) n^{-m}$ is the probability
that there are exactly $h$ empty cells, the relation (5.61) (which we have established
only for fixed $h$) shows that this probability is asymptotically distributed
like a Poisson random variable with parameter $n \exp(-m/n)$.
Many additional results on random distributions of balls into cells, and references
to the extensive literature on this subject can be found in Kolchin et al. (1978).
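
To see the Poisson behaviour of Example 5.7 concretely, one can compute $a_h(m,n)$ exactly from (5.53) and (5.59) and compare $a_h(m,n) n^{-m}$ with the Poisson probability from (5.61). This is a rough sketch with invented function names; the parameters in the usage lines are arbitrary, and agreement is only approximate at such small sizes.

```python
from math import comb, exp, factorial

def exact_empty_cells(m, n, h):
    """a_h(m, n): placements of m labeled balls in n labeled cells with exactly h empty.

    Uses (5.53) with (5.59): C(n, h) * sum_t (-1)^t * C(n-h, t) * (n-h-t)^m.
    """
    return comb(n, h) * sum(
        (-1) ** t * comb(n - h, t) * (n - h - t) ** m for t in range(n - h + 1)
    )

def poisson_estimate(m, n, h):
    """Poisson probability of h empty cells with parameter n * exp(-m / n), as in (5.61)."""
    lam = n * exp(-m / n)
    return lam ** h * exp(-lam) / factorial(h)

m, n = 200, 50
for h in range(4):
    print(h, exact_empty_cells(m, n, h) / n ** m, poisson_estimate(m, n, h))
```

Exact integer arithmetic is used for the alternating sum, since the individual terms are far larger than the result and floating-point cancellation would destroy it.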


Bonferroni inequalities include other methods for estimating $N_{=}(R)$ by linear
combinations of the $N_{\ge}(Q)$. Recent approaches and references (phrased in probabilistic
terms) can be found in Galambos (1977). For bivariate Bonferroni inequalities
(where one asks for the probability that at least one of two sets of events
occurs) see Galambos and Xu (1993) and Lee (1992).

The Chen-Stein method (Chen 1975) is a powerful technique that is often used
in place of the principle of inclusion-exclusion, especially in probabilistic literature.
Recent references are Arratia et al. (1990a) and Barbour et al. (1992).


5.3. Euler-Maclaurin and Poisson summation formulas 


The introduction to section 5 showed that sums can be successfully approximated 
by integrals if the summands are all small compared to the total sum and vary 
smoothly as functions of the summation index. The approximation (5.29), though 
crude, is useful in a wide variety of cases. Sometimes, though, more accurate 
approximations are needed. An obvious way is to improve the bound (5.29). If 
$g(x)$ is really smooth, we can expect that the difference

$$a_n - \int_n^{n+1} g(u)\,du$$

will vary in a regular way with $n$. This is indeed the case, and it is exploited by the
Euler-Maclaurin summation formula. It can be found in many books, such as de
Bruijn (1958), Graham et al. (1989), Abramowitz and Stegun (1970), and Nörlund
(1924). There are many formulations, but they do not differ much.


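Euler-Maclaurin summation formula
Suppose that $g(x)$ has $2m$ continuous derivatives in $[a,b]$, $a, b \in \mathbb{Z}$. Then

$$\sum_{k=a}^{b} g(k) = \int_a^b g(x)\,dx + \sum_{r=1}^{m} \frac{B_{2r}}{(2r)!} \left\{ g^{(2r-1)}(b) - g^{(2r-1)}(a) \right\} + \frac{1}{2}\{g(a) + g(b)\} + R_m, \tag{5.62}$$

where

$$R_m = -\int_a^b g^{(2m)}(x)\, \frac{B_{2m}(x - \lfloor x \rfloor)}{(2m)!}\,dx, \tag{5.63}$$

and so

$$|R_m| \le \int_a^b |g^{(2m)}(x)|\, \frac{|B_{2m}(x - \lfloor x \rfloor)|}{(2m)!}\,dx. \tag{5.64}$$

In the above formulas, the $B_n(x)$ denote the Bernoulli polynomials, defined by

$$\frac{z e^{xz}}{e^z - 1} = \sum_{n=0}^{\infty} B_n(x) \frac{z^n}{n!}. \tag{5.65}$$

The $B_n$ are the Bernoulli numbers, defined by

$$\frac{z}{e^z - 1} = \sum_{n=0}^{\infty} B_n \frac{z^n}{n!}, \tag{5.66}$$

so that $B_n = B_n(0)$, and

$$\begin{aligned}
&B_0 = 1, \quad B_1 = -1/2, \quad B_2 = 1/6, \\
&B_3 = B_5 = B_7 = \cdots = 0, \\
&B_4 = -1/30, \quad B_6 = 1/42, \quad B_8 = -1/30, \ldots .
\end{aligned} \tag{5.67}$$

It is known that

$$|B_{2m}(x - \lfloor x \rfloor)| \le |B_{2m}|, \tag{5.68}$$

so we can simplify (5.64) to

$$|R_m| \le |B_{2m}| ((2m)!)^{-1} \int_a^b |g^{(2m)}(x)|\,dx. \tag{5.69}$$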

There are many applications of the Euler-Maclaurin formula. One of the most 
frequently cited ones is to estimate factorials. 


Example 5.8 (Stirling's formula). We transform the product in the definition of $n!$
into a sum by taking logarithms, and find that for $g(x) = \log x$ and $m = 1$ we have

$$\log n! = \sum_{k=1}^{n} \log k = \int_1^n (\log x)\,dx + \frac{1}{2} \log n + \frac{1}{2} B_2 \left\{ \frac{1}{n} - 1 \right\} + R_1, \tag{5.70}$$

where

$$R_1 = \int_1^n \frac{B_2(x - \lfloor x \rfloor)}{2x^2}\,dx = C + O(n^{-1}) \tag{5.71}$$

for

$$C = \int_1^{\infty} \frac{B_2(x - \lfloor x \rfloor)}{2x^2}\,dx. \tag{5.72}$$

Therefore

$$\log n! = n \log n - n + \frac{1}{2} \log n + C + 11/12 + O(n^{-1}), \tag{5.73}$$

which gives

$$n! \sim C' n^{1/2} n^n e^{-n} \quad \text{as } n \to \infty. \tag{5.74}$$

To obtain Stirling's formula (4.1), we need to show that $C' = (2\pi)^{1/2}$. This can be
done in several ways (cf. de Bruijn 1958). In Examples 12.1, 12.7, and 12.9 we will
see other methods of deriving (4.1).
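
The constant in (5.74) can be estimated numerically. The sketch below, with a function name chosen for illustration, evaluates $\log n! - (n \log n - n + \frac{1}{2}\log n)$ using the standard-library `math.lgamma` and compares it with $\frac{1}{2}\log(2\pi)$; it is a numerical check, not a proof that $C' = (2\pi)^{1/2}$.

```python
import math

def stirling_constant_estimate(n):
    """log n! - (n log n - n + (1/2) log n), which tends to log C' by (5.73)-(5.74)."""
    return math.lgamma(n + 1) - (n * math.log(n) - n + 0.5 * math.log(n))

target = 0.5 * math.log(2 * math.pi)
for n in (10, 100, 1000, 10000):
    est = stirling_constant_estimate(n)
    print(n, est, est - target)
```

The gap to $\frac{1}{2}\log(2\pi)$ shrinks roughly like $1/(12n)$, consistent with the $O(n^{-1})$ term in (5.73).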


There is no requirement that the function g(x) in the Euler-Maclaurin formula 
be positive. That was not even needed for the crude approximation of a sum by an 
integral given in section 5.0. The function g(x) can even take complex values. [After 
all, eq. (5.62) is an identity!] However, in most applications this formula is used to 
derive an asymptotic estimate with a small error term. For that, some high-order 
derivatives have to be small, which means that g(x) cannot change sign too rapidly. 
In particular, the Euler-Maclaurin formula usually is not very useful when the g(k) 
alternate in sign. In those cases one can sometimes use the differencing trick (cf. 
Example 5.5) and apply the Euler—Maclaurin formula to h(k) = g(2k) + g(2k + 1). 
There is also Boole's summation formula for alternating sums that can be applied.
[See chapter 2, section 3 and chapter 6, section 6 of Nörlund (1924), for example.]
Generalizations to other periodic patterns in the coefficients have been derived by
Berndt and Schoenfeld (1975/76).

The bounds for the error term $R_m$ in the Euler-Maclaurin formula that were
stated above can often be improved by using special properties of the function
$g(x)$. For example, when $g(x)$ is analytic in $x$, there are contour integrals for $R_m$
that sometimes give good estimates (cf. Olver 1974).

The Poisson summation formula states that

$$\sum_{n=-\infty}^{\infty} f(n + a) = \sum_{m=-\infty}^{\infty} \exp(2\pi i m a) \int_{-\infty}^{\infty} f(y) \exp(-2\pi i m y)\,dy \tag{5.75}$$

for "nice" functions $f(x)$. The functions for which (5.75) holds include all continuous
$f(x)$ for which $\int_{-\infty}^{\infty} |f(x)|\,dx < \infty$, which are of bounded variation, and for
which $\sum_n f(n + a)$ converges for all $a$. For weaker conditions that ensure validity
of (5.75), we refer to de Bruijn (1958) and Titchmarsh (1948). The Poisson summation
formula often converts a slowly convergent sum into a rapidly convergent
one. Generally it is not as widely applicable as the Euler-Maclaurin formula as
it requires extreme regularity for the Fourier coefficients to decrease rapidly. On
the other hand, it can be applied in some situations that are not covered by the
Euler-Maclaurin formula, including some where the coefficients vary in sign.

Example 5.9 (Sum of $\exp(-\alpha k^2)$). We consider again the function $h(\alpha)$ of Example 5.3.
We let $f(x) = \exp(-\alpha x^2)$, $\alpha > 0$. Equation (5.75) then gives

$$h(\alpha) = \sum_{n=-\infty}^{\infty} \exp(-\alpha n^2) = (\pi/\alpha)^{1/2} \sum_{m=-\infty}^{\infty} \exp(-\pi^2 m^2/\alpha). \tag{5.76}$$

This is an identity, and the sum on the right-hand side above converges rapidly for
small $\alpha$. Many applications require the evaluation of the sum on the left in which
$\alpha$ tends to 0. Equation (5.76) offers a method of converting a slowly convergent
sum into a tractable one, whose asymptotic behavior is explicit.
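
The identity (5.76) can be checked numerically by summing both sides. The following sketch uses illustrative function names and fixed truncation points that are assumptions of this example (they are ample for the values of $\alpha$ shown, not for arbitrarily small $\alpha$).

```python
import math

def theta_direct(alpha, terms=2000):
    """Left side of (5.76): sum over all integers n of exp(-alpha * n^2), truncated."""
    return 1.0 + 2.0 * sum(math.exp(-alpha * n * n) for n in range(1, terms + 1))

def theta_transformed(alpha, terms=50):
    """Right side of (5.76): sqrt(pi/alpha) * sum over m of exp(-pi^2 m^2 / alpha)."""
    s = 1.0 + 2.0 * sum(
        math.exp(-math.pi ** 2 * m * m / alpha) for m in range(1, terms + 1)
    )
    return math.sqrt(math.pi / alpha) * s

for alpha in (1.0, 0.1, 0.01):
    print(alpha, theta_direct(alpha), theta_transformed(alpha))
```

For small $\alpha$ the transformed sum is dominated by its $m = 0$ term, so $h(\alpha) \approx (\pi/\alpha)^{1/2}$ with an exponentially small error, while the direct sum needs on the order of $\alpha^{-1/2}$ terms.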


5.4. Bootstrapping and other basic methods 


Bootstrapping is a useful technique that uses asymptotic information to obtain
improved estimates. Usually we start with some rough bounds, and by combining 
them with the relations defining the function or sequence that we are studying, we 
obtain better bounds. 


Example 5.10 (Approximation of Bell numbers). Example 5.4 obtained the asymptotics
of the Bell numbers $B(n)$, but only in terms of $w$, the solution to eq. (5.41).
We now show how to obtain asymptotic expansions for $w$. As $n$ increases, so does
$w$. Therefore $\log(w + 1)$ also increases, and so $w < n$ for large $n$. Thus

$$n = w \log(w + 1) < w \log(n + 1),$$

and so

$$n (\log(n+1))^{-1} < w < n. \tag{5.77}$$

Therefore

$$\log(w + 1) = \log n + O(\log\log n), \tag{5.78}$$

and so

$$w = \frac{n}{\log(w+1)} = \frac{n}{\log n} + O\!\left(\frac{n \log\log n}{(\log n)^2}\right). \tag{5.79}$$

To go further, note that by (5.79),

$$\begin{aligned}
\log(w + 1) &= \log\left(\frac{n}{\log n}\left(1 + O\!\left(\frac{\log\log n}{\log n}\right)\right)\right) \\
&= \log n - \log\log n + O\bigl((\log\log n)(\log n)^{-1}\bigr),
\end{aligned} \tag{5.80}$$

and so by applying this estimate in eq. (5.41), we obtain

$$w = \frac{n}{\log n} + \frac{n \log\log n}{(\log n)^2} + \frac{n (\log\log n)^2}{(\log n)^3} + O\!\left(\frac{n \log\log n}{(\log n)^3}\right). \tag{5.81}$$

This procedure can be iterated indefinitely to obtain expansions for $w$ with error
terms $O(n (\log n)^{-\alpha})$ for as large a value of $\alpha$ as desired.

In the above example, $w$ can also be estimated by other methods, such as the
Lagrange-Bürmann inversion formula (cf. Example 6.7). However, the bootstrapping
method is much more widely applicable and easy to apply. It will be used
several times later in this chapter.
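
The bootstrapping idea of Example 5.10 has a direct numerical counterpart: feed a rough bound for $w$ into $w = n/\log(w+1)$ repeatedly. The sketch below does this and compares the result with the three explicit terms of (5.81). The function names, the starting point $w = n$, and the iteration count are choices made for this illustration.

```python
import math

def solve_w(n, iterations=200):
    """Solve w * log(w + 1) = n by iterating w <- n / log(w + 1), starting from w = n."""
    w = float(n)
    for _ in range(iterations):
        w = n / math.log(w + 1.0)
    return w

def w_expansion(n):
    """The three explicit terms of (5.81), without the O-term."""
    L = math.log(n)
    LL = math.log(L)
    return n / L + n * LL / L ** 2 + n * LL ** 2 / L ** 3

for n in (10 ** 3, 10 ** 6, 10 ** 9):
    w = solve_w(n)
    print(n, w, w_expansion(n), (w_expansion(n) - w) / w)
```

The relative error of the explicit expansion decays only like a power of $1/\log n$, which illustrates why the text recommends using (5.47) with the exact root $w$ when accuracy matters.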


5.5. Estimation of integrals 


In some of the examples in the preceding sections integrals were used to approxi- 
mate sums. The integrals themselves were always easy to evaluate. That is true in 
most asymptotic enumeration problems, but there do occur situations where the 
integrals are more complicated. Often the hard integrals are of the form

$$f(x) = \int_{\alpha}^{\beta} g(t) \exp(x h(t))\,dt, \tag{5.82}$$

and it is necessary to estimate the behavior of $f(x)$ as $x \to \infty$, with the functions
$g(t)$, $h(t)$ and the limits of integration $\alpha$ and $\beta$ held fixed. There is a substantial
theory of such integrals, and good references are Bleistein and Handelsman (1975), 
de Bruijn (1958), Erdélyi (1956), and Olver (1974). The basic technique is usually 
referred to as Laplace’s method, and consists of approximating the integrand by 
simpler functions near its maxima. This approach is similar to the one that is 
discussed at length in section 5.1 for estimating sums. The contributions of the 
approximations are then evaluated, and it is shown that the remaining ranges of 
integration, away from the maxima, contribute a negligible amount. By breaking 
up the interval of integration we can write the integral (5.82) as a sum of several 
integrals of the same type, with the property that there is a unique maximum of the 
integrand and that it occurs at one of the endpoints. When $\alpha > 0$, the maximum
of the integrand occurs for large $x$ at the maximum of $h(t)$ (except in rare cases
where $g(t) = 0$ for that $t$ for which $h(t)$ is maximized). Suppose that the maximum
occurs at $t = \alpha > 0$. It often happens that

$$h(t) = h(\alpha) - c(t - \alpha)^2 + O(|t - \alpha|^3) \tag{5.83}$$

for $\alpha \le t \le \beta$ and $c = -h''(\alpha)/2 > 0$, and then one obtains the approximation

$$f(x) \sim g(\alpha) \exp(x h(\alpha)) \left[-\pi/(2 x h''(\alpha))\right]^{1/2} \quad \text{as } x \to \infty, \tag{5.84}$$

provided $g(\alpha) \ne 0$. For precise statements of even more general and rigorous results,
see for example chapter 3, section 7 of Olver (1974). Those results cover
functions $h(t)$ that behave near $t = \alpha$ like $h(\alpha) - c(t - \alpha)^{\mu}$ for any $\mu > 0$.
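
A quick numerical experiment shows the endpoint form of Laplace's method at work. The sketch below uses a test case chosen for this illustration only: $\alpha = 1$, $\beta = 2$, $g(t) = 1/t$ and $h(t) = \cos(t-1) - 1$, so that $h(\alpha) = 0$ and $h''(\alpha) = -1$. The integral is evaluated with a hand-written composite Simpson rule (the panel count is an assumption that is ample for the values of $x$ used) and divided by the half-Gaussian approximation $g(\alpha)\exp(xh(\alpha))[-\pi/(2xh''(\alpha))]^{1/2}$.

```python
import math

def simpson(func, a, b, panels=20000):
    """Composite Simpson rule on [a, b]; panels must be even."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def laplace_ratio(x):
    """Ratio of the integral of g(t) exp(x h(t)) over [1, 2] to its Laplace approximation."""
    alpha, beta = 1.0, 2.0
    g = lambda t: 1.0 / t
    h = lambda t: math.cos(t - alpha) - 1.0   # maximum at t = alpha, h'' = -1 there
    h2 = -1.0                                  # h''(alpha)
    integral = simpson(lambda t: g(t) * math.exp(x * h(t)), alpha, beta)
    approx = g(alpha) * math.exp(x * h(alpha)) * math.sqrt(-math.pi / (2 * x * h2))
    return integral / approx

for x in (1e2, 1e3, 1e4):
    print(x, laplace_ratio(x))
```

The ratio approaches 1 slowly, with a relative error of order $x^{-1/2}$ coming from the variation of $g$ near the endpoint.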

When the integral is highly oscillatory, as happens when $h(t) = iu(t)$ for a real-valued
function $u(t)$, still other techniques (such as the stationary phase method)
are used. We will not present them here, and refer to the references cited above
for descriptions and applications. In section 12.1 we will discuss the saddle point
method, which is related to both Laplace's method and the stationary phase
method.


Laplace integrals

$$F(x) = \int_0^{\infty} f(t) \exp(-xt)\,dt \tag{5.85}$$

can often be approximated by integration by parts. We have (under suitable conditions
on $f(t)$)

$$\begin{aligned}
F(x) &= x^{-1} f(0) + x^{-1} \int_0^{\infty} f'(t) \exp(-xt)\,dt \\
&= x^{-1} f(0) + x^{-2} f'(0) + x^{-2} \int_0^{\infty} f''(t) \exp(-xt)\,dt,
\end{aligned} \tag{5.86}$$

and so on. There are general results, usually associated with the name of Watson's
Lemma, for deriving such expansions. For references, see Erdélyi (1956) or Olver
(1974).


6. Generating functions 


6.1. A brief overview 


Generating functions are a wonderfully powerful and versatile tool, and most
asymptotic estimates are derived from them. The most common ones in combinatorial
enumeration are the ordinary and exponential generating functions. If
$a_0, a_1, \ldots$ is any sequence of real or complex numbers, the ordinary generating
function is

$$f(z) = \sum_{n=0}^{\infty} a_n z^n, \tag{6.1}$$

while the exponential generating function is

$$f(z) = \sum_{n=0}^{\infty} a_n \frac{z^n}{n!}. \tag{6.2}$$

Doubly-indexed arrays, for example $a_{n,k}$, $0 \le n < \infty$, $0 \le k \le n$, are encoded as
two-variable generating functions. Depending on the array, sometimes one uses

$$f(x,y) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} a_{n,k}\, x^k y^n, \tag{6.3}$$

and sometimes other forms that might even mix ordinary and exponential types,
as in

$$f(x,y) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} a_{n,k}\, x^k \frac{y^n}{n!}. \tag{6.4}$$

For example, the Stirling numbers of the first kind, $s(n,k)$ [$(-1)^{n+k} s(n,k)$ is the
number of permutations on $n$ letters with $k$ cycles] have the generating function
(see pp. 50, 212-213, and 234-235 in Comtet 1974):

$$1 + \sum_{n=1}^{\infty} \sum_{k=1}^{n} s(n,k)\, x^k \frac{y^n}{n!} = (1 + y)^x. \tag{6.5}$$

In general, a generating function is just a formal power series, and questions of
convergence do not arise in the definition. However, some of the main applications
of generating functions in asymptotic enumeration do rely on analyticity or other
convergence properties of those functions, and there the domain of convergence
is important.

A generating function is just another form for the sequence that defines it.
There are many reasons for using it. One is that even for complicated sequences,
generating functions are frequently simple. This might not be obvious for the
partition function $p(n)$, which has the ordinary generating function

$$f(z) = \sum_{n=0}^{\infty} p(n) z^n = \prod_{k=1}^{\infty} (1 - z^k)^{-1}. \tag{6.6}$$

The sequence $p(n)$, which is complicated, is encoded here as an infinite product.
The terms in the product are simple and vary in a regular way with the index, but
it is not clear at first what is gained by this representation. In other cases, though,
the advantages of generating functions are clearer. For example, the exponential
generating function for derangements [eq. (1.1) and Example 5.6] is

$$f(z) = \sum_{n=0}^{\infty} d_n \frac{z^n}{n!} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} \sum_{n=k}^{\infty} z^n = \frac{e^{-z}}{1 - z}, \tag{6.7}$$

which is extremely compact.
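
Both generating functions above can be expanded mechanically. The sketch below (function names are illustrative) multiplies out the truncated product in (6.6) to recover $p(n)$, and reads $d_n$ off $e^{-z}/(1-z)$ by taking partial sums of the coefficients of $e^{-z}$, since division by $1 - z$ produces cumulative sums of coefficients.

```python
from fractions import Fraction
from math import factorial

def partition_numbers(N):
    """Coefficients p(0..N) of prod_{k>=1} (1 - z^k)^{-1}, truncated at z^N."""
    p = [1] + [0] * N
    for k in range(1, N + 1):            # multiply the series by 1 / (1 - z^k)
        for n in range(k, N + 1):
            p[n] += p[n - k]
    return p

def derangements_from_egf(N):
    """d_0..d_N read off from e^{-z} / (1 - z) = sum_n d_n z^n / n!."""
    coeffs, running = [], Fraction(0)
    for n in range(N + 1):
        running += Fraction((-1) ** n, factorial(n))   # partial sum of [z^k] e^{-z}, k <= n
        coeffs.append(int(running * factorial(n)))
    return coeffs

print(partition_numbers(10))
print(derangements_from_egf(8))
```

Exact rational arithmetic via `fractions.Fraction` avoids any rounding when the coefficient is multiplied back by $n!$.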

Reasons for using generating functions go far beyond simplicity. The one that
matters most for this chapter is that generating functions can be used to obtain
information about the asymptotic behavior of sequences they encode, information
that often cannot be obtained in any other way, or not as easily. Methods
such as those of section 10.2 can be used to obtain immediately from eq. (6.7)
the asymptotic estimate $d_n \sim e^{-1} n!$ as $n \to \infty$. This estimate can also be derived
easily by elementary methods from eq. (1.1), so here the generating function is
not essential. In other cases, though, such as that of the partition function $p(n)$,
all the sharp estimates, such as that of Hardy and Ramanujan given in eq. (1.5),
are derived by exploiting the properties of the generating function. If there is any
main theme to this chapter, it is that generating functions are usually the easiest,
most versatile, and most powerful way to study asymptotic behavior of sequences.
Especially when the generating function is analytic, its behavior at the dominant
singularities (a term that will be defined in section 10) determines the asymptotics
of the sequence. When the generating function is simple, and often even when it is
not simple, the contribution of the dominant singularity can often be determined
easily, although the sequence itself is complicated.

There are many applications of generating functions, some related to asymptotic
questions. Averages can often be studied using generating functions. Suppose, for
example, that $a_{n,k}$, $0 \le k \le n$, $0 \le n < \infty$, is the number of objects in some class
of size $n$, which have weight $k$ (for some definition of size and weight), and that
we know, either explicitly or implicitly, the generating function $f(x,y)$ of $a_{n,k}$ given
by (6.4). Then

$$g(y) = f(1, y) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_{n,k} \right) \frac{y^n}{n!} \tag{6.8}$$

is the exponential generating function of the number of objects of size $n$, while

$$h(y) = \frac{\partial}{\partial x} f(x,y) \Big|_{x=1} = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} k\, a_{n,k} \right) \frac{y^n}{n!} \tag{6.9}$$

is the exponential generating function of the sum of the weights of objects of size
$n$. Therefore the average weight of an object of size $n$ is

$$\frac{[y^n]\, h(y)}{[y^n]\, g(y)}. \tag{6.10}$$
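
As a concrete instance of (6.8)-(6.10), take the objects to be permutations of $n$ letters and the weight to be the number of cycles, so that $a_{n,k} = (-1)^{n+k} s(n,k)$. The sketch below builds these numbers from the standard recurrence $c(n,k) = c(n-1,k-1) + (n-1)c(n-1,k)$ and forms the ratio of the two coefficient sums. The function names are invented here, and working with the coefficient sums directly (rather than with closed forms for $g$ and $h$) is a simplification made for this illustration.

```python
from fractions import Fraction

def cycle_counts(n):
    """Row c(n, 0..n): number of permutations of n letters with exactly k cycles."""
    row = [1]                                        # c(0, 0) = 1
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for k in range(1, size + 1):
            prev_same = row[k] if k < size else 0    # c(size - 1, k)
            new[k] = row[k - 1] + (size - 1) * prev_same
        row = new
    return row

def average_cycles(n):
    """Average weight as in (6.10): (sum_k k * a_{n,k}) / (sum_k a_{n,k})."""
    row = cycle_counts(n)
    total = sum(row)                                  # n! [y^n] g(y)
    weighted = sum(k * c for k, c in enumerate(row))  # n! [y^n] h(y)
    return Fraction(weighted, total)

for n in range(1, 7):
    print(n, average_cycles(n))
```

The averages printed are the harmonic numbers $1 + 1/2 + \cdots + 1/n$, the well-known mean number of cycles of a random permutation.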


The wide applicability and power of generating functions come primarily from 
the structured way in which most enumeration problems arise. Usually the class of 
objects to be counted is derived from simpler objects through basic composition 
rules. When the generating functions are chosen to reflect appropriately the classes 
of objects and composition rules, the final generating function is derivable in a 
simple way from those of the basic objects. Suppose, for example, that each object 
of size $n$ in class $C$ can be decomposed uniquely into a pair of objects of sizes $k$ and
$n - k$ (for some $k$) from classes $A$ and $B$, and each pair corresponds to an object
in $C$. Then $c_n$, the number of objects of size $n$ in $C$, is given by the convolution

$$c_n = \sum_{k=0}^{n} a_k b_{n-k}, \tag{6.11}$$

(where $a_k$ is the number of objects of size $k$ in $A$, etc.). Hence if $A(z) = \sum a_n z^n$,
$B(z) = \sum b_n z^n$, $C(z) = \sum c_n z^n$ are the ordinary generating functions, then

$$C(z) = A(z) B(z). \tag{6.12}$$

Thus ordered pairing of objects corresponds to multiplication of ordinary generating
functions. If $A(z) = \sum a_n z^n$ and

$$b_n = \sum_{k=0}^{n} a_k,$$

then $B(z) = \sum b_n z^n$ is given by

$$B(z) = \frac{A(z)}{1 - z}, \tag{6.13}$$

so that the ordinary generating function of cumulative sums of coefficients is obtained
by dividing by $1 - z$. There are many more such general correspondences
between operations on combinatorial objects and on the corresponding generating
functions. They are present, implicitly or explicitly, in most books that cover
combinatorial enumeration, such as Comtet (1974), Goulden and Jackson (1983),
Stanley (1986), and Wilf (1990). The most systematic approach to developing and
using general rules of this type has been carried out by Flajolet et al. (1991b).
They develop ways to see immediately (cf. Flajolet and Odlyzko 1990a) that if
we consider mappings of a set of $n$ labeled elements to itself, so that all $n^n$ distinct
mappings are considered equally likely, then the generating function for the
longest path length is given by

f(z) = » Cera - ont) ; (6.14) 
where 

Ua(2) = teal) + she ale)? H+ FOIE (6.15) 
with 

$$t_0(z) = z, \qquad t_{h+1}(z) = z \exp(t_h(z)), \tag{6.16}$$

and $t(z) = \lim_{h \to \infty} t_h(z)$ (in the sense of formal power series, so convergence is
that of coefficients). Furthermore, as is mentioned in section 17, many of these
rules for composition of objects and generating functions can be implemented
algorithmically, automating some of the chores of applying them.

We illustrate some of the basic generating function techniques by deriving the
generating function for rooted labeled trees, which will occur later in Examples 6.6
and 10.14. (The rooted unlabeled trees, with generating function given by (1.8),
are harder.)


Example 6.1 (Rooted labeled trees). Let $t_n$ be the number of rooted labeled trees
on $n$ vertices, so that $t_1 = 1$, $t_2 = 2$, $t_3 = 9$. (It will be shown in Example 6.6 that
$t_n = n^{n-1}$.) Let

$$t(z) = \sum_{n=1}^{\infty} t_n \frac{z^n}{n!} \tag{6.17}$$

be the exponential generating function. If we remove the root of a rooted labeled
tree with $n$ vertices, we are left with $k \ge 0$ rooted labeled trees that contain a total
of $n - 1$ vertices. The total number of ways of arranging an ordered selection of
$k$ rooted trees with a total of $n - 1$ vertices is

$$(n-1)!\, [z^{n-1}]\, t(z)^k.$$

Since the order of the trees does not matter, and the root can carry any of the $n$
labels, we have

$$\frac{n!}{k!}\, [z^{n-1}]\, t(z)^k$$

different trees of size $n$ that have exactly $k$ subtrees, and so

$$\frac{t_n}{n!} = \sum_{k=0}^{\infty} \frac{1}{k!} [z^{n-1}]\, t(z)^k = [z^{n-1}] \sum_{k=0}^{\infty} t(z)^k / k! = [z^n]\, z \exp(t(z)), \tag{6.18}$$

which gives

$$t(z) = z \exp(t(z)). \tag{6.19}$$

As an aside, the function $t_h(z)$ of eq. (6.16) is the exponential generating function
of rooted labeled trees of height $\le h$.
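
The functional equation $t(z) = z\exp(t(z))$ determines the coefficients of $t(z)$ one degree at a time, and the iteration $t_0(z) = z$, $t_{h+1}(z) = z\exp(t_h(z))$ makes this explicit. The sketch below carries out the iteration on truncated power series with exact rational coefficients. The helper names are invented for this illustration, and the exponential of a series is computed from the standard relation $E' = T'E$ for $E = \exp(T)$.

```python
from fractions import Fraction
from math import factorial

def series_exp(t, N):
    """exp of a power series t (with t[0] == 0), truncated at degree N.

    Uses n * e_n = sum_{k=1}^{n} k * t_k * e_{n-k}, which follows from E' = T' E.
    """
    e = [Fraction(0)] * (N + 1)
    e[0] = Fraction(1)
    for n in range(1, N + 1):
        e[n] = sum(k * t[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def tree_counts(N):
    """t_1..t_N obtained by iterating t <- z * exp(t) on series truncated at z^N."""
    t = [Fraction(0)] * (N + 1)
    if N >= 1:
        t[1] = Fraction(1)                       # t_0(z) = z
    for _ in range(N):
        e = series_exp(t, N)
        t = [Fraction(0)] + e[:N]                # multiply by z and truncate
    return [int(t[n] * factorial(n)) for n in range(1, N + 1)]

print(tree_counts(6))
```

After $h$ iterations the coefficients up to degree $h + 1$ no longer change, reflecting the fact that a tree on $n$ vertices has height at most $n - 1$.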


The key to the successful use of generating functions is to use a generating 
function that is of the appropriate form for the problem at hand. There is no 
simple rule that describes what generating function to use, and sometimes two are 
used simultaneously. In combinatorics and analysis of algorithms, the most useful 
forms are the ordinary and exponential generating functions, which reflects how 
the classes of objects that are studied are constructed. Sometimes other forms are 
used, such as the double exponential form 


$$f(z) = \sum_{n=0}^{\infty} a_n \frac{z^n}{(n!)^2} \tag{6.20}$$


that occurs in section 7, or the Newton series 


$$f(z) = \sum_{n=0}^{\infty} a_n\, z(z-1) \cdots (z-n+1). \tag{6.21}$$

Also frequently encountered are various $q$-analog generating functions, such as
the Eulerian

$$f(z) = \sum_{n=0}^{\infty} \frac{a_n z^n}{(1-q)(1-q^2) \cdots (1-q^n)}. \tag{6.22}$$

In multiplicative number theory, the most common are Dirichlet series

$$f(z) = \sum_{n=1}^{\infty} a_n n^{-z}, \tag{6.23}$$

which reflect the multiplicative structure of the integers. If $a_n$ is a multiplicative
function (so that $a_{mn} = a_m a_n$ for all relatively prime positive integers $m$ and $n$)
then the function (6.23) has an Euler-product representation

$$f(z) = \prod_{p} \left(1 + a_p p^{-z} + a_{p^2} p^{-2z} + \cdots \right), \tag{6.24}$$

where $p$ runs over the primes. This allows new tools to be used to study $f(z)$ and
through it $a_n$. Additive problems in combinatorics and number theory often are
handled using functions such as

$$f(z) = \sum_{n=1}^{\infty} z^{a_n}, \tag{6.25}$$

where $0 \le a_1 < a_2 < \cdots$ is a sequence of integers. Addition of two such sequences
then corresponds to a multiplication of the generating functions of the form (6.25).

We next mention the "snake-oil" method. This is the name given by Wilf (1990)
to the use of generating functions for proving identities, and comes from the surprising
power of this technique. The typical application is to evaluation of sequences
given by sums of the type

$$a_n = \sum_{k} b_{n,k}. \tag{6.26}$$

The standard procedure is to form a generating function of the $a_n$ and manipulate it
through interchanges of summation and other tricks to obtain the final answer. The
generating function can be ordinary, exponential, or (less commonly) of another
type, depending on what gives the best results. We show a simple application of
this principle that exhibits the main features of the method.


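Example 6.2 (A binomial coefficient sum, Wilf 1990). Let

$$a_n = \sum_{k=0}^{n} \binom{n+k}{2k} 2^{n-k}, \qquad n \ge 0. \tag{6.27}$$

We define $A(z)$ to be the ordinary generating function of $a_n$. We find that

$$\begin{aligned}
A(z) &= \sum_{n=0}^{\infty} a_n z^n = \sum_{n=0}^{\infty} z^n \sum_{k=0}^{n} \binom{n+k}{2k} 2^{n-k} \\
&= \sum_{k=0}^{\infty} 2^{-k} \sum_{n=k}^{\infty} \binom{n+k}{2k} (2z)^n = \sum_{k=0}^{\infty} 2^{-k} (2z)^{-k} \sum_{n=k}^{\infty} \binom{n+k}{2k} (2z)^{n+k} \\
&= \sum_{k=0}^{\infty} 2^{-k} (2z)^{-k} \frac{(2z)^{2k}}{(1-2z)^{2k+1}} = \frac{1}{1-2z} \sum_{k=0}^{\infty} \left( \frac{z}{(1-2z)^2} \right)^{k} \\
&= \frac{1-2z}{(1-4z)(1-z)} = \frac{2}{3(1-4z)} + \frac{1}{3(1-z)}.
\end{aligned} \tag{6.28}$$

Therefore we immediately find the explicit form

$$a_n = (2^{2n+1} + 1)/3 \quad \text{for } n \ge 0. \tag{6.29}$$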
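
The closed form (6.29) is easy to confirm for small $n$ by evaluating the sum (6.27) directly. A minimal sketch, with function names chosen here for illustration:

```python
from math import comb

def binomial_sum(n):
    """a_n = sum_{k=0}^{n} C(n + k, 2k) * 2^(n - k), as in (6.27)."""
    return sum(comb(n + k, 2 * k) * 2 ** (n - k) for k in range(n + 1))

def closed_form(n):
    """(2^(2n+1) + 1) / 3, as in (6.29); the division is exact."""
    return (2 ** (2 * n + 1) + 1) // 3

for n in range(8):
    print(n, binomial_sum(n), closed_form(n))
```

Such a check does not replace the generating-function argument, but it is a cheap guard against slips in the interchange of summations.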

We next present some additional examples of how generating functions are derived.
We start by considering linear recurrences with constant coefficients.

The first step in solving a linear recurrence is to obtain its generating function.
Suppose that a sequence $a_0, a_1, a_2, \ldots$ satisfies the recurrence

$$a_n = \sum_{i=1}^{d} c_i a_{n-i}, \qquad n \ge d. \tag{6.30}$$

Then

$$\begin{aligned}
f(z) = \sum_{n=0}^{\infty} a_n z^n &= \sum_{n=0}^{d-1} a_n z^n + \sum_{n=d}^{\infty} z^n \sum_{i=1}^{d} c_i a_{n-i} \\
&= \sum_{n=0}^{d-1} a_n z^n + \sum_{i=1}^{d} c_i z^i \sum_{n=d}^{\infty} a_{n-i} z^{n-i} \\
&= \sum_{n=0}^{d-1} a_n z^n + \sum_{i=1}^{d} c_i z^i \left( f(z) - \sum_{n=0}^{d-i-1} a_n z^n \right),
\end{aligned} \tag{6.31}$$

and so

$$f(z) = \frac{g(z)}{1 - \sum_{i=1}^{d} c_i z^i}, \tag{6.32}$$

where

$$g(z) = \sum_{n=0}^{d-1} a_n z^n - \sum_{i=1}^{d} c_i z^i \sum_{n=0}^{d-i-1} a_n z^n \tag{6.33}$$

is a polynomial of degree $\le d - 1$. Equation (6.32) is the fundamental relation in
the study of linear recurrences, and $1 - \sum c_i z^i$ is called the characteristic polynomial
of the recursion.


Example 6.3 (Fibonacci numbers). We let $F_0 = 0$, $F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$ for $n \ge
2$, and

$$F(z) = \sum_{n=0}^{\infty} F_n z^n.$$

Then by (6.32) and (6.33),

$$F(z) = \frac{z}{1 - z - z^2}. \tag{6.34}$$
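
The passage from a recurrence to its rational generating function is mechanical. The sketch below computes the numerator $g(z) = \sum_{n<d} a_n z^n - \sum_i c_i z^i \sum_{n \le d-i-1} a_n z^n$ from the coefficients $c_1, \ldots, c_d$ and the initial values $a_0, \ldots, a_{d-1}$, and then expands $g(z)/(1 - \sum_i c_i z^i)$ back into a power series. The function names and the list-based polynomial representation are choices made for this illustration.

```python
def numerator(c, init):
    """Coefficients of g(z): c = [c_1, ..., c_d], init = [a_0, ..., a_{d-1}]."""
    d = len(c)
    g = list(init)
    for i in range(1, d + 1):
        for n in range(0, d - i):        # n runs over 0..d-i-1
            g[n + i] -= c[i - 1] * init[n]
    return g

def expand(c, g, terms):
    """First `terms` coefficients of g(z) / (1 - sum_i c_i z^i)."""
    d = len(c)
    f = []
    for n in range(terms):
        val = g[n] if n < len(g) else 0
        for i in range(1, min(n, d) + 1):
            val += c[i - 1] * f[n - i]
        f.append(val)
    return f

fib_g = numerator([1, 1], [0, 1])
print(fib_g, expand([1, 1], fib_g, 10))
```

For the Fibonacci numbers the numerator reduces to $z$, and the expansion reproduces the sequence. With the initial values 2, 1 (the Lucas numbers) the numerator is $2 - z$, showing how the correction terms in $g(z)$ enter.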


Often there is no obvious recurrence for the sequence $a_n$ being studied, but
there is one involving some other auxiliary function. Usually if one can obtain at
least as many recurrences as there are sequences, one can obtain their generating
functions by methods similar to those used for a single sequence. The main additional
complexity comes from the need to solve a system of linear equations with
polynomial coefficients. We illustrate this with the following example.


Example 6.4 (Sequences with forbidden subwords). Let $A = a_1 a_2 \cdots a_k$ be a binary
string of length $k$. Define $f_A(n)$ to be the number of binary strings of length $n$
that do not contain $A$ as a subword of $k$ adjacent characters. (Subsequences do not
count, so that if $A = 1110$, then $A$ is contained in 1101110010, but not in 101101.)
We introduce the correlation polynomial $C_A(z)$ of $A$:

$$C_A(z) = \sum_{j=0}^{k-1} c_A(j) z^j, \tag{6.35}$$

where $c_A(0) = 1$ and for $1 \le j \le k-1$,

$$c_A(j) = \begin{cases} 1 & \text{if } a_1 a_2 \cdots a_{k-j} = a_{j+1} a_{j+2} \cdots a_k, \\ 0 & \text{otherwise}. \end{cases} \tag{6.36}$$

As examples, we note that if $A = 1000$, then $C_A(z) = 1$, whereas $C_A(z) = 1 + z +
z^2 + z^3$ if $A = 1111$. The generating function

$$F_A(z) = \sum_{n=0}^{\infty} f_A(n) z^n \tag{6.37}$$

then satisfies

$$F_A(z) = \frac{C_A(z)}{z^k + (1 - 2z) C_A(z)}. \tag{6.38}$$

To prove this, define $g_A(n)$ to be the number of binary sequences $b_1 b_2 \cdots b_n$ of
length $n$ such that $b_1 b_2 \cdots b_k = A$, but such that $b_j b_{j+1} \cdots b_{j+k-1} \ne A$ for any $j$ with
$2 \le j \le n-k+1$; i.e., sequences that start with $A$ but do not contain it anywhere
else. We then have $g_A(n) = 0$ for $n < k$, and $g_A(k) = 1$. We also define

$$G_A(z) = \sum_{n=0}^{\infty} g_A(n) z^n. \tag{6.39}$$

We next obtain a relation between $G_A(z)$ and $F_A(z)$ that will enable us to determine
both.

If $b_1 b_2 \cdots b_n$ is counted by $f_A(n)$, then for $x$ either 0 or 1, the string $x b_1 b_2 \cdots b_n$
either does not contain $A$ at all, or if it does contain it, then $A = x b_1 b_2 \cdots b_{k-1}$.
Therefore for $n \ge 0$,

$$2 f_A(n) = f_A(n+1) + g_A(n+1), \tag{6.40}$$

and multiplying both sides of eq. (6.40) by $z^n$ and summing on $n \ge 0$ yields

$$2 F_A(z) = z^{-1} (F_A(z) - 1) + z^{-1} G_A(z). \tag{6.41}$$

We need one more relation, and to obtain it we consider any string $B = b_1 b_2 \cdots b_n$
that does not contain $A$ anywhere inside. If we let $C$ be the concatenation of $A$ and
$B$, so that $C = a_1 a_2 \cdots a_k b_1 b_2 \cdots b_n$, then $C$ starts with $A$, and may contain other
occurrences of $A$, but only at positions that overlap with the initial $A$. Therefore
we obtain

$$f_A(n) = \sum_{\substack{j=1 \\ c_A(k-j)=1}}^{k} g_A(n+j) \quad \text{for } n \ge 0, \tag{6.42}$$

and this gives the relation

$$F_A(z) = z^{-k} C_A(z) G_A(z). \tag{6.43}$$

Solving the two equations (6.41) and (6.43), we find that $F_A(z)$ satisfies (6.38),
while

$$G_A(z) = \frac{z^k}{z^k + (1 - 2z) C_A(z)}. \tag{6.44}$$

The proof above follows that in Guibas and Odlyzko (1981), except that that
paper uses generating functions in $z^{-1}$, so the formulas look different. Applications
of the formulas (6.38) and (6.44) will be found later in this chapter, as well as
in Guibas and Odlyzko (1981) and Flajolet et al. (1988). Other approaches to
string enumeration problems are referenced there as well. Other approaches and
applications of string enumerations are given in the references in Guibas and
Odlyzko (1981) and in papers such as Arratia et al. (1990b).
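
The generating function $F_A(z) = C_A(z)/(z^k + (1-2z)C_A(z))$ can be tested against brute-force enumeration. The sketch below computes the correlation polynomial of a pattern, expands the rational function as a power series (the denominator has constant term $c_A(0) = 1$, so integer arithmetic suffices), and compares with a direct count over all binary strings. Function names are illustrative, and the brute-force count is only practical for short lengths.

```python
from itertools import product

def correlation(A):
    """Coefficients c_A(0..k-1): c_A(j) = 1 iff the first k-j letters equal the last k-j."""
    k = len(A)
    return [1 if A[: k - j] == A[j:] else 0 for j in range(k)]

def avoid_counts(A, terms):
    """First `terms` coefficients of C_A(z) / (z^k + (1 - 2z) C_A(z))."""
    C, k = correlation(A), len(A)
    D = [0] * (k + 1)                    # denominator z^k + (1 - 2z) C_A(z)
    for j, c in enumerate(C):
        D[j] += c
        D[j + 1] -= 2 * c
    D[k] += 1
    f = []
    for n in range(terms):
        val = C[n] if n < k else 0
        for i in range(1, min(n, k) + 1):
            val -= D[i] * f[n - i]
        f.append(val)                    # D[0] == 1, so no division is needed
    return f

def brute_force(A, n):
    """Number of binary strings of length n that do not contain A as a substring."""
    return sum(1 for bits in product("01", repeat=n) if A not in "".join(bits))

for A in ("1000", "1111", "1010"):
    print(A, correlation(A), avoid_counts(A, 10))
```

Patterns of the same length with different self-overlap structure give different counts, which is exactly the information carried by $C_A(z)$.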


The above example can be generalized to provide generating functions that 
enumerate sequences in which any of a given set of patterns are forbidden (Guibas 
and Odlyzko 1981). 

Whenever one has a finite system of linear recurrences with constant coefficients
that involve several sequences, say $a_n^{(i)}$, $1 \le i \le k$, $n \ge 0$, one can translate
these recurrences into linear equations with polynomial coefficients in the generating
functions $A^{(i)}(z) = \sum_n a_n^{(i)} z^n$ for these sequences. To obtain the $A^{(i)}(z)$, one
then needs to solve the resulting system. Such solutions will exist if the matrix of
polynomial coefficients is nonsingular over the field of rational functions in $z$. In
particular, one needs at least as many equations (i.e., recurrence relations) as $k$, the
number of sequences, and if there are exactly as many equations as sequences, then
the determinant of the matrix of the coefficients has to be a nonzero polynomial.

One interesting observation is that when a system of recurrences involving several
sequences is solved by the above method, each of the generating functions
$A^{(i)}(z)$ is a rational function in $z$. What this means is that each of the sequences
$a_n^{(i)}$, $1 \le i \le k$, satisfies a linear recurrence with constant coefficients that does not
involve any of the other $a_n^{(j)}$ sequences! In principle, therefore, that recurrence
could have been found right at the beginning by combinatorial methods. However,
usually the degree of the recurrence for an isolated $a_n^{(i)}$ sequence is high,
typically about $k$ times as large as the average degree of the $k$ recurrences involving
all the $a_n^{(i)}$. Thus the use of several sequences $a_n^{(i)}$ leads to much simpler and
combinatorially more appealing relations.
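To illustrate the last observation, here is a small sketch for the special case of a first-order system $v_{n+1} = M v_n$ with a constant $k \times k$ matrix $M$ (the matrix and the function names are made up for the example). By the Cayley-Hamilton theorem every component sequence satisfies the single recurrence of order $k$ given by the characteristic polynomial of $M$, which the code computes with the Faddeev-LeVerrier recursion, a technique chosen here for brevity and not taken from the text.

```python
from fractions import Fraction

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(M):
    """Coefficients c[0..k] of det(lambda*I - M), lowest degree first
    (Faddeev-LeVerrier recursion, exact rational arithmetic)."""
    k = len(M)
    c = [Fraction(0)] * (k + 1)
    c[k] = Fraction(1)
    B = [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]  # identity
    for j in range(1, k + 1):
        MB = mat_mul(M, B)
        c[k - j] = -sum(MB[i][i] for i in range(k)) / j
        B = [[MB[i][t] + (c[k - j] if i == t else 0) for t in range(k)]
             for i in range(k)]
    return c

def iterate(M, v0, steps):
    """Rows v_0, v_1, ..., v_steps of the coupled system v_{n+1} = M v_n."""
    rows = [list(v0)]
    for _ in range(steps):
        v = rows[-1]
        rows.append([sum(M[i][j] * v[j] for j in range(len(v)))
                     for i in range(len(v))])
    return rows

def satisfies_single_recurrence(M, v0, steps=12):
    """Check sum_t c[t] * a_{n+t} = 0 for every component sequence a_n."""
    c, rows, k = char_poly(M), iterate(M, v0, steps), len(M)
    return all(sum(c[t] * rows[n + t][i] for t in range(k + 1)) == 0
               for i in range(k) for n in range(steps - k + 1))
```

Here three first-order recurrences coupled through a $3 \times 3$ matrix give a third-order recurrence for each isolated sequence, in line with the factor of about $k$ mentioned above.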

That generating functions can significantly simplify combinatorial problems is
shown by the following example. It is taken from Stanley (1978), and is a modification
of a result of Klarner (1968) and Pólya (1969). This example also shows a
more complicated derivation of explicit generating functions than the simple ones
presented so far.


Example 6.5 (Polyomino enumeration, Stanley 1978). Let $a_n$ be the number of
$n$-square polyominoes $P$ that are inequivalent under translation, but not necessarily
under rotation or reflection, and such that each row of $P$ is an unbroken line of
squares. Then $a_1 = 1$, $a_2 = 2$, $a_3 = 6$. We define $a_0 = 0$. It is easily seen that

\[ a_n = \sum (m_1 + m_2 - 1)(m_2 + m_3 - 1) \cdots (m_{s-1} + m_s - 1), \tag{6.45} \]

where the sum is over all ordered partitions $m_1 + \cdots + m_s = n$ of $n$ into positive
integers $m_i$. Let $a_{r,n}$ be the sum of terms in (6.45) with $m_1 = r$, where we set
$a_{n,n} = 1$, and $a_{r,n} = 0$ if $r > n$ or $n < 0$. Then


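\[ a_n = \sum_{r=1}^{n} a_{r,n}, \tag{6.46} \]
\[ a_{r,n} = \sum_{i=1}^{\infty} (r + i - 1)\, a_{i,n-r}, \quad r < n. \tag{6.47} \]
Define
\[ A(x,y) = \sum_{n=1}^{\infty} \sum_{r=1}^{\infty} a_{r,n}\, x^r y^n, \tag{6.48} \]
so that
\[ A(1,y) = \sum_{n=1}^{\infty} a_n y^n \tag{6.49} \]
is the generating function of the $a_n$, which are what we need to estimate.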
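By (6.47), we find that
\[ A(x,y) = \sum_{n=1}^{\infty} x^n y^n + \sum_{n=1}^{\infty} \sum_{r=1}^{\infty} \sum_{i=1}^{\infty} (r + i - 1)\, a_{i,n-r}\, x^r y^n \tag{6.50} \]
\[ \phantom{A(x,y)} = \frac{xy}{1-xy} + \frac{(xy)^2}{(1-xy)^2}\, A(1,y) + \frac{xy}{1-xy}\, G(y), \tag{6.51} \]
where
\[ G(y) = \sum_{n=1}^{\infty} \sum_{i=1}^{\infty} i\, a_{i,n}\, y^n = \left. \frac{\partial}{\partial x} A(x,y) \right|_{x=1}. \tag{6.52} \]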
We now set $x = 1$ in (6.50) and obtain an equation involving $A(1,y)$ and $G(y)$,
namely
\[ A(1,y) = \frac{y}{1-y} + \frac{y^2}{(1-y)^2}\, A(1,y) + \frac{y}{1-y}\, G(y). \tag{6.53} \]
We next differentiate (6.50) with respect to $x$, and set $x = 1$. This gives us a second
equation,
\[ G(y) = \frac{y}{(1-y)^2} + \frac{2y^2}{(1-y)^3}\, A(1,y) + \frac{y}{(1-y)^2}\, G(y). \tag{6.54} \]
We now eliminate $G(y)$ from (6.53) and (6.54) to obtain
\[ A(1,y) = \frac{y(1-y)^3}{1 - 5y + 7y^2 - 4y^3}. \tag{6.55} \]
This formula shows that
\[ a_{n+3} = 5a_{n+2} - 7a_{n+1} + 4a_n \quad \text{for } n \ge 2. \tag{6.56} \]


Using the results of section 10 we can easily obtain from (6.55) an asymptotic
estimate
\[ a_n \sim c\,\alpha^n \quad \text{as } n \to \infty, \tag{6.57} \]
where $c$ is a certain constant and $\alpha = 3.205569\ldots$ is the inverse of the smallest
zero of $1 - 5y + 7y^2 - 4y^3$. ∎


For other methods and results related to polyomino enumeration, see Privman 
and Svrakic (1988, 1989). 
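As a sanity check on Example 6.5, the sketch below (function names are illustrative) computes $a_n$ directly from the recursion (6.47) together with (6.46), and then confirms the linear recurrence (6.56) for $n \ge 2$. Note that (6.56) fails for $n = 1$, because the numerator of (6.55) has degree 4.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def a_rn(r, n):
    """a_{r,n}: contribution to a_n of the compositions whose first part is r."""
    if r <= 0 or n <= 0 or r > n:
        return 0
    if r == n:
        return 1
    # recursion (6.47); terms with i > n - r vanish
    return sum((r + i - 1) * a_rn(i, n - r) for i in range(1, n - r + 1))

def a(n):
    """a_n via (6.46)."""
    return sum(a_rn(r, n) for r in range(1, n + 1))

def check_recurrence(N):
    """True if a_{n+3} = 5 a_{n+2} - 7 a_{n+1} + 4 a_n for 2 <= n <= N."""
    return all(a(n + 3) == 5 * a(n + 2) - 7 * a(n + 1) + 4 * a(n)
               for n in range(2, N + 1))
```

The first values are 1, 2, 6, 19, 61, 196, and the ratio of consecutive terms approaches the constant $\alpha$ of (6.57).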
6.2. Composition and inversion of power series 


So far we have only discussed simple operations on generating functions, such as
multiplication. What happens when we do something more complicated? There
are several frequently occurring operations on generating functions whose results 
can be described explicitly. 


Faà di Bruno's formula (Comtet 1974). Suppose that

\[ A(z) = \sum_{m=0}^{\infty} a_m \frac{z^m}{m!}, \qquad B(z) = \sum_{n=0}^{\infty} b_n \frac{z^n}{n!}, \tag{6.58} \]

are two exponential generating functions with $b_0 = 0$. Then the formal composition
$C(z) = A(B(z))$ is well-defined, and

\[ C(z) = \sum_{n=0}^{\infty} c_n \frac{z^n}{n!}, \tag{6.59} \]
with
\[ c_0 = a_0, \qquad c_n = \sum_{k=1}^{n} a_k B_{n,k}(b_1, b_2, \ldots, b_{n-k+1}), \tag{6.60} \]

where the $B_{n,k}$ are the exponential Bell polynomials defined by

\[ \sum_{n,k=0}^{\infty} B_{n,k}(x_1, \ldots, x_{n-k+1}) \frac{t^n u^k}{n!} = \exp\left( u \sum_{m=1}^{\infty} x_m \frac{t^m}{m!} \right), \tag{6.61} \]

with the $x_i$ independent variables.

Faà di Bruno's formula makes it possible to compute successive derivatives of
functions such as $\log A(z)$ in terms of the derivatives of $A(z)$. For further examples,
see Comtet (1974), Riordan (1958, 1968). Faà di Bruno's formula is derivable in a
straightforward way from the multinomial theorem.
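A short sketch of how (6.60) can be evaluated in practice follows. It assumes the standard recurrence $B_{n,k} = \sum_{i} \binom{n-1}{i-1} x_i B_{n-i,k-1}$ for the partial Bell polynomials, which is not stated in the text above; the function names are illustrative.

```python
from math import comb

def bell_partial(n, k, x):
    """B_{n,k}(x_1, ..., x_{n-k+1}); x is a list with x[i] = x_i (x[0] unused)
    and len(x) >= n + 1."""
    B = [[0] * (k + 1) for _ in range(n + 1)]
    B[0][0] = 1
    for m in range(1, n + 1):
        for j in range(1, min(m, k) + 1):
            B[m][j] = sum(comb(m - 1, i - 1) * x[i] * B[m - i][j - 1]
                          for i in range(1, m - j + 2))
    return B[n][k]

def compose(a, b, n):
    """c_n of (6.60) for EGF coefficient lists a[0..n], b[0..n] with b[0] == 0."""
    if n == 0:
        return a[0]
    return sum(a[k] * bell_partial(n, k, b) for k in range(1, n + 1))
```

With $A(z) = e^z$ and $B(z) = e^z - 1$, all $a_k = 1$ and all $b_n = 1$ for $n \ge 1$, so `compose` returns the Bell numbers, the coefficients of $\exp(e^z - 1)$.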

Composition of generating functions occurs frequently in combinatorics and
analysis of algorithms. When it yields the desired generating function as a composition
of several known generating functions, the basic problem is solved, and
one can work on the asymptotics of the coefficients using Faà di Bruno's formula
or other methods. A more frequent event is that the composition yields a
functional equation for the generating function, as in Example 6.1, where the exponential
generating function $t(z)$ for labeled rooted trees was shown to satisfy
$t(z) = z \exp(t(z))$. General functional equations are hard to deal with. (Many examples
will be presented later.) However, there is a class of them for which an
old technique, the Lagrange-Bürmann inversion formula, works well. We start by
noting that if

\[ f(z) = \sum_{n=0}^{\infty} f_n z^n \tag{6.62} \]
is a formal power series with $f_0 = 0$, $f_1 \ne 0$, then there is an inverse formal power
series $f^{\langle -1 \rangle}(z)$ such that

\[ f(f^{\langle -1 \rangle}(z)) = f^{\langle -1 \rangle}(f(z)) = z. \tag{6.63} \]

The coefficients of $f^{\langle -1 \rangle}(z)$ can be expressed explicitly in terms of the coefficients
of $f(z)$. More generally, we have the following result.


Lagrange-Bürmann inversion formula. Suppose that $f(z)$ is a formal power series
with $[z^0] f(z) = 0$, $[z^1] f(z) \ne 0$, and that $g(z)$ is any formal power series. Then for
$n \ge 1$,
\[ [z^n] \{ g(f^{\langle -1 \rangle}(z)) \} = n^{-1} [z^{n-1}] \{ g'(z) (f(z)/z)^{-n} \}. \tag{6.64} \]

In particular, for $g(z) = z$, we have

\[ [z^n] f^{\langle -1 \rangle}(z) = n^{-1} [z^{n-1}] (f(z)/z)^{-n}. \tag{6.65} \]
Example 6.6 (Rooted labeled trees). As was shown in Example 6.1, the exponential
generating function of rooted labeled trees satisfies $t(z) = z \exp(t(z))$. If
we rewrite it as $z = t(z) \exp(-t(z))$, we see that $t(z) = f^{\langle -1 \rangle}(z)$, where $f(z) =
z \exp(-z)$. Since $(f(z)/z)^{-n} = \exp(nz)$, eq. (6.65) yields

\[ [z^n] t(z) = n^{-1} [z^{n-1}] \exp(nz) = n^{-1} n^{n-1}/(n-1)! = n^{n-1}/n!, \tag{6.66} \]

which shows that $t_n$, the number of rooted labeled trees on $n$ nodes, is $n^{n-1}$. ∎
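The conclusion of Example 6.6 can be confirmed without the inversion formula by solving $t(z) = z \exp(t(z))$ coefficient by coefficient. The sketch below (illustrative names, exact rational arithmetic) iterates the functional equation on truncated power series; each pass fixes at least one more coefficient.

```python
from fractions import Fraction
from math import factorial

def series_exp(a, N):
    """Coefficients of exp(a(z)) up to z^N for a power series with a[0] == 0,
    using e' = a' e, i.e. n e_n = sum_{k=1}^{n} k a_k e_{n-k}."""
    e = [Fraction(0)] * (N + 1)
    e[0] = Fraction(1)
    for n in range(1, N + 1):
        e[n] = sum(k * a[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def tree_series(N):
    """Ordinary power series coefficients of t(z) = z exp(t(z)) up to z^N."""
    t = [Fraction(0)] * (N + 1)
    for _ in range(N):                      # pass m makes t correct through z^m
        e = series_exp(t, N)
        t = [Fraction(0)] + e[:N]           # multiply exp(t(z)) by z
    return t

def rooted_labeled_trees(N):
    """t_n = n! [z^n] t(z) for 1 <= n <= N."""
    t = tree_series(N)
    return [int(t[n] * factorial(n)) for n in range(1, N + 1)]
```

The output agrees with $n^{n-1}$, the value predicted by (6.66).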


Proof of a form of the Lagrange-Bürmann theorem is given in chapter 21. Extensive
discussion, proofs, and references are contained in Comtet (1974), Goulden
and Jackson (1983), Henrici (1974-86), and Whittaker and Watson (1927). Some
additional recent references are Garsia and Joni (1977) and Hofbauer (1979).
There exist generalizations of the Lagrange-Bürmann formula to several variables
(Goulden and Jackson 1983, Good 1960, Hofbauer 1979).

The Lagrange-Bürmann formula, as stated above, is valid for general formal
power series. If $f(z)$ is analytic in a neighborhood of the origin, then so are $f^{\langle -1 \rangle}(z)$
and $g(f^{\langle -1 \rangle}(z))$, provided $g(z)$ is also analytic near 0 and $f'(0) \ne 0$, $f(0) = 0$. Most
of the presentations of this inversion formula in the literature assume analyticity.
However, that is not a real restriction. To prove (6.65), say, in full generality, it
suffices to prove it for any $n$. Given $n$, if we let

\[ F(z) = \sum_{k=0}^{n} f_k z^k, \qquad G(z) = \sum_{k=0}^{n} g_k z^k, \]

then we see that

\[ [z^n] \{ g(f^{\langle -1 \rangle}(z)) \} = [z^n] \{ G(F^{\langle -1 \rangle}(z)) \}, \tag{6.67} \]

and $F(z)$ and $G(z)$ are analytic, so the formula (6.65) can be applied. Thus combinatorial
proofs of the Lagrange-Bürmann formula do not offer greater generality
than analytic ones.

While the analytic vs. combinatorial distinction in the proofs of the Lagrange-Bürmann
formula does not matter, it is possible to use analyticity of the functions
$f(z)$ and $g(z)$ to obtain useful information. Example 6.6 above was atypical in
that a simple explicit formula was derived. Often the quantity on the right-hand
side of (6.64) is not explicit enough to make clear its asymptotic behavior. When
that happens, and $g(z)$ and $f(z)$ are analytic, one can use the contour integral
representation

\[ [z^n] \{ g(f^{\langle -1 \rangle}(z)) \} = \frac{1}{2\pi i n} \int_{\Gamma} g'(z) f(z)^{-n}\, dz, \tag{6.68} \]

where $\Gamma$ is a positively oriented simple closed contour enclosing the origin that lies
inside the region of analyticity of both $g(z)$ and $f(z)$. This representation, which is
discussed in section 10, can often be used to obtain asymptotic information about
coefficients $[z^n] g(f^{\langle -1 \rangle}(z))$ (cf. Mallows et al. 1975).

The Lagrange-Bürmann formula can provide numerical approximations to roots
of equations and even convergent infinite-series representations for such roots. An
important case is the trinomial equation $y = z(1 + y^r)$, and there are many others.


Example 6.7 (Dominant zero for forbidden-subword generating functions). The
generating functions $F_A(z)$ and $G_A(z)$ of Example 6.4 both have denominators
\[ h(z) = z^k + (1 - 2z)C(z), \tag{6.69} \]

where $C(z)$ is a polynomial of degree $< k$, with coefficients 0 and 1, and with
$C(0) = 1$. It will be shown later that $h(z)$ has only one zero $\rho$ of small absolute
value, and that this zero is the dominant influence on the asymptotic behavior of
the coefficients of $F_A(z)$ and $G_A(z)$. Right now we obtain accurate estimates for
$\rho$.

For simplicity, we will consider only large $k$. Since $C(z)$ has nonnegative coefficients
and $C(0) = 1$, $h(3/4) \le (3/4)^k - 1/2 < 0$ for $k \ge 3$. On the other hand,
$h(1/2) = 2^{-k}$. Therefore $h(z)$ has a real zero $\rho$ with $1/2 < \rho < 3/4$. As $k \to \infty$,
$\rho \to 1/2$, since

\[ \rho^k = (2\rho - 1)C(\rho), \tag{6.70} \]

and $\rho^k \to 0$ as $k \to \infty$ for $1/2 < \rho < 3/4$, while $C(\rho) \ge 1$, so that
$2\rho - 1 = \rho^k / C(\rho) \to 0$. We can deduce from (6.69) that

\[ 2\rho - 1 \sim 2^{-k} C(1/2)^{-1} \quad \text{as } k \to \infty, \tag{6.71} \]
uniformly for all polynomials $C(z)$ of the prescribed type. By applying the bootstrapping
technique (see section 5.4) we can find even better approximations. By
(6.71),

\[ C(\rho) = C(1/2) + O(|\rho - 1/2|) = C(1/2) + O(2^{-k}), \tag{6.72} \]
\[ \rho^k = 2^{-k} (1 + O(2^{-k}))^k = 2^{-k} (1 + O(k 2^{-k})), \tag{6.73} \]
so (6.70) now yields

\[ \rho = 1/2 + 2^{-k-1} C(1/2)^{-1} + O(k 2^{-2k}). \tag{6.74} \]

Even better approximations can be obtained by repeating the process using (6.74).
At the next stage we would apply the expansion

\[ C(\rho) = C(1/2) + (\rho - 1/2) C'(1/2) + O((\rho - 1/2)^2) = C(1/2) + 2^{-k-1} C'(1/2) C(1/2)^{-1} + O(k 2^{-2k}) \tag{6.75} \]

and a similar one for $\rho^k$.
A more systematic way to obtain a rapidly convergent series for $\rho$ is to use the
inversion formula. If we set $u = \rho - 1/2$, then (6.70) can be rewritten as $w(u) = 1$,
where

\[ w(u) = 2u\, C(1/2 + u) (1/2 + u)^{-k} = \sum_{j=1}^{\infty} a_j u^j, \tag{6.76} \]
with

\[ a_1 = 2^{k+1} C(1/2) \ne 0. \tag{6.77} \]

Hence $u = w^{\langle -1 \rangle}(1)$, and the Lagrange-Bürmann inversion formula (6.65) yields
the coefficients of $w^{\langle -1 \rangle}(z)$. In particular, we find that

\[ \rho = 1/2 + u \approx 1/2 + 2^{-k-1} C(1/2)^{-1} + k 2^{-2k-1} C(1/2)^{-2} - 2^{-2k-2} C'(1/2) C(1/2)^{-3} + \cdots \tag{6.78} \]

as a Poincaré asymptotic series. With additional work one can show that the series
(6.78) converges, and that

\[ \rho = 1/2 + 2^{-k-1} C(1/2)^{-1} + k 2^{-2k-1} C(1/2)^{-2} - 2^{-2k-2} C'(1/2) C(1/2)^{-3} + O(k^2 2^{-3k}), \tag{6.79} \]

for example. The same estimate can be obtained by the bootstrapping technique.
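The quality of the approximations (6.74) and (6.79) is easy to examine numerically. The sketch below is an illustration only: it takes a correlation polynomial given as a 0/1 coefficient list of length $k$ (a hypothetical input format), locates $\rho$ by bisection on $[1/2, 3/4]$ in floating point, and evaluates the truncated expansions.

```python
def poly_value(c, z):
    return sum(cj * z ** j for j, cj in enumerate(c))

def poly_derivative(c, z):
    return sum(j * cj * z ** (j - 1) for j, cj in enumerate(c) if j > 0)

def rho_bisect(c):
    """Zero of h(z) = z^k + (1 - 2z) C(z) in (1/2, 3/4), where k = len(c) >= 3."""
    k = len(c)
    h = lambda z: z ** k + (1 - 2 * z) * poly_value(c, z)
    lo, hi = 0.5, 0.75                      # h(lo) > 0 > h(hi)
    for _ in range(200):
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def rho_series(c, terms):
    """Partial sums of (6.78): terms=2 gives (6.74), terms=3 the main part of (6.79)."""
    k = len(c)
    C = poly_value(c, 0.5)
    C1 = poly_derivative(c, 0.5)
    parts = [0.5,
             2.0 ** (-k - 1) / C,
             k * 2.0 ** (-2 * k - 1) / C ** 2 - 2.0 ** (-2 * k - 2) * C1 / C ** 3]
    return sum(parts[:terms])
```

For the pattern consisting of ten 1's, where $C(z) = 1 + z + \cdots + z^9$, the two-term approximation is already accurate to about $10^{-6}$ and the next correction gains roughly two more digits.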


6.3. Differentiably finite power series 


Sequences that satisfy homogeneous linear recurrences with constant coefficients
form the nicest large class of sequences one can imagine, with rational generating
functions and well-understood asymptotic behavior. The next class in complexity
consists of the polynomially recursive, or P-recursive, sequences $a_0, a_1, \ldots$, which
satisfy recurrences of the form

\[ p_d(n) a_{n+d} + p_{d-1}(n) a_{n+d-1} + \cdots + p_0(n) a_n = 0, \quad n \ge 0, \tag{6.80} \]

where $d$ is fixed and $p_0(n), \ldots, p_d(n)$ are polynomials in $n$. Such sequences are
common in combinatorics, with $a_n = n!$ a simple example. Normally P-recursive
sequences do not have explicit forms for their generating functions. In this section
we briefly summarize some of their main properties. Asymptotic properties of
P-recursive sequences will be discussed in section 9.2. The main references for the
results quoted here are Lipshitz (1989) and Stanley (1980).

A formal power series

\[ f(z) = \sum_{k=0}^{\infty} a_k z^k \tag{6.81} \]

is called differentiably finite, or D-finite, if the derivatives $f^{(n)}(z) = d^n f(z)/dz^n$, $n \ge 0$,
span a finite-dimensional vector space over the field of rational functions with
complex coefficients. The following three conditions are equivalent for a formal
power series $f(z)$:
(i) $f(z)$ is D-finite.
(ii) There exist finitely many polynomials $q_0(z), \ldots, q_k(z)$ and a polynomial
$q(z)$, not all 0, such that
\[ q_k(z) f^{(k)}(z) + \cdots + q_0(z) f(z) = q(z). \tag{6.82} \]
(iii) There exist finitely many polynomials $p_0(z), \ldots, p_m(z)$, not all 0, such that
\[ p_m(z) f^{(m)}(z) + \cdots + p_0(z) f(z) = 0. \tag{6.83} \]

The most important result for combinatorial enumeration is that a sequence
$a_0, a_1, \ldots$ is P-recursive if and only if its ordinary generating function $f(z)$, defined
by (6.81), is D-finite. This makes it possible to apply results that are more easily
proved for D-finite power series.

If $f(z)$ is D-finite, then so is the power series obtained by changing a finite
number of the coefficients of $f(z)$. If $f(z)$ is algebraic (i.e., there exist polynomials
$q_0(z), \ldots, q_d(z)$, not all 0, such that $q_d(z) f(z)^d + \cdots + q_1(z) f(z) + q_0(z) = 0$), then
$f(z)$ is D-finite. The product of two D-finite power series is also D-finite, as is any
linear combination with polynomial coefficients. Finally, the Hadamard product
of two D-finite series is D-finite. The proofs rely on elementary linear-algebra
constructions. An important feature of the theory is that identity between D-finite
series is decidable.
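As a concrete instance of these statements, consider the Catalan numbers, chosen here purely for illustration. They satisfy the P-recurrence $(n+2) a_{n+1} = (4n+2) a_n$, their generating function is algebraic ($f = 1 + z f^2$), and it satisfies a differential equation of the form (6.82), namely $z(1-4z) f'(z) + (1-2z) f(z) = 1$. The sketch below checks all three facts on an initial segment of coefficients.

```python
def catalan_by_recurrence(N):
    """a_0..a_N from the P-recurrence (n + 2) a_{n+1} = (4n + 2) a_n, a_0 = 1."""
    a = [1]
    for n in range(N):
        a.append(a[n] * (4 * n + 2) // (n + 2))   # the division is exact
    return a

def algebraic_equation_holds(a):
    """f = 1 + z f^2 means a_{n+1} = sum_i a_i a_{n-i}."""
    return all(a[n + 1] == sum(a[i] * a[n - i] for i in range(n + 1))
               for n in range(len(a) - 1))

def ode_residual(a, n):
    """Coefficient of z^n in z(1 - 4z) f'(z) + (1 - 2z) f(z) - 1."""
    prev = a[n - 1] if n >= 1 else 0
    value = n * a[n] - 4 * (n - 1) * prev + a[n] - 2 * prev
    return value - (1 if n == 0 else 0)
```

The same pattern (recurrence, algebraic equation, differential equation) can be checked for other algebraic generating functions by changing the three formulas accordingly.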

The concept of a D-finite power series can be extended to several variables 
(Lipshitz 1989, Zeilberger 1990), and there are generalizations of P-recursiveness 
(Lipshitz 1989, Zeilberger 1990). (See also Gessel 1990.) Zeilberger (1990) has used 
the word holonomic to describe corresponding sequences and generating functions. 

When we investigate a sequence $\{a_n\}$, sometimes the combinatorial context
yields only relations for a more complicated object with several indices. While
we might like to obtain the generating function $f(z) = \sum a_n z^n$, we might instead
find a formula for a generating function

\[ F(z_1, z_2, \ldots, z_k) = \sum_{n_1, \ldots, n_k} b_{n_1, \ldots, n_k}\, z_1^{n_1} z_2^{n_2} \cdots z_k^{n_k}, \tag{6.84} \]

where $a_n = b_{n,n,\ldots,n}$, say. When this happens, we say that $f(z)$ is a diagonal of
$F(z_1, \ldots, z_k)$. [There are more general definitions of diagonals in Denef and Lipshitz
(1987), Lipshitz (1988, 1989), and Lipshitz and van der Poorten (1990), which
are recent references for this topic.] Diagonals of D-finite power series in any
number of variables are D-finite. Diagonals of two-variable rational functions are
algebraic, but there are three-variable rational functions whose diagonals are not
algebraic (Furstenberg 1967).
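A standard small example of a diagonal, added here for illustration, is $F(x,y) = 1/(1-x-y)$, whose coefficients satisfy $b_{m,n} = b_{m-1,n} + b_{m,n-1}$. Its diagonal consists of the central binomial coefficients, with algebraic generating function $(1-4z)^{-1/2}$. The sketch verifies the P-recurrence and the algebraic relation $(1-4z) f(z)^2 = 1$ on an initial segment.

```python
def diagonal_of_inverse_linear(N):
    """Diagonal coefficients b_{n,n}, 0 <= n <= N, of 1/(1 - x - y)."""
    b = [[0] * (N + 1) for _ in range(N + 1)]
    for m in range(N + 1):
        for n in range(N + 1):
            if m == 0 and n == 0:
                b[m][n] = 1
            else:
                b[m][n] = (b[m - 1][n] if m else 0) + (b[m][n - 1] if n else 0)
    return [b[n][n] for n in range(N + 1)]

def square_coefficients(a):
    """Coefficients of f(z)^2 up to the length of a."""
    return [sum(a[i] * a[n - i] for i in range(n + 1)) for n in range(len(a))]
```

Since $f(z)^2 = 1/(1-4z)$, the coefficients of the square must be the powers of 4.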


6.4. Unimodality and log-concavity 


A finite sequence $a_0, a_1, \ldots, a_n$ of real numbers is called unimodal if for some
index $k$, $a_0 \le a_1 \le \cdots \le a_k$ and $a_k \ge a_{k+1} \ge \cdots \ge a_n$. A sequence $a_0, \ldots, a_n$ of
nonnegative elements is called log-concave (short for logarithmically concave) if
$a_j^2 \ge a_{j-1} a_{j+1}$ holds for $1 \le j \le n - 1$. Unimodal and log-concave sequences occur
frequently in combinatorics and are objects of intensive study. We present a brief
review of some of their properties because asymptotic methods are often used
to prove unimodality and log-concavity. Furthermore, knowledge that a sequence
is log-concave or unimodal is often helpful in obtaining asymptotic information.
For example, some methods provide only asymptotic estimates for summatory
functions of sequences, and unimodality helps in obtaining from those estimates
bounds on individual coefficients. This approach will be presented in section 13,
in the discussion of central and local limit theorems.

The basic references for unimodality and log-concavity are Karlin (1968) and 
Stanley (1989). For recent results, see also Brenti (1989) and the references given 
there. All the results listed below can be found in those sources and the references 
they list. 

In the rest of this subsection we will consider only sequences of nonnegative
elements. A sequence $a_0, \ldots, a_n$ will be said to have no internal zeros if there
is no triple of integers $0 \le i < j < k \le n$ such that $a_j = 0$, $a_i a_k \ne 0$. It is easy to
see that a log-concave sequence with no internal zeros is unimodal, but there are
sequences of positive elements that are unimodal but not log-concave. The convolution
of two unimodal sequences does not have to be unimodal. However, it is unimodal
if each of the two unimodal sequences is also symmetric. Convolution of two
log-concave sequences is log-concave. The convolution of a log-concave and a
unimodal sequence is unimodal. A log-concave sequence is even characterized by
the property that its convolution with any unimodal sequence is unimodal. This last
property is related to the variation-diminishing character of log-concave sequences
(see Karlin 1968), which we will not discuss at greater length here except to note
that there are more restrictive sets of sequences (the Pólya frequency classes, see
Brenti 1989, Karlin 1968) which have stronger convolution properties.
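The definitions and the convolution statements above translate directly into a few lines of code. The sketch below uses small hand-picked sequences (not taken from the text): $(1,1,1,3)$ is unimodal but not log-concave, and its convolution with itself is not unimodal.

```python
def is_unimodal(a):
    """True if a rises weakly to some index and then falls weakly."""
    i, n = 0, len(a)
    while i + 1 < n and a[i] <= a[i + 1]:
        i += 1
    while i + 1 < n and a[i] >= a[i + 1]:
        i += 1
    return i == n - 1

def is_log_concave(a):
    """True if a_j^2 >= a_{j-1} a_{j+1} for all interior indices j."""
    return all(a[j] ** 2 >= a[j - 1] * a[j + 1] for j in range(1, len(a) - 1))

def convolve(a, b):
    """Coefficients of the product of the polynomials with coefficients a and b."""
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return c
```

For instance, `convolve([1, 1, 1, 3], [1, 1, 1, 3])` gives `[1, 2, 3, 8, 7, 6, 9]`, which rises, falls, and rises again, while convolving the same sequence with the log-concave row `[1, 4, 6, 4, 1]` of binomial coefficients yields a unimodal result.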

The binomial coefficients $\binom{n}{k}$, $0 \le k \le n$, are log-concave, and therefore
unimodal. The $q$-binomial coefficients $\binom{n}{k}_q$ are log-concave for any $q \ge 1$. On the
other hand, if we write a single coefficient $\binom{n}{k}_q$ for fixed $n$ and $k$ as a polynomial
in $q$, the sequence of coefficients is unimodal, but does not have to be log-concave.

The most frequently used method for showing that a sequence $a_0, \ldots, a_n$ is
log-concave is to show that all the zeros of the polynomial
\[ A(z) = \sum_{k=0}^{n} a_k z^k \tag{6.85} \]
are real and $\le 0$. In that case not only are the $a_k$ log-concave, but so are the $a_k \binom{n}{k}^{-1}$.
Absolute values of the Stirling numbers of both kinds were first shown to be
log-concave by this method (Harper 1967). There are many unsolved conjectures about
log-concavity of combinatorial sequences, such as the Read-Hoggar conjecture that
coefficients of chromatic polynomials are log-concave (cf. Brenti et al. 1994).
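The real-zeros criterion can be tried out on the unsigned Stirling numbers of the first kind, which are the coefficients of $z(z+1)\cdots(z+n-1)$, a polynomial whose zeros $0, -1, \ldots, -(n-1)$ are real and nonpositive. The sketch below (illustrative names) builds the coefficients from the zeros and tests log-concavity of both $a_k$ and $a_k \binom{n}{k}^{-1}$ with exact arithmetic.

```python
from fractions import Fraction
from math import comb

def poly_from_nonpositive_zeros(zeros):
    """Coefficients (lowest degree first) of prod (z + r) over r in zeros, r >= 0,
    i.e. the monic polynomial whose zeros are -r."""
    coeffs = [1]
    for r in zeros:
        shifted = [0] + coeffs                  # z * p(z)
        scaled = [r * c for c in coeffs] + [0]  # r * p(z)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def log_concave(seq):
    return all(seq[j] ** 2 >= seq[j - 1] * seq[j + 1]
               for j in range(1, len(seq) - 1))

def normalized(coeffs):
    """The sequence a_k / binom(n, k) for a polynomial of degree n."""
    n = len(coeffs) - 1
    return [Fraction(a, comb(n, k)) for k, a in enumerate(coeffs)]
```

For $n = 5$ the coefficients are 0, 24, 50, 35, 10, 1, and the normalized sequence 0, 24/5, 5, 7/2, 2, 1 is again log-concave.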

A variety of combinatorial, algebraic, and geometric methods have been used 
to prove unimodality of sequences, and we refer the reader to Stanley (1989) for 
a comprehensive and insightful survey. In section 12.3 we will discuss briefly some 
proofs of unimodality and log-concavity that use asymptotic methods. The basic 
philosophy is that since the Gaussian distribution is log-concave and unimodal
(when we extend the definition of these concepts to continuous distributions), 
these properties should also hold for sequences that by the central limit theorem 
or its variants are asymptotic to the Gaussian. Therefore one can expect high-order 
convolutions of sequences to be log-concave at least in their central region, and 
there are theorems that prove this under certain conditions. 


6.5. Moments and distributions 


The second-moment method is a frequently used technique in probabilistic arguments,
as is shown in chapter 33 and Bollobás (1985), Erdős and Spencer (1974),
and Spencer (1987). It is based on Chebyshev's inequality, which says that if $X$ is
a real-valued random variable with finite second moment $E(X^2)$, then

\[ \mathrm{Prob}\left( |X - E(X)| \ge \alpha |E(X)| \right) \le \frac{E(X^2) - E(X)^2}{\alpha^2 E(X)^2}. \tag{6.86} \]
An easy corollary of inequality (6.86) that is often used is
\[ \mathrm{Prob}(X = 0) \le \frac{E(X^2) - E(X)^2}{E(X)^2}. \tag{6.87} \]

[There is a slightly stronger version of the inequality (6.87), in which $E(X)^2$ in
the denominator is replaced by $E(X^2)$.] The inequalities (6.86) and (6.87) are
usually applied for $X = Y_1 + \cdots + Y_n$, where the $Y_i$ are other random variables.
The helpful feature of the inequalities is that they require only knowledge of the
pairwise dependencies among the $Y_i$, which is easier to study than the full joint
distribution of the $Y_i$. For other bounds on distributions that can be obtained from
partial information about moments, see Shohat and Tamarkin (1943).
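A toy instance of (6.87), chosen for illustration: let $X$ be the number of (possibly overlapping) occurrences of a fixed pattern in $n$ fair coin tosses, so that $X$ is a sum of indicator variables $Y_i$. The sketch below computes $E(X)$, $E(X^2)$ and $\mathrm{Prob}(X = 0)$ exactly by enumerating all $2^n$ outcomes, so that both (6.87) and its slightly stronger variant can be compared with the true probability.

```python
from fractions import Fraction
from itertools import product

def moment_summary(n, pattern="11"):
    """Exact (E(X), E(X^2), Prob(X = 0)) for X = number of occurrences of
    `pattern` (overlaps counted) in n fair coin tosses."""
    k, total = len(pattern), 2 ** n
    s1 = s2 = zeros = 0
    for bits in product("01", repeat=n):
        w = "".join(bits)
        x = sum(w[i:i + k] == pattern for i in range(n - k + 1))
        s1 += x
        s2 += x * x
        zeros += (x == 0)
    return Fraction(s1, total), Fraction(s2, total), Fraction(zeros, total)

def second_moment_bounds(n, pattern="11"):
    """Right-hand side of (6.87) and of the variant with E(X^2) in the denominator."""
    m1, m2, _ = moment_summary(n, pattern)
    variance = m2 - m1 ** 2
    return variance / m1 ** 2, variance / m2
```

For small $n$ the bounds are far from sharp; the method is useful when the variance is small compared with $E(X)^2$.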

The reason moment bounds are mentioned at all in this chapter is that asymp- 
totic methods are often used to derive them. Generating functions are a common 
and convenient method for doing this. 


Example 6.8 (Waiting times for subwords). In a continuation and application of
Example 6.4, let $A$ be a binary string of length $k$. How many tosses of a fair coin
(with sides labeled 0 and 1) are needed on average before $A$ appears as a block
of $k$ consecutive outcomes? By a general observation of probability theory, this is
just the sum over $n \ge 0$ of the probability that $A$ does not appear in the first $n$
coin tosses, and thus equals

\[ \sum_{n=0}^{\infty} f_A(n) 2^{-n} = F_A(1/2) = 2^k C_A(1/2), \tag{6.88} \]

where the last equality follows from eq. (6.38). Another, more general, way to
derive this is to use $G_A(z)$. Note that $g_A(n) 2^{-n}$ is the probability that $A$ appears
in the first $n$ coin tosses, but not in the first $n - 1$. Hence the $r$th moment of the
time until $A$ appears is

\[ \sum_{n=0}^{\infty} n^r g_A(n) 2^{-n} = \left. \left( z \frac{d}{dz} \right)^{r} G_A(z) \right|_{z=1/2}. \tag{6.89} \]

If we take $r = 1$, we again obtain the expected waiting time given by (6.88). When
we take $r = 2$, we find that the second moment of the time until the appearance
of $A$ is

\[ \sum_{n=0}^{\infty} n^2 g_A(n) 2^{-n} = 2^{2k+1} C_A(1/2)^2 - (2k - 1) 2^k C_A(1/2) + 2^k C_A'(1/2), \tag{6.90} \]
and therefore the variance is
\[ 2^{2k} C_A(1/2)^2 - (2k - 1) 2^k C_A(1/2) + 2^k C_A'(1/2) = 2^{2k} C_A(1/2)^2 + O(k 2^k), \tag{6.91} \]

since $1 \le C_A(1/2) < 2$. Higher moments can be used to obtain more detailed
information. A better approach is to use the method of Example 9.3, which gives
precise estimates for the tails as well as the mean of the distribution. ∎
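The formulas (6.88), (6.90) and (6.91) are simple to evaluate exactly. The following sketch assumes the usual definition of the correlation polynomial $C_A(z) = \sum_j c_A(j) z^j$, with $c_A(j) = 1$ when $A$ shifted by $j$ places agrees with itself on the overlap; the function names are illustrative.

```python
from fractions import Fraction

def correlation(A):
    """c_A(j) = 1 when A shifted right by j places agrees with A on the overlap."""
    k = len(A)
    return [1 if A[j:] == A[:k - j] else 0 for j in range(k)]

def waiting_time_moments(A):
    """(mean, variance) of the number of fair coin tosses until A first appears,
    from (6.88) and (6.90)."""
    k, c, half = len(A), correlation(A), Fraction(1, 2)
    C = sum(cj * half ** j for j, cj in enumerate(c))
    C1 = sum(j * cj * half ** (j - 1) for j, cj in enumerate(c) if j > 0)
    mean = 2 ** k * C
    second = 2 ** (2 * k + 1) * C ** 2 - (2 * k - 1) * 2 ** k * C + 2 ** k * C1
    return mean, second - mean ** 2
```

For example, the pattern 11 has mean waiting time 6 and variance 22, while 10 has mean 4 and variance 4, which matches the familiar values for "two heads in a row" versus "head then tail".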


Information about moments of distribution functions can often be used to obtain
the limiting distribution. If $F_n(x)$ is a sequence of distribution functions such that
for every integer $k \ge 0$, the $k$th moment

\[ \mu_n(k) = \int x^k \, dF_n(x) \tag{6.92} \]

converges to $\mu(k)$ as $n \to \infty$, then there is a limiting measure with distribution
function $F(x)$ whose $k$th moment is $\mu(k)$. If the moments $\mu(k)$ do not grow too
rapidly, then they determine the distribution function $F(x)$ uniquely, and the $F_n(x)$
converge to $F(x)$ (in the weak star sense, Billingsley 1979). A sufficient condition
for the $\mu(k)$ to determine $F(x)$ uniquely is that the generating function

\[ U(x) = \sum_{k=0}^{\infty} \frac{\mu(2k) x^k}{(2k)!} \tag{6.93} \]

should converge for some $x > 0$. In particular, the standard normal distribution
with

\[ F(x) = (2\pi)^{-1/2} \int_{-\infty}^{x} \exp(-u^2/2) \, du \tag{6.94} \]
has $\mu(2k) = 1 \cdot 3 \cdot 5 \cdot 7 \cdots (2k - 1)$ (and $\mu(2k+1) = 0$), so it is determined
uniquely by its moments. On the other hand, there are some frequently encountered
distributions, such as the log-normal one, which do not have this property.
7. Formal power series


This section discusses generating functions $f(z)$ that might not converge in any
interval around the origin. Sequences that grow rapidly are common in combinatorics,
with $a_n = n!$ the most obvious example for which

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n \tag{7.1} \]

does not converge for any $z \ne 0$. The usual way to deal with the problem of a
rapidly growing sequence $a_n$ is to study the generating function of $a_n/b_n$, where
$b_n$ is some sequence with known asymptotic behavior. When $b_n = n!$, the ordinary
generating function of $a_n/b_n$ is then the exponential generating function of $a_n$. For
derangements [eqs. (1.1) and (6.7)] this works well, as the exponential generating
function of $d_n$ converges in $|z| < 1$ and has a nice form. Unfortunately, while we
can always find a sequence $b_n$ that will make the ordinary generating function
$f(z)$ of $a_n/b_n$ converge (even for all $z$), usually we cannot do it in a way that will
yield any useful information about $f(z)$. The combinatorial structure of a problem
almost always severely restricts what forms of generating function can be used to
take advantage of the special properties of the problem. This difficulty is common,
for example, in enumeration of labeled graphs. In such cases one often resorts to
formal power series that do not converge in any neighborhood of the origin. For
example, if $c(n,k)$ is the number of connected labeled graphs on $n$ vertices with $k$
edges, then it is well known (cf. Stanley 1978) that

\[ \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} c(n,k) \frac{x^k y^n}{n!} = \log\left( \sum_{m=0}^{\infty} (1+x)^{\binom{m}{2}} \frac{y^m}{m!} \right). \tag{7.2} \]

While the series inside the log in (7.2) does converge for $-2 \le x \le 0$, and any $y$,
it diverges for any $x > 0$ as long as $y \ne 0$, and so this is a relation of formal power
series.
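Even though the series in (7.2) diverges for $x > 0$, the relation is perfectly usable for computing coefficients. As an illustration, set $x = 1$, so that the series inside the logarithm becomes $\sum_m 2^{\binom{m}{2}} y^m/m!$ and the left side counts all connected labeled graphs on $n$ vertices. Differentiating $G = \exp(C)$ with respect to $y$ gives the recurrence $g_n = \sum_{k=1}^{n} \binom{n-1}{k-1} c_k g_{n-k}$, which the sketch below (illustrative names) solves for $c_n$.

```python
from math import comb

def connected_labeled_graphs(N):
    """c[n] = number of connected labeled graphs on n vertices, 1 <= n <= N,
    obtained from the formal relation G(y) = exp(C(y)) with g_n = 2^binom(n,2)."""
    g = [2 ** comb(n, 2) for n in range(N + 1)]
    c = [0] * (N + 1)
    for n in range(1, N + 1):
        c[n] = g[n] - sum(comb(n - 1, k - 1) * c[k] * g[n - k]
                          for k in range(1, n))
    return c
```

The first values are 1, 1, 4, 38, 728, 26704, and $c_n / 2^{\binom{n}{2}}$ tends to 1, reflecting the fact that almost all labeled graphs are connected.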

There are few methods for dealing with asymptotics of formal power series, at 
least when compared to the wealth of techniques available for studying analytic 
generating functions. Fortunately, combinatorial enumeration problems that do 
require the use of formal power series often involve rapidly growing sequences 
of positive terms, for which some simple techniques apply. We start with an easy 
general result that is applicable both to convergent and purely formal power series. 


Theorem 7.1 (Bender 1974). Suppose that $a(z) = \sum a_n z^n$ and $b(z) = \sum b_n z^n$ are
power series with radii of convergence $\alpha > \beta \ge 0$, respectively. Suppose that
$b_{n-1}/b_n \to \beta$ as $n \to \infty$. If $a(\beta) \ne 0$, and $\sum c_n z^n = a(z) b(z)$, then

\[ c_n \sim a(\beta)\, b_n \quad \text{as } n \to \infty. \tag{7.3} \]

The proof of Theorem 7.1, which can be found in Bender (1974), is simple. The
condition $\alpha > \beta$ is important, and cannot be replaced by $\alpha = \beta$. We can have $\beta = 0$,
and that is indeed the only possibility if the series for $b(z)$ does not converge in a
neighborhood of $z = 0$.
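A quick numerical illustration of Theorem 7.1 in the purely formal case $\beta = 0$ (the example is ours, not from the text): take $a(z) = e^z$, which is entire, and $b(z) = \sum n! z^n$, so that $b_{n-1}/b_n = 1/n \to 0$ and $a(0) = 1$. The theorem then predicts $c_n \sim n!$ for the coefficients $c_n = \sum_k (n-k)!/k!$ of the product.

```python
from fractions import Fraction
from math import factorial

def product_coefficient(n):
    """c_n = [z^n] e^z * sum_m m! z^m = sum_{k=0}^{n} (n-k)!/k!."""
    return sum(Fraction(factorial(n - k), factorial(k)) for k in range(n + 1))

def ratio(n):
    """c_n / b_n with b_n = n!; Theorem 7.1 predicts convergence to a(0) = 1."""
    return product_coefficient(n) / factorial(n)
```

The ratio behaves like $1 + 1/n + O(n^{-2})$, so the convergence is slow but clearly visible.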
Example 7.2 (Double set coverings, Bender 1974, Comtet 1968). Let $v_n$ be the number
of choices of subsets $S_1, \ldots, S_r$ of an $n$-element set $T$ such that each $t \in T$ is
in exactly two of the $S_i$. There is no restriction on $r$, the number of subsets, and
some of the $S_i$ can be repeated. Let $c_n$ be the corresponding number when the
$S_i$ are required to be distinct. We let $C(z) = \sum c_n z^n/n!$, $V(z) = \sum v_n z^n/n!$ be the
exponential generating functions. Then it can be shown that

\[ C(z) = \exp(-1 - (e^z - 1)/2)\, A(z), \tag{7.4} \]
\[ V(z) = \exp(-1 + (e^z - 1)/2)\, A(z), \tag{7.5} \]
where
\[ A(z) = \sum_{k=0}^{\infty} \exp(k(k-1)z/2)/k!. \tag{7.6} \]

We see immediately that $A(z)$ does not converge in any neighborhood of the
origin. We have

\[ a_n = [z^n] A(z) = \frac{1}{2^n n!} \sum_{k=2}^{\infty} \frac{k^n (k-1)^n}{k!}. \tag{7.7} \]

By considering the ratio of consecutive terms in the sum in (7.7), we find that the
largest term occurs for $k = k_0$ with $k_0 \log k_0 \sim 2n$, and by the methods of section 5.1
we find that

\[ a_n \sim \frac{\pi^{1/2} k_0^n (k_0 - 1)^n}{n^{1/2}\, 2^n (k_0 - 1)!\; n!} \quad \text{as } n \to \infty. \tag{7.8} \]

Therefore $a_{n-1}/a_n \to 0$ as $n \to \infty$, and Theorem 7.1 tells us that
\[ c_n \sim v_n \sim e^{-1} n!\, a_n \quad \text{as } n \to \infty. \tag{7.9} \]
∎


Usually formal power series occur in more complicated relations than those
covered by Theorem 7.1. For example, if $f_n$ is the number of connected graphs on
$n$ labeled vertices which have some property, and $F_n$ is the number of graphs on $n$
labeled vertices each of whose connected components has that property, then (cf.
Wright 1970)

\[ 1 + \sum_{n=1}^{\infty} F_n \frac{x^n}{n!} = \exp\left( \sum_{n=1}^{\infty} f_n \frac{x^n}{n!} \right). \tag{7.10} \]


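Theorem 7.3 (Bender 1975). Suppose that

\[ a(x) = \sum_{n \ge 1} a_n x^n, \qquad F(x,y) = \sum_{h,k \ge 0} f_{hk}\, x^h y^k, \]
\[ b(x) = \sum_{n \ge 0} b_n x^n = F(x, a(x)), \qquad D(x) = \sum_{n \ge 0} d_n x^n = F_y(x, a(x)), \tag{7.11} \]
where $F_y(x,y)$ is the partial derivative of $F(x,y)$ with respect to $y$. Assume that
$a_n \ne 0$ and

\[ \text{(i)} \quad a_{n-1} = o(a_n) \quad \text{as } n \to \infty, \tag{7.12} \]
\[ \text{(ii)} \quad \sum_{k=r}^{n-r} |a_k a_{n-k}| = O(a_{n-r}) \quad \text{for some } r > 0, \tag{7.13} \]

(iii) for every $\delta > 0$ there are $M(\delta)$ and $K(\delta)$ such that for $n \ge M(\delta)$ and $h + k >
r + 1$,

\[ |f_{hk}\, a_{n-h-k+1}| \le K(\delta)\, \delta^{h+k} |a_{n-r}|. \tag{7.14} \]
Then
\[ b_n = \sum_{k=0}^{r-1} d_k a_{n-k} + O(a_{n-r}). \tag{7.15} \]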


Condition (iii) of Theorem 7.3 is often hard to verify. Theorem 2 of Bender
(1975) shows that this condition holds under certain simpler hypotheses. It follows
from that result that (iii) is valid if $F(x,y)$ is analytic in $x$ and $y$ in a neighborhood
of $(0,0)$. Hence, if $F(x,y) = \exp(y)$ or $F(x,y) = 1 + y$, then Theorem 7.3
becomes easy to apply. One can also deduce from Theorem 2 of Bender (1975)
that Theorem 7.3 applies when (i) and (ii) hold, $b_0 = 0$, $b_n \ge 0$, and

\[ 1 + a(z) = \exp\left( \sum_{k=1}^{\infty} b(z^k)/k \right), \tag{7.16} \]
another relation that is common in graph enumeration (cf. Example 15.1). There
are also some results weaker than Theorem 7.3 that are easier to apply (Wright
1967).


Example 7.4 (Indecomposable permutations, Comtet 1974). For every permutation
$\sigma$ of $\{1, \ldots, n\}$, let $\{1, \ldots, n\} = \bigcup I_h$, where the $I_h$ are the smallest intervals such that
$\sigma(I_h) = I_h$ for all $h$. For example, $\sigma = (134)(2)(56)$ corresponds to $I_1 = \{1, 2, 3, 4\}$,
$I_2 = \{5, 6\}$, and the identity permutation has $n$ components. A permutation is said
to be indecomposable if it has one component. For example, if $\sigma$ has the 2-cycle
$(1\,n)$, it is indecomposable. Let $c_n$ be the number of indecomposable permutations
of $\{1, \ldots, n\}$. Then (Comtet 1974):

\[ \sum_{n=1}^{\infty} c_n z^n = 1 - \frac{1}{\sum_{n=0}^{\infty} n!\, z^n}. \tag{7.17} \]
We apply Theorem 7.3 with $a_n = n!$ for $n \ge 1$ and $F(x,y) = 1 - (1+y)^{-1}$. We easily
obtain
\[ c_n \sim n! \quad \text{as } n \to \infty, \tag{7.18} \]
so that almost all permutations are indecomposable. ∎
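Multiplying (7.17) through by $\sum n! z^n$ gives the recurrence $n! = \sum_{k=1}^{n} c_k (n-k)!$ for $n \ge 1$, which makes the numbers $c_n$ easy to compute. The sketch below (illustrative names) uses this recurrence and cross-checks it against a brute-force count, where a permutation is indecomposable exactly when no proper initial segment $\{1, \ldots, j\}$ is mapped to itself.

```python
from itertools import permutations
from math import factorial

def indecomposable_counts(N):
    """c[n] for 1 <= n <= N from n! = sum_{k=1}^{n} c_k (n-k)!."""
    c = [0] * (N + 1)
    for n in range(1, N + 1):
        c[n] = factorial(n) - sum(c[k] * factorial(n - k) for k in range(1, n))
    return c

def is_indecomposable(p):
    """p is a tuple with p[i] the image of i + 1. The set {1..j} is mapped to
    itself exactly when the largest of the first j images equals j."""
    running_max = 0
    for j, v in enumerate(p[:-1], start=1):
        running_max = max(running_max, v)
        if running_max == j:
            return False
    return True

def brute_force_count(n):
    return sum(is_indecomposable(p) for p in permutations(range(1, n + 1)))
```

Numerically $c_n/n!$ is close to $1 - 2/n$, consistent with taking the first two terms $d_0 = 1$, $d_1 = -2$ of $D(x) = (1 + a(x))^{-2}$ in (7.15).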


Some further useful expansions for functional inverses and computations of for- 
mal power series have been obtained by Bender and Richmond (1984). 
8. Elementary estimates for convergent generating functions


The word "elementary" in the title of this section is a technical term that means the
proofs do not use complex variables. It does not necessarily imply that the proofs
are simple. While some, such as those of section 8.4, are easy, others are more
complicated. The main advantage of elementary methods is that they are much
easier to use, and since they impose much weaker requirements on the generating
functions, they are more widely applicable. Usually they only impose conditions
on the generating function $f(z)$ for $z \in \mathbb{R}^+$.

The main disadvantage of elementary methods is that the estimates they give
tend to be much weaker than those derived using analytic function approaches. It
is easy to explain why that is so by considering the two generating functions

\[ f_1(z) = \sum_{n=0}^{\infty} z^n = (1 - z)^{-1} \tag{8.1} \]
and
\[ f_2(z) = 3/2 + \sum_{n=1}^{\infty} 2 z^{2n} = 3/2 + 2z^2 (1 - z^2)^{-1}. \tag{8.2} \]

Both series converge for $|z| < 1$ and diverge for $|z| > 1$, and both blow up as $z \to 1$.
However,

\[ f_1(z) - f_2(z) = -\frac{1 - z}{2(1 + z)}. \tag{8.3} \]
Thus these two functions behave almost identically near $z = 1$. Since $f_1(z)$ and
$f_2(z)$ are both $\sim (1 - z)^{-1}$ as $z \to 1^-$, $z \in \mathbb{R}^+$, and their difference is $O(|z - 1|)$ for
$z \in \mathbb{R}^+$, it would require exceptionally delicate methods to detect the differences
in the coefficients of the $f_j(z)$ just from their behavior for $z \in \mathbb{R}^+$. There is a substantial
difference in the behavior of $f_1(z)$ and $f_2(z)$ for real $z$ if we let $z \to -1$, so
our argument does not completely exclude the possibility of obtaining detailed information
about the coefficients of these functions using methods of real variables
only. However, if we consider the function

\[ f_3(z) = 2 + \sum_{n=1}^{\infty} 3 z^{3n} = 2 + 3z^3 (1 - z^3)^{-1}, \tag{8.4} \]

then $f_1(z)$ and $f_3(z)$ are both $\sim (1 - z)^{-1}$ as $z \to 1^-$, $z \in \mathbb{R}^+$, yet now
\[ |f_1(z) - f_3(z)| = O(|z - 1|) \quad \text{for all } z \in \mathbb{R}. \]

This difference is comparable to what would be obtained by modifying a single
coefficient of one generating function. To determine how such slight changes in
the behavior of the generating functions affect the behavior of the coefficients we
would need to know much more about the functions if we were to use real-variable
methods. On the other hand, analytic methods, discussed in section 10 and later,
are good at dealing with such problems. They require less precise knowledge of
the behavior of a function on the real line. Instead, they impose weak conditions
on the function in a wider domain, namely that of the complex numbers.

For reasons discussed above, elementary methods cannot be expected to produce
precise estimates of individual coefficients. They often do produce good estimates
of summatory functions of the coefficients, though. In the examples above, we note
that

\[ \sum_{n=1}^{N} [z^n] f_j(z) \sim N \quad \text{as } N \to \infty \tag{8.5} \]

for $1 \le j \le 3$. This holds because the $f_j(z)$ have the same behavior as $z \to 1^-$,
and is part of a more general phenomenon. Good knowledge of the behavior of
the generating function on the real axis combined with weak restrictions on the
coefficients often leads to estimates for the summatory function of the coefficients.
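The three functions of (8.1), (8.2) and (8.4) are simple enough that the claims about them can be checked exactly. The sketch below evaluates them at rational points, confirms (8.3) together with the analogous closed form $f_1(z) - f_3(z) = -(1-z)(1+z)/(1+z+z^2)$ (a one-line computation not displayed in the text), and compares the summatory functions in (8.5).

```python
from fractions import Fraction

def f1(z):
    return 1 / (1 - z)

def f2(z):
    return Fraction(3, 2) + 2 * z ** 2 / (1 - z ** 2)

def f3(z):
    return 2 + 3 * z ** 3 / (1 - z ** 3)

def coeff(j, n):
    """[z^n] f_j(z) for n >= 1 and j in {1, 2, 3}."""
    if j == 1:
        return 1
    return j if n % j == 0 else 0

def partial_sum(j, N):
    """Summatory function sum_{n=1}^{N} [z^n] f_j(z)."""
    return sum(coeff(j, n) for n in range(1, N + 1))
```

Although the individual coefficients of $f_2$ and $f_3$ oscillate between 0 and a positive value, all three summatory functions stay within 2 of $N$.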
There are cases where elementary methods give precise bounds for individual
coefficients. Typically when we wish to estimate $f_n$, with ordinary generating
function $f(z) = \sum f_n z^n$ that converges for $|z| < 1$ but not for $|z| > 1$, we apply the
methods of this section to

$$g_n = f_n - f_{n-1} \quad \text{for } n \ge 1, \qquad g_0 = f_0 \tag{8.6}$$

with generating function

$$g(z) = \sum_{n=0}^{\infty} g_n z^n = (1 - z) f(z). \tag{8.7}$$

Then

$$\sum_{k=0}^{n} g_k = f_n, \tag{8.8}$$

and so estimates of the summatory function of the $g_k$ yield estimates for $f_n$. The
difficulty with this approach is that now $g(z)$ and not $f(z)$ has to satisfy the
hypotheses of the theorems, which requires more knowledge of the $f_n$. For example,
most of the Tauberian theorems apply only to power series with nonnegative
coefficients. Hence to use the differencing trick above to obtain estimates for $f_n$ we
need to know that $f_{n-1} \le f_n$ for all $n$. In some cases (such as that of $f_n = p_n$, the
number of ordinary partitions of $n$) this is easily seen to hold through combinatorial
arguments. In other situations where one might like to apply elementary methods,
though, $f_{n-1} \le f_n$ is either false or else is hard to prove. When that happens, other
methods are required to estimate $f_n$.
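
The differencing trick of (8.6) to (8.8) can be sketched for the partition numbers. The code below is only an illustration: it computes $p(n)$ by a standard dynamic program (not a method taken from the text), forms the differences $g_n$, and checks that they are nonnegative and sum back to $p(n)$. All names are hypothetical.

```python
def partition_numbers(limit):
    """p(0), ..., p(limit) via the usual dynamic program over part sizes."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


LIMIT = 60
p = partition_numbers(LIMIT)

# g_0 = f_0 and g_n = f_n - f_{n-1}, as in (8.6), with f_n = p(n)
g = [p[0]] + [p[n] - p[n - 1] for n in range(1, LIMIT + 1)]

all_nonnegative = all(value >= 0 for value in g)           # needed for Tauberian theorems
recovered = [sum(g[: n + 1]) for n in range(LIMIT + 1)]    # (8.8)
```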


8.1. Simple upper and lower bounds


A trivial upper bound method turns out to be widely applicable in asymptotic 
enumeration, and is surprisingly powerful. It relies on nothing more than the non- 
negativity of the coefficients of a generating function. 
Lemma 8.1. Suppose that $f(z)$ is analytic in $|z| < R$, and that $[z^n]f(z) \ge 0$ for all
$n \ge 0$. Then for any $x$, $0 < x < R$, and any $n \ge 0$,

$$[z^n]f(z) \le x^{-n} f(x). \tag{8.9}$$

Example 8.2 (Lower bound for factorials). Let $f(z) = \exp(z)$. Then Lemma 8.1
yields

$$\frac{1}{n!} = [z^n]e^z \le x^{-n} e^x \tag{8.10}$$

for every $x > 0$. The logarithm of $x^{-n}e^x$ is $x - n\log x$, and differentiating and
setting it equal to 0 shows that the minimum value is attained at $x = n$. Therefore

$$\frac{1}{n!} = [z^n]e^z \le n^{-n} e^n, \tag{8.11}$$

and so $n! \ge n^n e^{-n}$. This lower bound holds uniformly for all $n$, and is off only by
an asymptotic factor of $(2\pi n)^{1/2}$ from Stirling's formula (4.1). ∎
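
A small numerical check of Example 8.2, given as a sketch only: it evaluates the bound $x^{-n}e^x$ at the optimal point $x = n$ and at two other convenient points, and compares the loss at $x = n$ with $(2\pi n)^{1/2}$. The helper name is hypothetical.

```python
import math


def lemma_bound_for_exp(n, x):
    """The upper bound x^{-n} e^x on 1/n! from Lemma 8.1 with f(z) = exp(z)."""
    return math.exp(x - n * math.log(x))


n = 50
inverse_factorial = 1 / math.factorial(n)

best = lemma_bound_for_exp(n, n)                                # x = n minimizes the bound
others = [lemma_bound_for_exp(n, x) for x in (n / 2, 2 * n)]    # still valid, but weaker

# How far the best bound is from the truth, compared with sqrt(2 pi n)
ratio = best / inverse_factorial
stirling_factor = math.sqrt(2 * math.pi * n)
```

Any $x > 0$ gives a valid bound; only the quality of the bound depends on the choice.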


Suppose that $f(z) = \sum f_n z^n$. Lemma 8.1 is proved by noting that for $0 < x < R$,
the $n$th term, $f_n x^n$, in the power series expansion of $f(x)$, is $\le f(x)$. As we will see in
section 10, it is often possible to derive a similar bound on the coefficients $f_n$ even
without assuming that they are nonnegative. However, the proof of Lemma 8.1
shows something more, namely that

$$f_0 x^{-n} + f_1 x^{-n+1} + \cdots + f_{n-1} x^{-1} + f_n \le x^{-n} f(x) \tag{8.12}$$

for $0 < x < R$. When $x \le 1$, this yields an upper bound for the summatory function
of the coefficients. Because (8.12) holds, we see that the bound of Lemma 8.1
cannot be sharp in general. What is remarkable is that the estimates obtainable
from that lemma are often not far from best possible.


Example 8.3 (Upper bound for the partition function). Let $p(n)$ denote the
partition function. It has the ordinary generating function

$$f(z) = \sum_{n=0}^{\infty} p(n) z^n = \prod_{k=1}^{\infty} (1 - z^k)^{-1}. \tag{8.13}$$

Let $g(s) = \log f(e^{-s})$, and consider $s > 0$, $s \to 0$. There are extremely accurate
estimates of $g(s)$. It is known (Andrews 1976, Ayoub 1963), for example, that

$$g(s) = \pi^2/(6s) + (\log s)/2 - (\log 2\pi)/2 - s/24 + O(\exp(-4\pi^2/s)). \tag{8.14}$$

If we use (8.14), we find that $x^{-n}f(x)$ is minimized at $x = \exp(-s)$ with

$$s = \pi/(6n)^{1/2} - 1/(4n) + O(n^{-3/2}), \tag{8.15}$$

which yields

$$p(1) + p(2) + \cdots + p(n) \le 2^{-3/4} e^{-1/4} n^{-1/4} (1 + o(1)) \exp(2\pi 6^{-1/2} n^{1/2}). \tag{8.16}$$

Comparing this to the asymptotic formula for the sum that is obtainable from (1.6)
(see Example 5.2), we see that the bound of (8.16) is too high by a factor of $n^{1/4}$.
If we use (8.16) to bound $p(n)$ alone, we obtain a bound that is too large by a
factor of $n^{3/4}$.


The application of Lemma 8.1 outlined above depended on the expansion (8.14),
which is complicated to derive, involving modular transformation properties of
p(n) that are beyond the scope of this survey. [See Andrews (1976) or Ayoub
(1963) for derivations.] Weaker estimates that are still useful are much easier to
derive. We obtain one such bound here, since the arguments illustrate some of the
methods from the preceding sections.
Consider

$$g(s) = \sum_{k=1}^{\infty} -\log(1 - e^{-ks}). \tag{8.17}$$

If we replace the sum by the integral

$$I(s) = \int_1^{\infty} -\log(1 - e^{-us})\,du, \tag{8.18}$$

we find on expanding the logarithm that

$$I(s) = \int_1^{\infty} \left( \sum_{m=1}^{\infty} m^{-1} e^{-mus} \right) du = s^{-1} \sum_{m=1}^{\infty} m^{-2} e^{-ms}, \tag{8.19}$$

since the interchange of summation and integration is easy to justify, as all the
terms are positive. Therefore as $s \to 0^+$,

$$sI(s) \to \sum_{m=1}^{\infty} m^{-2} = \pi^2/6, \tag{8.20}$$

so that $I(s) \sim \pi^2/(6s)$ as $s \to 0^+$. It remains to show that $I(s)$ is indeed a good
approximation to $g(s)$. This follows easily from the bound (5.32), since it shows that

$$g(s) = I(s) + O\left( \int_1^{\infty} \frac{s e^{-vs}}{1 - e^{-vs}}\,dv \right). \tag{8.21}$$


We could estimate the integral in (8.21) carefully, but we only need rough upper
bounds for it, so we write it as

$$\int_1^{\infty} \frac{s e^{-vs}}{1 - e^{-vs}}\,dv = \int_s^{\infty} \frac{du}{e^u - 1} \le \int_s^{1} \frac{du}{u} + \int_1^{\infty} \frac{du}{e^u - 1} = -\log s + c \tag{8.22}$$

for some constant $c$. Thus we find that

$$g(s) = I(s) + O(\log(s^{-1})) \quad \text{as } s \to 0^+. \tag{8.23}$$

Combining (8.23) with (8.20) we see that

$$g(s) \sim \pi^2/(6s) \quad \text{as } s \to 0^+. \tag{8.24}$$

Therefore, choosing $s = \pi/(6n)^{1/2}$, $x = \exp(-s)$ in Lemma 8.1, we obtain a bound
of the form

$$p(n) \le \exp((1 + o(1))\pi(2/3)^{1/2} n^{1/2}) \quad \text{as } n \to \infty. \tag{8.25}$$

∎
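
The bound of Example 8.3 can be checked numerically. The sketch below is an illustration under stated assumptions: it truncates the sum (8.17) at a fixed number of terms, takes $s = \pi/(6n)^{1/2}$, and compares $\log(x^{-n}f(x))$ with the logarithm of the exact partial sum $p(0) + \cdots + p(n)$ computed by a standard dynamic program. The function names and the truncation length are hypothetical choices.

```python
import math


def partition_numbers(limit):
    """p(0), ..., p(limit) via the usual dynamic program over part sizes."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


def log_generating_function(s, terms=20000):
    """g(s) = -sum_{k>=1} log(1 - e^{-ks}) from (8.17), truncated after `terms` terms."""
    return -sum(math.log1p(-math.exp(-k * s)) for k in range(1, terms + 1))


n = 400
s = math.pi / math.sqrt(6 * n)
log_bound = n * s + log_generating_function(s)   # log(x^{-n} f(x)) with x = e^{-s}

p = partition_numbers(n)
log_sum = math.log(sum(p))                       # log(p(0) + ... + p(n))
leading = math.pi * math.sqrt(2 * n / 3)         # exponent in (8.25)
```

By (8.12) the same quantity bounds the whole partial sum, not only $p(n)$, which is why it cannot be sharp for the individual coefficient.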


Lemma 8.1 yields a lower bound for $n!$ that is only a factor of about $n^{1/2}$ away
from optimal. That is common. Usually, when the function $f(z)$ is reasonably
smooth, the best bound obtainable from Lemma 8.1 will only be off from the
correct value by a polynomial factor of $n$, and often only by a factor of $n^{1/2}$.

The estimate of Lemma 8.1 can often be improved with some additional
knowledge about the $f_n$. For example, if $f_{n+1} \ge f_n$ for all $n \ge 0$, then we have

$$x^{-n} f(x) \ge f_n + f_{n+1}x + f_{n+2}x^2 + \cdots \ge f_n (1 - x)^{-1}. \tag{8.26}$$

For $f_n = p(n)$, the partition function, this yields an upper bound for $p(n)$ that is
too large by a factor of $n^{1/4}$.

To optimize the bound of Lemma 8.1, one should choose $x \in (0, R)$ carefully.
Usually there is a single best choice. In some pathological cases the optimal choice
is obtained by letting $x \to 0^+$ or $x \to R^-$. However, usually we have
$\lim_{x \to R^-} f(x) = \infty$, and $[z^m]f(z) > 0$ for some $m$ with $0 \le m < n$ as well as for
some $m > n$. Under these conditions it is easy to see that

$$\lim_{x \to 0^+} x^{-n} f(x) = \lim_{x \to R^-} x^{-n} f(x) = \infty. \tag{8.27}$$

Thus it does not pay to make $x$ too small or too large. Let us now consider

$$g(x) = \log(x^{-n} f(x)) = \log f(x) - n \log x. \tag{8.28}$$

Then

$$g'(x) = \frac{f'}{f}(x) - \frac{n}{x}, \tag{8.29}$$

and the optimal choice must be at a point where $g'(x) = 0$. For most commonly
encountered functions $f(x)$, there exists a constant $x_0 > 0$ such that

$$\left( \frac{f'}{f} \right)'(x) > 0 \tag{8.30}$$

for $x_0 < x < R$, and so $g''(x) > 0$ for all $x \in (0, R)$ if $n$ is large enough. For such
$n$ there is then a unique choice of $x$ that minimizes the bound of Lemma 8.1.
However, one major advantage of Lemma 8.1 is that its bound holds for all $x$. To
apply this lemma, one can use any $x$ that is convenient to work with. Usually if
this choice is not too far from the optimal one, the resulting bound is fairly good.

We have already remarked above that the bound of Lemma 8.1 is usually close
to best possible. It is possible to prove general lower bounds that show this for a
wide class of functions. The method, originated in Mazo and Odlyzko (1990) and
developed in Odlyzko (1992), relies on simple elementary arguments. However,
the lower bounds it produces are substantially weaker than the upper bounds of
Lemma 8.1. Furthermore, to apply them it is necessary to estimate accurately the
minimum of $x^{-n}f(x)$, instead of selecting any convenient values of $x$. A more
general version of the bound below is given in Odlyzko (1992).


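Theorem 8.4. Suppose that $f(x) = \sum f_n x^n$ converges for $|x| < 1$, $f_n \ge 0$ for all $n$,
$f_{m_0} > 0$ for some $m_0$, and $\sum f_n = \infty$. Then for $n > m_0$, there is a unique $x_0 = x_0(n) \in (0, 1)$
that minimizes $x^{-n} f(x)$. Let $s_0 = -\log x_0$, and

$$A = \left. \frac{\partial^2}{\partial s^2} \log f(e^{-s}) \right|_{s = s_0}. \tag{8.31}$$

If $A \ge 10^6$ and for all $t$ with

$$s_0 \le t \le s_0 + 20A^{-1/2} \tag{8.32}$$

we have

$$\left| \left. \frac{\partial^3}{\partial s^3} \log f(e^{-s}) \right|_{s = t} \right| \le 10^{-3} A^{3/2}, \tag{8.33}$$

then

$$\sum_{k=0}^{n} f_k \ge x_0^{-n} f(x_0) \exp(-30 s_0 A^{1/2} - 100). \tag{8.34}$$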


As is usual for Tauberian theorems, Theorem 8.4 only provides bounds on the 
sum of coefficients of f(z). As we mentioned before, this is unavoidable when one 
relies only on information about the behavior of f(z) for z a positive real number. 
The conditions that Theorem 8.4 imposes on the derivatives are usually satisfied 
in combinatorial enumeration applications and are easy to verify. 


Example 8.5 (Lower bound for the partition function). Let $f(z)$ and $g(s)$ be as in
Example 8.3. We showed there that $g(s)$ satisfies (8.24) and similar rough estimates
show that $g'(s) \sim -\pi^2/(6s^2)$, $g''(s) \sim \pi^2/(3s^3)$, and $g'''(s) \sim -\pi^2/s^4$ as $s \to 0^+$.
Therefore the hypotheses of Theorem 8.4 are satisfied, and we obtain a lower
bound for $p(0) + p(1) + \cdots + p(n)$. If we only use the estimate (8.24) for $g(s)$, then
we can only conclude that for $x = e^{-s}$,

$$\log(x^{-n} f(x)) = ns + g(s) \sim ns + \pi^2/(6s) \quad \text{as } s \to 0, \tag{8.35}$$

and so the minimum value occurs at $s \sim \pi/(6n)^{1/2}$ as $n \to \infty$. This only allows us
to conclude that for every $\epsilon > 0$ and $n$ large enough,

$$\log(p(0) + \cdots + p(n)) \ge (1 - \epsilon)\pi(2/3)^{1/2} n^{1/2}. \tag{8.36}$$

However, we can also conclude even without further computations that this lower
bound will be within a multiplicative factor of $\exp(cn^{1/4})$ of the best upper bound
that can be obtained from Lemma 8.1 for some $c > 0$ (and therefore within a
multiplicative factor of $\exp(cn^{1/4})$ of the correct value). In particular, if we use the
estimate (8.14) for $g(s)$, we find that for some $c' > 0$,

$$p(0) + \cdots + p(n) \ge \exp(\pi(2/3)^{1/2} n^{1/2} - c'n^{1/4}). \tag{8.37}$$

Since $p(k) \le p(k+1)$, the quantity on the right-hand side of (8.37) is also a lower
bound for $p(n)$ if we increase $c'$, since $(n + 1)p(n) \ge p(0) + \cdots + p(n)$. ∎


The differencing trick described at the introduction to section 8 could also be
used to estimate $p(n)$, since Theorem 8.4 can be applied to the generating
function of $p(n + 1) - p(n)$. However, since the error term is a multiplicative factor of
$\exp(cn^{1/4})$, it is simpler to use the approach above, which bounds $p(n)$ below by
$(p(0) + \cdots + p(n))/(n + 1)$.

Brigham (1950) has proved a general theorem about asymptotics of partition
functions that can be derived from Theorem 8.4. [For other results and references
for partition asymptotics, see Andrews (1976), Ayoub (1963), and Fristedt (1993).]


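Theorem 8.6. Suppose that

$$f(z) = \prod_{k=1}^{\infty} (1 - z^k)^{-b(k)} = \sum_{n=0}^{\infty} a(n) z^n, \tag{8.38}$$

where the $b(k) \in \mathbb{Z}$, $b(k) \ge 0$ for all $k$, and that for some $C > 0$, $u > 0$, we have

$$\sum_{k \le x} b(k) \sim C x^u (\log x)^v \quad \text{as } x \to \infty. \tag{8.39}$$

Then

$$\log\left( \sum_{n \le m} a(n) \right) \sim u^{-1} \{C u \Gamma(u+2) \zeta(u+1)\}^{1/(u+1)} (u+1)^{(u-v)/(u+1)} m^{u/(u+1)} (\log m)^{v/(u+1)} \tag{8.40}$$

as $m \to \infty$.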


If $b(k) = 1$ for all $k$, $a(n)$ is $p_n$, the ordinary partition function. If $b(k) = k$ for
all $k$, $a(n)$ is the number of plane partitions of $n$. Thus Brigham's theorem covers
a wide class of interesting partition functions. The cost of this generality is that
we obtain only the asymptotics of the logarithm of the summatory function of
the partitions being enumerated. [For better estimates of the number of plane
partitions, for example, see Almkvist (1993), Gordon and Houten (1969), and
Wright (1931). For ordinary partitions, we have the expansion (1.3).]
Brigham's proof of Theorem 8.6 first shows that

$$\log f(e^{-w}) \sim C w^{-u} (-\log w)^{v}\, \Gamma(u+1) \zeta(u+1) \quad \text{as } w \to 0^+ \tag{8.41}$$

and then invokes the Hardy-Ramanujan Tauberian theorem (Rademacher 1937).
Instead, one can obtain a proof from Theorem 8.4. The advantage of using
Theorem 8.4 is that it is much easier to generalize. Hardy and Ramanujan proved their
Tauberian theorem only for functions whose growth rates are of the form given by
(8.41). Their approach can be extended to other functions, but this is complicated
to do. In contrast, Theorem 8.4 is easy to apply. The conditions of Theorem 8.4 on
the derivatives are not restrictive. For a function $f(z)$ defined by (8.38) we have
$A \to \infty$ as $n \to \infty$ if $\sum b(k) = \infty$, and the condition (8.33) can be shown to hold whenever
there are constants $c_1$ and $c_2$ such that for all $w > 1$, and all sufficiently large $m$,

$$\sum_{k \le mw} b(k) \le c_1 w^{c_2} \sum_{k \le m} b(k), \tag{8.42}$$

say. The main difficulty in applying Theorem 8.4 to generalizations of Brigham's
theorem is in accurately estimating the minimal value in Lemma 8.1.

There are many other applications of Lemma 8.1 and Theorem 8.4. For example, 
they can be used to prove the results of Gardy and Solé (1992) on volumes of 
spheres in the Lee metric. 


Lemma 8.1 can be generalized in a straightforward way to multivariate
generating functions. If

$$f(x, y) = \sum_{m,n \ge 0} a_{m,n} x^m y^n \tag{8.43}$$

and $a_{m,n} \ge 0$ for all $m$ and $n$, then for any $x, y > 0$ for which the sum in (8.43)
converges we have

$$a_{m,n} \le x^{-m} y^{-n} f(x, y). \tag{8.44}$$

Generalizations of the lower bound of Theorem 8.4 to multivariate functions can
also be derived, but are again harder than the upper bound (Moews 1995).


8.2. Tauberian theorems 


The Brigham Tauberian theorem for partitions (Brigham 1950), based on the 
Hardy—Ramanujan Tauberian theorem (Rademacher 1937), was quoted above in 
section 8.1. It applies to certain generating functions that have (in notation to 
be introduced in section 10) a large singularity and gives estimates only for the 
logarithm of the summatory function of the coefficients. Another theorem that is 
often more precise, but is again designed to deal with rapidly growing partition 
functions, is that of Ingham (1941), and will be discussed at the end of this section. 
Most of the Tauberian theorems in the literature apply to functions with small 
singularities (i.e., ones that do not grow rapidly as the argument approaches the 
circle of convergence) and give asymptotic relations for the sum of coefficients.
References for Tauberian theorems are Feller (1968, 1971), Ganelius (1971), Hardy
(1949), Ingham (1941), and Postnikov (1980). Their main advantage is generality 
and ease of use, as is shown by the applications made to 0-1 laws in Compton 
(1987, 1988, 1989). They can often be applied when the information about gen- 
erating functions is insufficient to use the methods of sections 11 and 12. This is 
especially true when the circle inside which the generating function converges is a 
natural boundary beyond which the function cannot be continued. 

One Tauberian theorem that is often used in combinatorial enumeration is that
of Hardy, Littlewood, and Karamata. We say a function $L(t)$ varies slowly at infinity
if, for every $u > 0$, $L(ut) \sim L(t)$ as $t \to \infty$.


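Theorem 8.7. Suppose that $a_k \ge 0$ for all $k$, and that

$$f(x) = \sum_{k=0}^{\infty} a_k x^k$$

converges for $0 \le x < r$. If there is a $\rho \ge 0$ and a function $L(t)$ that varies slowly at
infinity such that

$$f(x) \sim (r - x)^{-\rho} L\left( \frac{1}{r - x} \right) \quad \text{as } x \to r^-, \tag{8.45}$$

then

$$\sum_{k=0}^{n} a_k r^k \sim (n/r)^{\rho} L(n)/\Gamma(\rho + 1) \quad \text{as } n \to \infty. \tag{8.46}$$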
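
A numerical sanity check of (8.46) in the simplest setting, offered as a sketch under assumptions: take $r = 1$, $L(t) = 1$, and the test function $f(x) = (1 - x)^{-\rho}$, whose coefficients satisfy $a_k = a_{k-1}(k - 1 + \rho)/k$. This test function and the helper name are illustrative choices, not part of the theorem.

```python
import math


def partial_sum_of_coefficients(rho, n):
    """Sum of [x^k] (1 - x)^{-rho} over 0 <= k <= n."""
    a_k, total = 1.0, 1.0
    for k in range(1, n + 1):
        a_k *= (k - 1 + rho) / k       # a_k = a_{k-1} (k - 1 + rho) / k
        total += a_k
    return total


n = 10_000
ratios = {}
for rho in (0.5, 2):
    predicted = n**rho / math.gamma(rho + 1)    # right side of (8.46) with r = 1, L = 1
    ratios[rho] = partial_sum_of_coefficients(rho, n) / predicted
```

The theorem gives no error term, so the closeness of these ratios to 1 is a feature of this smooth test function and should not be expected in general.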


Example 8.8 (Cycles of permutations, Bender 1974). If $S$ is a set of positive
integers, and $f_n$ the probability that a random permutation on $n$ letters will have all
cycle lengths in $S$ (i.e., $f_n = a_n/n!$, where $a_n$ is the number of permutations with
cycle lengths in $S$), then

$$f(z) = \sum_{n=0}^{\infty} f_n z^n = \prod_{k \in S} \exp(z^k/k) = (1 - z)^{-1} \prod_{k \notin S} \exp(-z^k/k). \tag{8.47}$$

If $|\mathbb{Z}^+ \setminus S| < \infty$, then the methods of sections 10.2 and 11 apply easily, and one
finds that

$$f_n \sim \exp\left( -\sum_{k \notin S} 1/k \right) \quad \text{as } n \to \infty. \tag{8.48}$$

This estimate can also be proved to apply for $|\mathbb{Z}^+ \setminus S| = \infty$, provided
$|\{1, \ldots, m\} \setminus S|$ does not grow too rapidly when $m \to \infty$. If $|S| < \infty$ (or when
$|\{1, \ldots, m\} \cap S|$ does not grow rapidly), the methods of section 12 apply. When
$S = \{1, 2\}$, one obtains, for example, the result of Moser and Wyman (1955) that the
number of permutations of order 2 is

$$\sim (n/e)^{n/2} 2^{-1/2} \exp(n^{1/2} - 1/4) \quad \text{as } n \to \infty. \tag{8.49}$$

[For sharper and more general results, see Moser and Wyman (1955) and Wilf
(1986).] The methods used in these cases are different from the ones we are
considering in this section.

We now consider an intermediate case, with

$$|\{1, \ldots, m\} \cap S| \sim \rho m \quad \text{as } m \to \infty, \tag{8.50}$$

for some fixed $\rho$, $0 < \rho \le 1$. This case can be handled by Tauberian techniques.
To apply Theorem 8.7, we need to show that $L(t) = f(1 - t^{-1})t^{-\rho}$ varies slowly at
infinity. This is equivalent to showing that for any $u \in (0, 1)$,

$$f(1 - t^{-1}) \sim f(1 - t^{-1}u)\,u^{\rho} \quad \text{as } t \to \infty. \tag{8.51}$$

Because of (8.47), it suffices to prove that

$$\sum_{k \in S} k^{-1} \{(1 - t^{-1})^k - (1 - t^{-1}u)^k\} = \rho \log u + o(1) \quad \text{as } t \to \infty, \tag{8.52}$$

but this is easy to deduce from (8.50) using summation by parts (section 5).
Therefore we find from Theorem 8.7 that

$$\sum_{k=0}^{n} f_k \sim f(1 - 1/n)/\Gamma(\rho + 1) \quad \text{as } n \to \infty. \tag{8.53}$$

[For additional results and references on this problem see Pavlov (1992).] ∎
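
The estimates (8.48) and (8.53) can be checked numerically. The sketch below is illustrative only: differentiating (8.47) gives the recurrence $n f_n = \sum_{k \in S,\, k \le n} f_{n-k}$, which is used to compute the $f_n$. Two test sets are assumed here: $S = \{2, 3, \ldots\}$ (derangements), and $S$ the odd integers, which has $\rho = 1/2$ and $f(z) = ((1 + z)/(1 - z))^{1/2}$. The function name and the test sets are hypothetical choices.

```python
import math


def cycle_probabilities(in_S, limit):
    """f_0, ..., f_limit, where f_n = P(all cycle lengths of a random n-permutation lie in S).

    Uses n f_n = sum_{k in S, k <= n} f_{n-k}, obtained by differentiating (8.47).
    """
    f = [1.0] + [0.0] * limit
    for n in range(1, limit + 1):
        f[n] = sum(f[n - k] for k in range(1, n + 1) if in_S(k)) / n
    return f


# Finite complement: S = {2, 3, ...}; (8.48) predicts the limit exp(-1).
derangement = cycle_probabilities(lambda k: k >= 2, 30)

# Density rho = 1/2: S = odd integers; compare with (8.53).
n = 1000
odd = cycle_probabilities(lambda k: k % 2 == 1, n)
f_at_point = math.sqrt((2 - 1 / n) * n)          # f(1 - 1/n) for f = ((1+z)/(1-z))^{1/2}
predicted = f_at_point / math.gamma(1.5)         # f(1 - 1/n) / Gamma(rho + 1)
ratio = sum(odd) / predicted
```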


As the above example shows, Tauberian theorems yield estimates under weak 
assumptions. These theorems do have some disadvantages. Not only do they usu- 
ally estimate only the summatory function of the coefficients, but they normally 
give no bounds for the error term. [See Ganelius (1971) for some Tauberian the- 
orems with remainder terms.] Furthermore, they usually apply only to functions 
with nonnegative coefficients. Sometimes, as in the following theorem of Hardy
and Littlewood, one can relax the nonnegativity condition slightly. 


Theorem 8.9. Suppose that $a_k \ge -c/k$ for some $c > 0$,

$$f(x) = \sum_{k=1}^{\infty} a_k x^k, \tag{8.54}$$

and that $f(x)$ converges for $0 < x < 1$, and that

$$\lim_{x \to 1^-} f(x) = A. \tag{8.55}$$

Then

$$\lim_{n \to \infty} \sum_{k=1}^{n} a_k = A. \tag{8.56}$$



Some condition such as $a_k \ge -c/k$ on the $a_k$ is necessary, or otherwise the
theorem would not hold. For example, the function

$$f(x) = \frac{1 - x}{1 + x} = 1 - 2x + 2x^2 - \cdots \tag{8.57}$$

satisfies (8.55) with $A = 0$, but (8.56) fails.

We next present an example that shows an application of the above results in
combination with other asymptotic methods that were presented before.


Example 8.10 (Permutations with distinct cycle lengths). The probability that a
random permutation on $n$ letters will have cycles of distinct lengths is $[z^n]f(z)$, where

$$f(z) = \prod_{k=1}^{\infty} \left( 1 + \frac{z^k}{k} \right). \tag{8.58}$$

Greene and Knuth (1982) note that this is also the limit as $p \to \infty$ of the
probability that a polynomial of degree $n$ factors into irreducible polynomials of distinct
degrees modulo a prime $p$. It is shown in Greene and Knuth (1982) that

$$[z^n]f(z) = e^{-\gamma}(1 + n^{-1}) + O(n^{-2} \log n) \quad \text{as } n \to \infty, \tag{8.59}$$

where $\gamma = 0.577\ldots$ is Euler's constant. A simplified version of the argument of
Greene and Knuth (1982) will be presented that shows that

$$[z^n]f(z) \sim e^{-\gamma} \quad \text{as } n \to \infty. \tag{8.60}$$

Methods for obtaining better expansions, even more precise than that of (8.59),
are discussed in section 11.2. For related results obtained by probabilistic methods,
see Arratia and Tavaré (1992b).

We have, for $|z| < 1$,

$$f(z) = (1 + z)\exp\left( \sum_{k=2}^{\infty} \log(1 + z^k/k) \right) = (1 + z)\exp\left( \sum_{k=1}^{\infty} z^k/k + g(z) \right) = (1 + z)(1 - z)^{-1}\exp(g(z)), \tag{8.61}$$

where

$$g(z) = -z + \sum_{k=2}^{\infty} \sum_{m=2}^{\infty} \frac{(-1)^{m-1} z^{mk}}{m k^m}. \tag{8.62}$$

Since the coefficients of $g(z)$ are small, the double sum in (8.62) converges for
$z = 1$, and we have

$$g(1) = \lim_{z \to 1^-} g(z) = -1 + \sum_{k=2}^{\infty} \sum_{m=2}^{\infty} \frac{(-1)^{m-1}}{m k^m} = -1 + \sum_{k=2}^{\infty} \{\log(1 + k^{-1}) - k^{-1}\} = -\log 2 + \lim_{n \to \infty} (\log(n + 1) - H_n) = -\log 2 - \gamma, \tag{8.63}$$

where $H_n = 1 + 1/2 + 1/3 + \cdots + 1/n$ is the $n$th harmonic number. Therefore, by
(8.61), we find from Theorem 8.7 that if $f_n = [z^n]f(z)$, then

$$f_0 + f_1 + \cdots + f_n \sim n e^{-\gamma} \quad \text{as } n \to \infty. \tag{8.64}$$

To obtain asymptotics of $f_n$, we note that if $h_n = [z^n]\exp(g(z))$, then by (8.61),

$$f_n = 2h_0 + 2h_1 + \cdots + 2h_{n-1} + h_n. \tag{8.65}$$

We next obtain an upper bound for $|h_n|$. There are several ways to proceed. The
method used below gives the best possible result $|h_n| = O(n^{-2})$.

Since $g(z)$ has the power-series expansion (8.62), and $h_n = [z^n]\exp(g(z))$,
comparison of terms in the full expansion of $\exp(g(z))$ and $\exp(v(z))$ shows that
$|h_n| \le [z^n]\exp(v(z))$, where $v(z)$ is any power series such that $|[z^n]g(z)| \le [z^n]v(z)$.
For $n \ge 2$,

$$[z^n]g(z) = \sum_{\substack{m \mid n \\ m \ge 2,\ m < n}} \frac{(-1)^{m-1}}{m} \left( \frac{m}{n} \right)^m. \tag{8.66}$$

The term $(m/n)^m$ is monotone decreasing for $1 \le m \le n/e$, since its derivative
with respect to $m$ is $< 0$ in that range. Therefore

$$|[z^n]g(z)| \le \frac{1}{2}\left( \frac{2}{n} \right)^2 + \sum_{3 \le m \le n/3} \frac{1}{m}\left( \frac{3}{n} \right)^3 + \frac{2}{n}\,2^{-n/2} \le 10n^{-2}, \tag{8.67}$$

say. Hence we can take

$$v(z) = 10 \sum_{n=1}^{\infty} n^{-2} z^n, \tag{8.68}$$

and then we need to estimate

$$w_n = [z^n]\exp(v(z)). \tag{8.69}$$
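We let $w(z) = \exp(v(z))$, and note that

$$w'(z) = v'(z)\,w(z), \tag{8.70}$$

so for $n \ge 1$,

$$n w_n = 10 \sum_{k=0}^{n-1} w_k (n - k)^{-1}. \tag{8.71}$$

Further, since $v(1) < \infty$, and $w_n \ge 0$ for all $n$, we have $w_n \le A = w(1) = \exp(v(1))$
for all $n$. Let $B = 10^6 A$ and note that $w_n \le Bn^{-2}$ for $1 \le n \le 10^3$. Suppose now
that $w_m \le Bm^{-2}$ for $1 \le m < n$ for some $n > 10^3$. We will prove that $w_n \le Bn^{-2}$,
and then by induction this inequality will hold for all $n \ge 1$. We apply eq. (8.71).
For $0 \le k < 100$, we use $w_k \le A$, $(n - k)^{-1} \le 2n^{-1}$. For $100 \le k \le n/2$,

$$w_k (n - k)^{-1} \le Bk^{-2}(n - k)^{-1} \le 2Bk^{-2}n^{-1}, \tag{8.72}$$

and so

$$\sum_{100 \le k \le n/2} w_k (n - k)^{-1} \le B(40n)^{-1}. \tag{8.73}$$

Finally,

$$\sum_{n/2 < k \le n-1} w_k (n - k)^{-1} \le 4Bn^{-2} \sum_{n/2 < k \le n-1} (n - k)^{-1} \le 4Bn^{-2}H_n. \tag{8.74}$$

Therefore, by (8.71),

$$n w_n \le 2000An^{-1} + B(4n)^{-1} + 40BH_n n^{-2} \le Bn^{-1}, \tag{8.75}$$

which completes the induction step and proves that $w_n \le Bn^{-2}$ for all $n \ge 1$.
Since $|h_n| \le w_n = O(n^{-2})$, the series $\sum h_k$ converges absolutely to
$\exp(g(1)) = e^{-\gamma}/2$, and (8.65) gives $f_n = 2\sum_{k=0}^{\infty} h_k + O(n^{-1}) = e^{-\gamma} + O(n^{-1})$,
which is (8.60). ∎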
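
The limit in Example 8.10 is easy to observe numerically. The following sketch, given for illustration only, multiplies out the product (8.58) truncated at degree 200 in floating point and compares the last coefficient with $e^{-\gamma}$ and with the leading terms of (8.59). The function name is hypothetical, and Euler's constant is hard-coded because the Python standard library does not provide it.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, hard-coded


def distinct_cycle_probabilities(limit):
    """Coefficients [z^n] prod_{k>=1} (1 + z^k / k) for 0 <= n <= limit."""
    coeffs = [1.0] + [0.0] * limit
    for k in range(1, limit + 1):
        # Multiply in place by (1 + z^k / k); descending n uses each factor once.
        for n in range(limit, k - 1, -1):
            coeffs[n] += coeffs[n - k] / k
    return coeffs


f = distinct_cycle_probabilities(200)
limit_value = math.exp(-EULER_GAMMA)
with_correction = limit_value * (1 + 1 / 200)   # leading terms of (8.59) at n = 200
```

For small $n$ the coefficients can be checked by hand: of the six permutations of three letters, five have cycles of distinct lengths.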


There are Tauberian theorems that apply to generating functions with rapidly
growing coefficients but are more precise than Brigham’s theorem or the estimates 
obtainable with the methods of section 8.1. One of the most useful is Ingham’s 
Tauberian theorem for partitions (Ingham 1941). The following result is a corollary 
of the more general Theorem 2 of Ingham (1941). 


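Theorem 8.11. Let $1 \le u_1 < u_2 < \cdots$ be positive integers such that

$$|\{u_j : u_j \le x\}| = Bx^{\beta} + R(x), \tag{8.76}$$

where $B > 0$, $\beta > 0$, and

$$\int_0^y x^{-1} R(x)\,dx = b \log y + c + o(1) \quad \text{as } y \to \infty. \tag{8.77}$$

Let

$$a(z) = \sum_{n=0}^{\infty} a_n z^n = \prod_{j=1}^{\infty} (1 - z^{u_j})^{-1}, \tag{8.78}$$

$$a^*(z) = \sum_{n=0}^{\infty} a_n^* z^n = \prod_{j=1}^{\infty} (1 + z^{u_j}). \tag{8.79}$$

Then

$$\sum_{n=1}^{m} a_n \sim (2\pi)^{-1/2}(1 - \alpha)^{1/2} e^{c} M^{-\alpha(b+1/2)} m^{(b+1/2)(1-\alpha)-1/2} \exp(\alpha^{-1}(Mm)^{\alpha}), \tag{8.80}$$

$$\sum_{n=1}^{m} a_n^* \sim (2\pi)^{-1/2}(1 - \alpha)^{1/2} 2^{b} (M^*m)^{-\alpha/2} \exp(\alpha^{-1}(M^*m)^{\alpha}), \tag{8.81}$$

where

$$\alpha = \beta(\beta + 1)^{-1}, \quad M = \{B\beta\Gamma(\beta + 1)\zeta(\beta + 1)\}^{1/\beta}, \quad M^* = (1 - 2^{-\beta})^{1/\beta} M. \tag{8.82}$$

If $u_1 = 1$, then as $n \to \infty$

$$a_n \sim (2\pi)^{-1/2}(1 - \alpha)^{1/2} e^{c} M^{-\alpha(b-1/2)} n^{(b-1/2)(1-\alpha)-1/2} \exp(\alpha^{-1}(Mn)^{\alpha}), \tag{8.83}$$

and if $1, 2, 4, 8, \ldots$ all belong to $\{u_j\}$, then

$$a_n^* \sim (2\pi)^{-1/2}(1 - \alpha)^{1/2} 2^{b} (M^*)^{\alpha/2} n^{\alpha/2 - 1} \exp(\alpha^{-1}(M^*n)^{\alpha}). \tag{8.84}$$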

Theorem 8.11 provides more precise information than Brigham’s Theorem 8.6, 
but under more restrictive conditions. It is derived from Ingham’s main result, 
Theorem 1 of Ingham (1941), which can be applied to wider classes of functions. 
However, that theorem cannot be used to derive Theorem 8.6. The disadvantage 
of Ingham’s main theorem is that it requires knowledge of the behavior of the 
generating function in the complex plane, not just on the real axis. On the other 
hand, the region where this behavior has to be known is much smaller than it is 
for the analytic methods that give more accurate answers, and which are presented 
in sections 10-12. Only behavior of the generating functions $\prod(1 - z^{u_j})^{-1}$ or
$\prod(1 + z^{u_j})$ in an angle $|\operatorname{Arg}(1 - z)| \le \pi/2 - \delta$ for some $\delta > 0$ needs to be controlled.

Ingham’s paper (1941) contains an extended discussion of the relations between 
different Tauberian theorems and of the necessity for various conditions. 


9. Recurrences 


This section presents some basic methods for handling recurrences. The title is 
slightly misleading, since almost all of this chapter is devoted to methods that are 
useful in this area. Almost all asymptotic estimation problems concern quantities 
that are defined through implicit or explicit recurrences. Furthermore, the most 
common and most effective method of solving recurrences is often to determine 
their generating functions and then apply the methods presented in the other 
sections. However, there are many recurrences, and those discussed in sections 9.4
and 9.5 require special methods that do not fit into other sections. These methods 
deserve to be included, so it seems preferable to explain them after treating some 
of the more common types of recurrences, even though those could have been 
covered elsewhere in this chapter. 

Since generating functions are the most powerful tool for handling combinatorial 
recurrences, all the books listed in section 18 that help in dealing with combina- 
torial identities and generating functions are also useful in handling recurrences. 
Methods for recurrences that are not amenable to generating function methods are 
presented in Graham et al. (1989) and Greene and Knuth (1982). Lueker (1980) 
is an introductory survey to some recurrence methods. 

Wimp’s (1984) book is concerned primarily with numerical stability problems in 
computing with recurrences. Such problems are important in computing values of 
orthogonal polynomials, for example, but seldom arise in combinatorial enumer- 
ation. However, there are sections of Wimp (1984) that are relevant to our topic, 
for example to the discussion of differential equations in section 9.2. 


9.1. Linear recurrences with constant coefficients 


The most famous sequence that satisfies a linear recurrence with constant
coefficients is that of the Fibonacci numbers, defined by $F_0 = 0$, $F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$
for $n \ge 2$. There are many others that are only slightly less well known.
Fortunately, the theory of such sequences is well developed, and from the standpoint
of asymptotic enumeration their behavior is well understood. [For a survey of
number-theoretic results, together with a list of many unsolved problems about
such sequences that arise in that area, see Cerlienco et al. (1987).] There are
even several different approaches to solving linear recurrences with constant
coefficients. The one we emphasize here is that of generating functions, since it fits
in best with the rest of this chapter. For other approaches, see Milne-Thomson
(1933) or Nörlund (1924), for example.

Suppose that we have a linear recurrence or a system of recurrences and have
found that the generating function $f(z)$ we are interested in has the form

$$f(z) = \frac{G(z)}{h(z)}, \tag{9.1}$$

where $G(z)$ and $h(z)$ are polynomials. The basic tool for obtaining asymptotic
information about $[z^n]f(z)$ is the partial fraction expansion of a rational function
(Henrici 1974-86). Dividing $G(z)$ by $h(z)$ we obtain

$$f(z) = p(z) + \frac{g(z)}{h(z)}, \tag{9.2}$$

where $p(z)$, $g(z)$, and $h(z)$ are all polynomials in $z$ and $\deg g(z) < \deg h(z)$. We
can assume that $h(0) \ne 0$, since if that were not the case, we would have $g(0) = 0$
(as in the opposite case $f(z)$ would not be a power series in $z$, but would have
terms such as $z^{-1}$ or $z^{-2}$) and we could cancel a common factor of $z$ from $g(z)$
and $h(z)$. Therefore, if $d = \deg h(z)$, we can write

$$h(z) = h(0) \prod_{j=1}^{d'} \left( 1 - \frac{z}{z_j} \right)^{m_j}, \tag{9.3}$$

where $z_j$, $1 \le j \le d'$, are the distinct roots of $h(z) = 0$, $z_j$ has multiplicity $m_j \ge 1$,
and $\sum m_j = d$. Hence we find (Graham et al. 1989, Henrici 1974-86) that for
certain constants $c_{j,k}$,

$$f(z) = p(z) + \sum_{j=1}^{d'} \sum_{k=1}^{m_j} \frac{c_{j,k}}{(1 - z/z_j)^k} = p(z) + \sum_{j=1}^{d'} \sum_{k=1}^{m_j} c_{j,k} \sum_{h=0}^{\infty} \binom{h + k - 1}{k - 1} z^h z_j^{-h}. \tag{9.4}$$

Thus

$$[z^n]f(z) = [z^n]p(z) + \sum_{j=1}^{d'} \sum_{k=1}^{m_j} c_{j,k} \binom{n + k - 1}{k - 1} z_j^{-n}. \tag{9.5}$$

When $m_j = 1$,

$$c_{j,1} = \frac{-g(z_j)}{z_j h'(z_j)}, \tag{9.6}$$

and explicit formulas for the $c_{j,k}$ when $m_j > 1$ can also be derived (Graham et al.
1989), but they are unwieldy and seldom used.


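Example 9.1 (Fibonacci numbers). As was noted in Example 6.3,

$$F(z) = \sum_{n=0}^{\infty} F_n z^n = \frac{z}{1 - z - z^2}.$$

Now

$$h(z) = 1 - z - z^2 = (1 + \phi^{-1}z)(1 - \phi z), \tag{9.7}$$

where $\phi = (1 + 5^{1/2})/2$ is the golden ratio. Therefore

$$F(z) = 5^{-1/2} \left( \frac{1}{1 - \phi z} - \frac{1}{1 + \phi^{-1}z} \right), \tag{9.8}$$

and for $n \ge 0$,

$$F_n = [z^n]F(z) = 5^{-1/2}(\phi^n - (-\phi)^{-n}). \tag{9.9}$$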
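
The closed form (9.9) can be compared with the recurrence directly. The sketch below is illustrative: it uses the indexing implied by (9.9), namely $F_0 = 0$ and $F_1 = 1$, and also shows that the single term coming from the zero of $h(z)$ nearest the origin, $z = \phi^{-1}$, already determines $F_n$ after rounding. The function names are hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fibonacci_by_recurrence(limit):
    """F_0, ..., F_limit with F_0 = 0, F_1 = 1 (coefficients of z / (1 - z - z^2))."""
    values = [0, 1]
    for _ in range(2, limit + 1):
        values.append(values[-1] + values[-2])
    return values[: limit + 1]


def fibonacci_closed_form(n):
    """Right-hand side of (9.9), evaluated in floating point."""
    return (PHI**n - (-PHI) ** (-n)) / math.sqrt(5)


exact = fibonacci_by_recurrence(40)
closed = [round(fibonacci_closed_form(n)) for n in range(41)]

# Contribution of the smallest zero of h(z) alone
dominant = [round(PHI**n / math.sqrt(5)) for n in range(41)]
```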
The partial fraction expansion (9.4) shows that the first-order asymptotics of a
sequence $a_n$ satisfying a linear recurrence of the form (6.30) are determined by the
smallest zeros of the characteristic polynomial $h(z)$. The full asymptotic expansion
is given by (9.5), and involves all the zeros. In practice, using (9.5) presents some
difficulties, in that multiplicities of zeros are not always easy to determine, and
the coefficients $c_{j,k}$ are often even harder to deal with. Eventually, for large $n$,
their influence becomes negligible, but when uniform estimates are required they
present a problem. In such cases the following theorem is often useful.


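Theorem 9.2. Suppose that $f(z) = g(z)/h(z)$, where $g(z)$ and $h(z)$ are
polynomials, $h(0) \ne 0$, $\deg g(z) < \deg h(z)$, and that the only zeros of $h(z)$ in $|z| < R$ are
$\rho_1, \ldots, \rho_k$, each of multiplicity 1. Suppose further that

$$\max_{|z| = R} |f(z)| \le W, \tag{9.10}$$

and that $R - |\rho_j| \ge \delta$ for some $\delta > 0$ and $1 \le j \le k$. Then

$$\left| [z^n]f(z) + \sum_{j=1}^{k} \frac{g(\rho_j)}{h'(\rho_j)} \rho_j^{-n-1} \right| \le WR^{-n} + \delta^{-1}R^{-n} \sum_{j=1}^{k} |g(\rho_j)/h'(\rho_j)|. \tag{9.11}$$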
j=l < 


Theorem 9.2 is derived using methods of complex variables, and a proof is
sketched in section 10. That section also discusses how to locate all the zeros
$\rho_1, \ldots, \rho_k$ of a polynomial $h(z)$ in a disk $|z| < R$. In general, the zero location
problem is not a serious one in enumeration problems. Usually there is a single
positive real zero that is closer to the origin than any other, it can be located
accurately by simple methods, and $R$ is chosen so that $|z| < R$ encloses only that
zero.


Example 9.3 (Sequences with forbidden subblocks). We continue with the
problem presented in Examples 6.4 and 6.8. Both $F_A(z)$ and $G_A(z)$ have as
denominators

$$h(z) = z^k + (1 - 2z)C_A(z), \tag{9.12}$$

which is a polynomial of degree exactly $k$. Later, in Example 10.11, we will show
that for $k > 9$, $h(z)$ has exactly one zero $\rho$ in $|z| < 0.6$, and that for $|z| = 0.55$,
$|h(z)| \ge 1/100$. Furthermore, by Example 6.7, $\rho \to 1/2$ as $k \to \infty$. On $|z| = 0.55$,


|Fa(z)| < 100- (0.55)* . (9.13) 
Theorem 9.2 then shows, for example, that for n > k > ko, 
—na-l 
[z"|Fa(z) + cat < 100(0.55)* "+ 40(0.55) "|F(p)| | 


(9.14) 
<50(0.55)-" , 
since by Example 6.7, as $k \to \infty$,
h'(p) = kp‘! — 2C4(p) + (1 — 2p)C4(p) ~ -2Ca(p)~-p'. (9.15) 


The estimate (9.14), when combined with the expansions of Example 6.7, gives
accurate approximations for $p_n$, the probability that $A$ does not appear as a block
among the first $n$ coin tosses. We have

$$p_n = 2^{-n}[z^n]F_A(z) = -2^{-n}C_A(\rho)\rho^{-n-1}(h'(\rho))^{-1} + O(\exp(-0.09n)). \tag{9.16}$$

We now estimate $h'(\rho)$ as before, in (9.15), but more carefully, putting in the
approximation for $\rho$ from Example 6.7. We find that


h'(p) = —p' + O(k2*) , (9.17) 
and

$$\rho^{-n} = 2^n \exp(-n(2^k C_A(1/2))^{-1} + O(nk2^{-2k})). \tag{9.18}$$

Therefore

$$p_n = \exp(-n(2^k C_A(1/2))^{-1} + O(nk2^{-2k})) + O(\exp(-n/12)). \tag{9.19}$$

This shows that $p_n$ has a sharp transition. It is close to 1 for $n = o(2^k)$, and then,
as $n$ increases through $2^k$, drops rapidly to 0. (The behavior on the two sides of
$2^k$ is not symmetric, as the drop towards 0 beyond $2^k$ is much faster than the
increase towards 1 on the other side.) For further results and applications of such
estimates, see Guibas and Odlyzko (1978, 1980). Estimates such as (9.19) yield
results sharper than those of Example 6.8. They also prove (see Example 14.1) that
the expected length of the longest run of 0's in a random sequence of length $n$ is
$\log_2 n + u(\log_2 n) + o(1)$ as $n \to \infty$, where $u(x)$ is a continuous function that is not
constant and satisfies $u(x + 1) = u(x)$. [See also the discussion of carry propagation
in Knuth (1981).] For other methods and results in this area, see Arratia et al.
(1990b). ∎


Inhomogeneous recurrences with constant coefficients, say,

$$a_n = \sum_{i=1}^{d} c_i a_{n-i} + b_n, \quad n \ge d, \tag{9.20}$$

are not covered by the techniques discussed above. One can still use the basic
generating function approach to derive the ordinary generating function of $a_n$, but
this time it is in terms of the ordinary generating function of $b_n$. If $b_n$ does not
grow too rapidly, the "subtraction of singularities" method of section 10.2 can be
used to derive the asymptotics of $a_n$ in a form similar to that given by (9.26).
9.2. Linear recurrences with varying coefficients


Linear recurrences with constant coefficients have a nice and complete theory. That
is no longer the case when one allows coefficients that vary with the index. This is
not a fault of mathematicians in not working hard enough to derive elegant results,
but reflects the much more complicated behavior that can occur. The simplest
case is when the recurrence has a finite number of terms, and the coefficients are
polynomials in $n$.


Example 9.4 (Two-sided generalized Fibonacci sequences). Let $t_n$ be the number
of integer sequences $(b_j, \ldots, b_2, b_1, 1, 1, a_1, a_2, \ldots, a_k)$ with $j + k + 2 = n$ in which
each $b_i$ is the sum of one or more contiguous terms immediately to its right, and
each $a_i$ is likewise the sum of one or more contiguous terms immediately to its
left. It was shown in Fishburn et al. (1989) that $t_1 = t_2 = 1$ and that

\[ t_{n+1} = 2n\,t_n - (n-1)^2\,t_{n-1} \quad\text{for } n \ge 2 . \tag{9.21} \]
If we let

\[ t(z) = \sum_{n=1}^{\infty} \frac{t_n z^{n-1}}{(n-1)!} \tag{9.22} \]

be a modified exponential generating function, then the recurrence (9.21) shows
that


\[ t'(z)(1-z)^2 - t(z)(2-z) = -1 . \tag{9.23} \]
Standard methods for solving ordinary differential equations, together with the
initial conditions $t_1 = t_2 = 1$, then yield the explicit solution

\[ t(z) = (1-z)^{-1}\exp((1-z)^{-1})\Bigl[\,C + \int_z^1 (1-w)^{-1}\exp(-(1-w)^{-1})\,dw\Bigr] , \tag{9.24} \]

where

\[ C = e^{-1} - \int_0^1 (1-w)^{-1}\exp(-(1-w)^{-1})\,dw = 0.148495\ldots . \tag{9.25} \]


Once the explicit formula (9.24) for $t(z)$ is obtained, the methods of section 12
give the estimate

\[ t_n \sim C\,(n-1)!\,(e/\pi)^{1/2}\exp(2n^{1/2})\,(2n^{1/4})^{-1} \quad\text{as } n \to \infty . \tag{9.26} \]


It is easy to show that the absolute value of

\[ (1-z)^{-1}\exp((1-z)^{-1})\int_z^1 (1-w)^{-1}\exp(-(1-w)^{-1})\,dw \tag{9.27} \]

is small for $|z| < 1$. Therefore the asymptotics of the $t_n$ are determined by the
behavior of coefficients of

\[ C(1-z)^{-1}\exp((1-z)^{-1}) , \tag{9.28} \]
and that can be obtained easily. The estimate (9.26) then follows. ■
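A quick numerical sanity check of this example can be sketched as follows. The code is illustrative only and the function names are hypothetical: it generates $t_n$ from the recurrence (9.21) with exact integer arithmetic and prints the ratio of $t_n$ to the right-hand side of (9.26), using the truncated value of $C$ quoted in (9.25). Logarithms are used so that the factorial does not overflow floating point.

```python
import math

C = 0.148495  # constant from (9.25), truncated as quoted in the text

def two_sided_fibonacci(n_max):
    """Return [t_1, ..., t_{n_max}] from t_1 = t_2 = 1 and the recurrence (9.21)."""
    t = [0, 1, 1]                       # t[0] is an unused placeholder
    for n in range(2, n_max):
        t.append(2 * n * t[n] - (n - 1) ** 2 * t[n - 1])
    return t[1:n_max + 1]

def log_estimate(n):
    """Logarithm of the right-hand side of (9.26)."""
    return (math.log(C) + math.lgamma(n)            # lgamma(n) = log((n-1)!)
            + 0.5 * math.log(math.e / math.pi)
            + 2.0 * math.sqrt(n) - math.log(2.0 * n ** 0.25))

if __name__ == "__main__":
    t = two_sided_fibonacci(2000)
    print(t[:7])
    for n in (10, 100, 1000, 2000):
        print(n, math.exp(math.log(t[n - 1]) - log_estimate(n)))
```

The printed ratios should approach 1 slowly, since the relative error in (9.26) is expected to decay only like a power of $n^{-1/2}$.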


To see just how different the behavior of linear recurrences with polynomial
coefficients can be from those with constant coefficients, compare the behavior of
the sequences in Example 9.4 above and Example 9.5 (given below). The existence
of such differences should not be too surprising, since after all even the first-order
recurrence $a_n = n a_{n-1}$ for $n \ge 2$, $a_1 = 1$, has the obvious solution $a_n = n!$,
which is not at all like the solutions to constant-coefficient recurrences. However,
when $a_n = n a_{n-1}$, a simple change of variables, namely $a_n = b_n n!$, transforms this
recurrence into the trivial one of $b_n = b_{n-1} = \cdots = b_1 = 1$ for all $n$. Such rescaling
is among the most fruitful methods for dealing with nonlinear recurrences, even
though it is seldom as simple as for $a_n = n!$.


Example 9.4 is typical in that a sequence satisfying a linear recurrence of the
form

\[ a_n = \sum_{j=1}^{r} c_j(n)\,a_{n-j} , \quad n \ge r , \tag{9.29} \]

where $r$ is fixed and the $c_j(n)$ are rational functions (a P-recursive sequence in the
notation of section 6.3) can always be transformed into a differential equation for
a generating function. Whether anything can be done with that generating function
depends strongly on the recurrence and the form of the generating function.
Example 9.4 is atypical in that there is an explicit solution to the differential equation.
Further, this explicit solution is a nice analytic function. This is due to the
special choice of the form of the generating function. An exponential generating
function seems natural to use in that example, since the recurrence (9.21) shows
immediately that $t_n \le (2n-2)(2n-4)\cdots 2 = 2^{n-1}(n-1)!$, and a slightly more
involved induction proves that $t_n$ grows at least as fast as a factorial. If we tried to
use an ordinary generating function


\[ u(z) = \sum_{n=1}^{\infty} t_n z^n , \tag{9.30} \]


then the recurrence (9.21) would yield the differential equation 
\[ z^4 u''(z) + (z^3 - 2z^2)\,u'(z) + u(z) = z - z^2 , \tag{9.31} \]


which is not as tractable. (This was to be expected, since u(z) is only a formal 
power series.) Even when a good choice of generating function does yield an 
analytic function, the differential equation that results may be hard to solve. (One 
can always find a generating function that is analytic, but the structure of the 
problem may not be reflected in the resulting differential equation, and there may 
not be anything nice about it.) 




There is an extensive literature on analytic solutions of differential equations 
(cf. Henrici 1974-86, Hille 1969, 1976, Malgrange 1974, Varadarajan 1991, Wasow 
1965), but it is not easy to apply in general. Singularities of analytic functions that 
satisfy linear differential equations with analytic coefficients are usually of only a 
few basic forms, and so the methods of sections 11 and 12 suffice to determine 
the asymptotic behavior of the coefficients. The difficulty is in locating the
singularities and determining their nature. We refer to Hille (1969, 1976), Malgrange
(1974), Varadarajan (1991), and Wasow (1965) for methods for dealing with this
difficulty, since they are involved and so far have been seldom used in combinatorial
enumeration. There will be some further discussion of differential equations
in section 15.3.

Some aspects of the theory of linear recurrences with constant coefficients do
carry over to the case of varying coefficients, even when the coefficients are not
rational functions. For example, there will in general be $r$ linearly independent
solutions to the recurrence (9.29) (corresponding to the different starting conditions).
Also, if a solution $a_n$ has the property that $a_{n+1}/a_n$ tends to a limit $\alpha$ as
$n \to \infty$, then $1/\alpha$ is a limit of zeros of

\[ 1 - \sum_{j=1}^{r} c_j(n) z^j , \tag{9.32} \]

and therefore is often a root of

\[ 1 - \sum_{j=1}^{r} \Bigl(\lim_{n\to\infty} c_j(n)\Bigr) z^j . \tag{9.33} \]

Whether there are exactly $r$ linearly independent solutions is a difficult problem.
Extensive research was done on this topic in the early 20th century (Adams 1928,
Batchelder 1927), culminating in the work of Birkhoff and Trjitzinsky (Birkhoff
1911, 1930, Birkhoff and Trjitzinsky 1932, Trjitzinsky 1933a,b). This work applies to
recurrences of the form (9.29) where the $c_j(n)$ have Poincaré asymptotic expansions

\[ c_j(n) \sim n^{k_j/k}\bigl\{c_{j,0} + c_{j,1}n^{-1/k} + c_{j,2}n^{-2/k} + \cdots\bigr\} \quad\text{as } n \to \infty , \tag{9.34} \]
where the $k_j$ and $k$ are integers and $c_{j,0} \ne 0$ if $c_j(n)$ is not identically 0 for all $n$.
It follows from this work that solutions to the recurrence are expressible as linear
combinations of elements of the form

\[ (n!)^{p/q}\exp(P(n^{1/m}))\,n^{\alpha}(\log n)^{h} , \tag{9.35} \]
where $h$, $m$, $p$, and $q$ are integers, $P(z)$ is a polynomial, and $\alpha$ is a complex number.
An exposition of this theory and how it applies to enumeration has been given
by Wimp and Zeilberger (1985). (There is a slight complication in that most of
the literature cited above is concerned with recurrences for functions of a real
argument, not sequences, but this is not a major difficulty.) There is still a problem
in identifying which linear combination provides the desired solution. Wimp and
Zeilberger point out that it is usually easy to show that the largest of the terms
of the form (9.35) does show up with a nonzero coefficient, and so determines the
asymptotics of $a_n$ up to a multiplicative constant. However, the Birkhoff-Trjitzinsky
method does not in general provide any techniques for determining that constant.
The major objection to the use of the Birkhoff-Trjitzinsky results is that they
may not be rigorous, since gaps are alleged to exist in the complicated proofs
(Immink 1984, Wimp 1991). Furthermore, in almost all combinatorial enumeration
applications the coefficients are rational, and so one can use the theory of analytic
differential equations.

When there is no way to avoid linear recurrences with coefficients that vary but
are not rational, one can sometimes use the work of Kooman (1989, Kooman and
Tijdeman 1990), which develops the theory of second-order linear recurrences with
almost-constant coefficients.


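Example 9.5 (An oscillating sequence). Let

\[ a_n = \sum_{k=0}^{n} \binom{n}{k}\frac{(-1)^k}{k!} . \tag{9.36} \]

Then $a_n$ satisfies the linear recurrence

\[ a_{n+2} - \Bigl(2 - \frac{2}{n+2}\Bigr)a_{n+1} + \Bigl(1 - \frac{1}{n+2}\Bigr)a_n = 0 , \quad n \ge 0 . \tag{9.37} \]

The methods of Kooman and Tijdeman (1990) can be used to show that for some
constants $c$ and $\phi$

\[ a_n = c\,n^{-1/4}\sin(2n^{1/2} + \phi) + o(n^{-1/4}) \quad\text{as } n \to \infty , \tag{9.38} \]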


which is a much more precise estimate than the crude one mentioned in Example 10.3.

Another, in some ways preferable, method for obtaining asymptotic expansions
for $a_n$ is mentioned in Example 12.15. It is based on an explicit form for the
generating function of $a_n$, $f(z) = \sum a_n z^n$. An interchange of orders of summation
(easily justified for $|z|$ small, say $|z| < 1/2$) shows that

\[ f(z) = \sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\sum_{n=k}^{\infty}\binom{n}{k}z^n
       = \sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\,\frac{z^k}{(1-z)^{k+1}}
       = \frac{1}{1-z}\exp\Bigl(-\frac{z}{1-z}\Bigr) . \tag{9.39} \]

The saddle point method can then be applied to obtain asymptotic expansions for
$a_n$. ■
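The agreement between the defining sum (9.36) and the recurrence (9.37) can be checked with exact rational arithmetic. The sketch below is illustrative only, with hypothetical function names. It also prints $n^{1/4}a_n$, which by (9.38) should stay bounded and change sign as $n$ grows.

```python
from fractions import Fraction
from math import comb, factorial

def a_direct(n):
    """a_n from the defining sum (9.36), as an exact rational number."""
    return sum(Fraction((-1) ** k * comb(n, k), factorial(k)) for k in range(n + 1))

def a_by_recurrence(n_max):
    """[a_0, ..., a_{n_max}] from the recurrence (9.37), started with a_0 = 1, a_1 = 0."""
    a = [Fraction(1), Fraction(0)]
    for n in range(n_max - 1):
        a.append((2 - Fraction(2, n + 2)) * a[n + 1] - (1 - Fraction(1, n + 2)) * a[n])
    return a[:n_max + 1]

if __name__ == "__main__":
    values = a_by_recurrence(400)
    print(all(values[n] == a_direct(n) for n in range(60)))
    for n in (25, 100, 400):
        print(n, float(values[n]), n ** 0.25 * float(values[n]))
```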


9.3. Linear recurrences in several variables


Linear recurrences in several variables that have constant coefficients can be
attacked by methods similar to those used for a single variable. If we have

\[ a_{m,n} = \sum_{\substack{0 \le i,j \le d \\ i+j>0}} c_{i,j}\,a_{m-i,n-j} \tag{9.40} \]

for $m, n \ge d$, say, then the generating function

\[ f(x,y) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} a_{m,n}x^m y^n \tag{9.41} \]

satisfies the relation

\[ f(x,y)\Bigl(1 - \sum_{\substack{0 \le i,j \le d \\ i+j>0}} c_{i,j}x^i y^j\Bigr)
   = \sum_{\substack{m,n \ge 0 \\ m<d \text{ or } n<d}} a_{m,n}x^m y^n
   - \sum_{\substack{0 \le i,j \le d \\ i+j>0}} c_{i,j}x^i y^j
     \sum_{\substack{m,n \ge 0 \\ m<d-i \text{ or } n<d-j}} a_{m,n}x^m y^n . \tag{9.42} \]

If $a_{m,n} = 0$ for $0 \le m < d$ and $n \ge d$ as well as for $0 \le n < d$ and $m \ge d$ (so that
all the $a_{m,n}$ are fully determined by $a_{m,n}$ for $0 \le m < d$, $0 \le n < d$), then $f(x,y)$ is
a rational function. If this condition does not hold, $f(x,y)$ can be complicated.

The paragraph above shows that under common conditions, constant-coefficient 
recurrences lead to generating functions that are rational even in several variables. 
However, even when the rational function is determined, there is no equivalent 
of partial fraction decomposition to yield elegant asymptotics of the coefficients. 
Coefficients of multivariate generating functions are much harder to handle than 
those of univariate functions. There are tools (discussed in section 13), that are 
usually adequate to handle rational generating functions, but they are not simple. 

When the coefficients of the multivariate recurrences vary, available knowledge 
is extremely limited. Even if the coefficients are polynomials, we obtain a partial 
differential equation for the generating function. Sometimes there are tricks that 
lead to a simple solution (cf. Example 15.6), but this is not common. 




9.4. Nonlinear recurrences 


Nonlinear recurrences come in a great variety of shapes, and the methods that 
are used to solve them are diverse, depending on the nature of the problem. This 
section presents a sample of the most useful techniques that have been developed. 
Sometimes a nonlinear recurrence has a simple solution because of a nice
algebraic factorization. For example, suppose that $z_0$ is any given complex number,
and

\[ z_{n+1} = z_n^2 - 2 \quad\text{for } n \ge 0 . \tag{9.43} \]
If we set

\[ w = (z_0 + (z_0^2 - 4)^{1/2})/2 , \tag{9.44} \]
we have $z_0 = w + w^{-1}$, and more generally

\[ z_n = w^{2^n} + w^{-2^n} \quad\text{for } n \ge 0 . \tag{9.45} \]

Equation (9.45) is easily established through induction. However, this is an
exceptional instance, and already recurrences of the type $z_{n+1} = z_n^2 + c$ for $c$ a complex
constant lead to deep questions about the Mandelbrot set and chaotic behavior
(Devaney 1989).
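The closed form (9.45) is easy to test numerically. The following sketch (function names are hypothetical, and the starting values are arbitrary choices for illustration) compares direct iteration of (9.43) with the expression $w^{2^n} + w^{-2^n}$, using the principal branch of the square root in (9.44).

```python
import cmath

def iterate(z0, n):
    """Apply z -> z*z - 2 to z0 a total of n times, as in (9.43)."""
    z = z0
    for _ in range(n):
        z = z * z - 2
    return z

def closed_form(z0, n):
    """Evaluate w^(2^n) + w^(-2^n) with w given by (9.44)."""
    w = (z0 + cmath.sqrt(z0 * z0 - 4)) / 2
    return w ** (2 ** n) + w ** (-(2 ** n))

if __name__ == "__main__":
    for z0 in (3, 0.5, 1 + 1j):
        for n in range(5):
            print(z0, n, iterate(z0, n), closed_form(z0, n))
```

Only small $n$ are used here, because for $|z_0| \le 2$ the iteration is chaotic and floating point errors roughly double at each step.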

Since linear recurrences are well understood, the best that one can hope for 
when confronted with a nonlinear recurrence is that it might be reducible to a 
linear one. This works in many situations. 


Example 9.6 (Planted plane trees). Let $a_{h,n}$ be the number of planted plane trees
with $n$ nodes and height $\le h$ (de Bruijn et al. 1972, Greene and Knuth 1982), and
let

\[ A_h(z) = \sum_{n=0}^{\infty} a_{h,n} z^n . \tag{9.46} \]

Since a tree of height $\le h+1$ has a root and any number of subtrees, each of
height $\le h$,

\[ A_{h+1}(z) = z\,(1 + A_h(z) + A_h(z)^2 + \cdots) = z\,(1 - A_h(z))^{-1} . \tag{9.47} \]
Iterating this recurrence, we obtain a finite continued fraction that looks like

\[ A_{h+1}(z) = \cfrac{z}{1 - \cfrac{z}{1 - \cfrac{z}{1 - \cdots}}} . \tag{9.48} \]

The general theory of continued fractions represents a convergent as a quotient
of two sequences satisfying recurrences involving the partial quotients. (For
references, see Jones and Thron 1980, Perron 1957.) After playing with this idea, one
finds that the substitution

\[ A_h(z) = \frac{z P_h(z)}{P_{h+1}(z)} \tag{9.49} \]

gives

\[ P_{h+1}(z) = P_h(z) - z P_{h-1}(z) , \quad h \ge 1 , \]
where $P_0(z) = 0$, $P_1(z) = 1$. This is a linear recurrence when we regard $z$ as fixed,
and so the theory presented before leads to the explicit representation

\[ P_h(z) = (1-4z)^{-1/2}\Bigl[\Bigl(\frac{1 + (1-4z)^{1/2}}{2}\Bigr)^{h} - \Bigl(\frac{1 - (1-4z)^{1/2}}{2}\Bigr)^{h}\Bigr] . \tag{9.50} \]

De Bruijn et al. (1972) use this representation to determine the average height of
plane trees. ■
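The linearization in this example can be illustrated with a short computation. The sketch below uses hypothetical helper names and makes one assumption that should be stated plainly: height is counted in nodes, so that $A_0(z) = 0$ and $A_1(z) = z$, which is what (9.49) gives with $P_0 = 0$ and $P_1 = P_2 = 1$. It builds the polynomials $P_h$ from their linear recurrence, compares them with the closed form (9.50) at a real point $z < 1/4$, and obtains the coefficients of $A_h(z)$ by iterating (9.47) on truncated power series.

```python
import math

def p_polynomials(h_max):
    """Coefficient lists (constant term first) of P_0, ..., P_{h_max}."""
    polys = [[0], [1]]                              # P_0 = 0, P_1 = 1
    for h in range(1, h_max):
        prev, cur = polys[h - 1], polys[h]
        shifted = [0] + prev                        # coefficients of z * P_{h-1}(z)
        size = max(len(cur), len(shifted))
        cur_pad = cur + [0] * (size - len(cur))
        shifted = shifted + [0] * (size - len(shifted))
        polys.append([c - s for c, s in zip(cur_pad, shifted)])
    return polys[:h_max + 1]

def evaluate(poly, z):
    """Value of the polynomial with coefficient list `poly` at the point z."""
    return sum(c * z ** i for i, c in enumerate(poly))

def p_closed_form(h, z):
    """Right-hand side of (9.50), for real z < 1/4."""
    root = math.sqrt(1 - 4 * z)
    return (((1 + root) / 2) ** h - ((1 - root) / 2) ** h) / root

def tree_counts(h, n_max):
    """Coefficients of z^0..z^n_max in A_h(z), by iterating A_{j+1} = z / (1 - A_j)."""
    series = [0] * (n_max + 1)                      # A_0(z) = 0
    for _ in range(h):
        inv = [0] * (n_max + 1)                     # power series of 1 / (1 - A_j)
        inv[0] = 1                                  # valid since A_j has zero constant term
        for n in range(1, n_max + 1):
            inv[n] = sum(series[i] * inv[n - i] for i in range(1, n + 1))
        series = [0] + inv[:n_max]                  # multiply by z
    return series

if __name__ == "__main__":
    polys = p_polynomials(8)
    for h in range(9):
        print(h, evaluate(polys[h], 0.1), p_closed_form(h, 0.1))
    print(tree_counts(3, 6), tree_counts(6, 6))
```

When $h$ is at least the number of nodes, the height restriction is vacuous and the counts reduce to the Catalan numbers.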


Greene and Knuth (1982, p. 30) note that the continued fraction method of
replacing a convergent by a quotient of elements of two sequences in general
leads not to a single sequence of polynomials like the $P_h(z)$ of Example 9.6, but to
two sequences. This is only slightly harder to handle, and allows one to linearize
more complicated recurrences.

There are many additional ways to linearize a recurrence. [A small list is given
on p. 31 of Greene and Knuth (1982).] For example, a purely multiplicative relation
$a_n = a_{n-1}^2/a_{n-2}$ is transformed into the linear $\log a_n = 2\log a_{n-1} - \log a_{n-2}$ by taking
logarithms. One of the most fruitful tricks of this type is taking inverses. Thus
$a_n = a_{n-1}/(1 + a_{n-1})$ is equivalent to

\[ \frac{1}{a_n} = \frac{1}{a_{n-1}} + 1 , \tag{9.51} \]

which has the obvious solution $a_n^{-1} = a_0^{-1} + n$. (This assumes $a_0 \ne -1/k$ for any
$k \in \mathbb{Z}^+$.)

Linearization works well, but is limited in applicability. More widely applicable,
but producing answers that are not as clear, is approximate linearization, where a
given nonlinear recurrence is close to a linear one. The following example combines
approximate linearization with bootstrapping.


Example 9.7 (A quadratic recurrence). The study of the average height of binary
trees in Flajolet and Odlyzko (1982) involves the recurrence

\[ a_n = a_{n-1}(1 - a_{n-1}) \quad\text{for } n \ge 1 , \tag{9.52} \]

with $a_0 = 1/2$. The $a_n$ are monotone decreasing, so we try the inverse trick. We
find

\[ \frac{1}{a_n} = \frac{1}{a_{n-1}(1 - a_{n-1})} = \frac{1}{a_{n-1}} + 1 + \frac{a_{n-1}}{1 - a_{n-1}} . \tag{9.53} \]
Iterating this recurrence [but applying it only to the first term on the right-hand
side of eq. (9.53)] we obtain

\[ \frac{1}{a_n} = \frac{1}{a_{n-2}} + 2 + \frac{a_{n-2}}{1 - a_{n-2}} + \frac{a_{n-1}}{1 - a_{n-1}}
   = \cdots = \frac{1}{a_0} + n + \sum_{j=0}^{n-1}\frac{a_j}{1 - a_j}
   = n + 2 + \sum_{j=0}^{n-1}\frac{a_j}{1 - a_j} . \tag{9.54} \]

Equation (9.54) shows that $a_n^{-1} > n$, so $a_n < 1/n$. Applying this bound to $a_j$ for
$2 \le j \le n-1$ in the sum on the right-hand side of eq. (9.54), we find that

\[ n < a_n^{-1} < n + O(\log n) . \tag{9.55} \]

When we substitute this into (9.54), we find that $a_n^{-1} = n + \log n + o(\log n)$, and
further iterations produce even more accurate estimates.
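The bootstrapped estimates of this example can be observed numerically. The sketch below (illustrative only, with a hypothetical function name) iterates (9.52) in floating point and prints $a_n^{-1}$ together with $(a_n^{-1} - n)/\log n$, which by the last estimate should drift slowly towards 1.

```python
import math

def inverse_after(n, a0=0.5):
    """Return 1/a_n for a_j = a_{j-1} (1 - a_{j-1}), the recurrence (9.52)."""
    a = a0
    for _ in range(n):
        a = a * (1.0 - a)
    return 1.0 / a

if __name__ == "__main__":
    for n in (10, 100, 10_000, 1_000_000):
        inv = inverse_after(n)
        print(n, inv, (inv - n) / math.log(n))
```

The convergence of the last column is slow because the error term is of relative order $1/\log n$.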


Approximate linearization also works well for some rapidly growing sequences.

Example 9.8 (Doubly exponential sequences). Many recurrences are of the form

\[ a_{n+1} = a_n^2 + b_n , \tag{9.56} \]

where $b_n$ is much smaller than $a_n^2$ (and may even depend on the $a_k$ for $k \le n$, as
in $b_n = a_n$ or $b_n = a_{n-1}$). Aho and Sloane (1973) found that surprisingly simple
solutions to such recurrences can often be found. The basic idea is to reduce to
approximate linearization by taking logarithms. We find that if $a_0$ is the given initial
value, and $a_n > 0$ for all $n$, then the transformation

\[ u_n = \log a_n , \tag{9.57} \]

\[ \delta_n = \log(1 + b_n a_n^{-2}) , \tag{9.58} \]
reduces (9.56) to

\[ u_{n+1} = 2u_n + \delta_n , \quad n \ge 0 . \tag{9.59} \]
Therefore

\[ u_n = \delta_{n-1} + 2u_{n-1} = \delta_{n-1} + 2\delta_{n-2} + 4u_{n-2}
       = \cdots = \sum_{j=1}^{n} 2^{j-1}\delta_{n-j} + 2^n u_0
       = 2^n\bigl(u_0 + \delta_0/2 + \delta_1/4 + \cdots + \delta_{n-1}/2^n\bigr) . \tag{9.60} \]
If we assume that the $\delta_k$ are small, then

\[ \alpha = u_0 + \sum_{k=0}^{\infty} \delta_k 2^{-k-1} \tag{9.61} \]

exists, and

\[ r_n = 2^n\alpha - u_n = \sum_{k=n}^{\infty} \delta_k 2^{n-1-k} . \tag{9.62} \]

If the $\delta_k$ are sufficiently small, the difference $r_n$ in (9.62) will be small, and

\[ a_n = \exp(u_n) = \exp(2^n\alpha - r_n) . \tag{9.63} \]

The expression (9.63) might not seem satisfactory, since both $\alpha$ and $r_n$ are
expressed in terms of all the $a_k$, for $k < n$ and for $k \ge n$. The point of (9.63) is that
for many recurrences, $r_n$ is negligibly small, while $\alpha$ is given by the rapidly
convergent series (9.61), so that only the first few $a_k$ are needed to obtain a good
estimate for the asymptotic behavior of $a_n$. We next discuss a particularly elegant
case.

Suppose that $a_n \ge 1$ and $|b_n| < a_n/4$ for all $n \ge 0$. Then $a_{n+1} \ge a_n$ and $|\delta_{n+1}| \le |\delta_n|$
for $n \ge 0$, and so $|r_n| \le |\delta_n|$. Hence

\[ a_n\exp(-|\delta_n|) \le \exp(2^n\alpha) \le a_n\exp(|\delta_n|) \tag{9.64} \]
and since 


exp(|5n]) < 1+ |bala;? < 14+ (4an)~! , 
(9.65) 
exp(—|6,|) = (1+ (4an)~')"! > 1— (Ban)! , 


we find that 
lan — exp(2"a)| < (2an)7' < 1/2. (9.66) 


If $a_n$ is an integer, then we can assert that it is the closest integer to $\exp(2^n\alpha)$.

The restriction $|b_n| < a_n/4$ is severe. The basic method applies even without it,
and the expansion (9.63) is valid, for example, if we only require that $|\delta_{n+1}| \le |\delta_n|$
for $n \ge n_0$. However, we will not in general obtain results as nice as (9.66) if we
only impose these weak conditions.
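The closest-integer statement can be tried out on a concrete case. The sketch below is an illustration under stated assumptions, not the method of Aho and Sloane verbatim: it takes the hypothetical instance $a_{n+1} = a_n^2 + 1$ (so $b_n = 1$) started at $a_0 = 5$, which satisfies $|b_n| < a_n/4$ for all $n$, approximates $\alpha$ by a partial sum of (9.61) using Python's `decimal` module at high precision, and checks that rounding $\exp(2^n\alpha)$ reproduces $a_n$.

```python
from decimal import Decimal, getcontext

def sequence(a0, steps):
    """a_0, ..., a_steps for a_{n+1} = a_n^2 + 1 (the case b_n = 1 of (9.56))."""
    a = [a0]
    for _ in range(steps):
        a.append(a[-1] ** 2 + 1)
    return a

def alpha(a, digits=300):
    """Partial sum of the series (9.61) built from the terms a_0, ..., a_{len(a)-1}."""
    getcontext().prec = digits
    total = Decimal(a[0]).ln()                             # u_0 = log a_0
    for k, ak in enumerate(a):
        delta = (1 + Decimal(1) / Decimal(ak * ak)).ln()   # delta_k from (9.58) with b_k = 1
        total += delta / Decimal(2 ** (k + 1))
    return total

if __name__ == "__main__":
    a = sequence(5, 8)
    al = alpha(a)
    for n in range(7):
        nearest = int((al * 2 ** n).exp().to_integral_value())
        print(n, a[n], nearest == a[n])
```

High precision is needed because $a_n$ has roughly $2^n$ times as many digits as $a_0$, so $\alpha$ must be known to a comparable number of digits before $\exp(2^n\alpha)$ can be rounded reliably.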

The method outlined above can be applied to recurrences that appear to be of
a slightly different form. Sometimes only a trivial transformation is required. For
example, Golomb's nonlinear recurrence,

\[ a_{n+1} = a_0 a_1 \cdots a_n + b , \quad a_0 = 1 , \tag{9.67} \]
for $b$ a constant, is easily seen to be equivalent to

\[ a_{n+1} = (a_n - b)a_n + b , \quad a_0 = 1 , \; a_1 = b + 1 . \tag{9.68} \]
The substitution

\[ x_n = a_n - b/2 \tag{9.69} \]

transforms (9.68) into

\[ x_{n+1} = x_n^2 + (2 - b)b/4 , \tag{9.70} \]

which is of the form treated above. [If the $x_n$ are integers, the inequality (9.66)
with $x_n$ replacing $a_n$ might not apply to the $x_n$ because the condition $|(2-b)b/4| <
|x_k|/4$ might fail for some $k$. The trick to use here is to start the recurrence with
some $x_k$, say $x_{k_0}$, so that the condition $|(2-b)b/4| < |x_k|/4$ applies for $k \ge k_0$. The
new $\alpha$ for which (9.66) holds will then be defined in terms of $x_{k_0}, x_{k_0+1}, \ldots$.]

In some situations the results presented above cannot be applied, but the basic
method can still be extended. That is the case for the recurrence

\[ a_{n+1} = a_n a_{n-1} + 1 , \quad a_0 = a_1 = 1 , \tag{9.71} \]
of Aho and Sloane (1973). The result is that $a_n$ is the nearest integer to


afepren , (9.72) 


where $\alpha$ and $\beta$ are positive constants, and the $F_n$ are the Fibonacci numbers. What
matters is that the recurrence leads to doubly exponential (and regular) growth of
$a_n$. Example 15.3 shows how this principle can be applied even when the $a_n$ are
not numbers, but polynomials whose coefficients need to be estimated. ■


9.5. Quasi-linear recurrences 


This section mentions some methods and results for studying recurrences that have
linearity properties, but are not linear. The most important of them are recurrences
involving minimization or maximization. They arise frequently in problems that
use dynamic programming approaches and in divide-and-conquer methods. An
important example, treated in Fredman and Knuth (1974), is that of a sequence
$f_n$, given by $f_0 = 1$ and

\[ f_{n+1} = g_{n+1} + \min_{0 \le k \le n}\,(\alpha f_k + \beta f_{n-k}) \quad\text{for } n \ge 0 , \tag{9.73} \]

where $\alpha, \beta > 0$, and $g_n$ is some given sequence. Fredman and Knuth showed that
if $g_n = 0$ for $n \ge 1$ and $\alpha + \beta < 1$, then

\[ f_n \ge c\,n^{1+1/\gamma} \quad\text{for some } c = c(\alpha,\beta) > 0 , \tag{9.74} \]
where $\gamma$ is the solution to

\[ \alpha^{-\gamma} + \beta^{-\gamma} = 1 . \tag{9.75} \]

They proved that $\lim_{n\to\infty} f_n n^{-1-1/\gamma}$ exists if and only if $(\log\alpha)/(\log\beta)$ is irrational.
They also presented analyses of this recurrence for $\alpha + \beta > 1$, as well as of several
recurrences that have different $g_n$.
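A recurrence of the form (9.73) can be tabulated directly by dynamic programming. The sketch below is only meant to show the mechanics: the function name and the parameter values are hypothetical, and no asymptotic claim is checked. It records, for each $n$, an index $k$ at which the minimum is attained, since the location of the minimum plays a central role in the analysis.

```python
from fractions import Fraction

def min_recurrence(alpha, beta, g, n_max):
    """Tabulate f_0..f_{n_max} for (9.73) with f_0 = 1, plus a minimizing k for each step."""
    f = [Fraction(1)]
    argmin = [None]                       # no minimization is involved in f_0
    for n in range(n_max):
        # Smallest k in 0..n that minimizes alpha*f_k + beta*f_{n-k}.
        best_k = min(range(n + 1), key=lambda k: alpha * f[k] + beta * f[n - k])
        f.append(g(n + 1) + alpha * f[best_k] + beta * f[n - best_k])
        argmin.append(best_k)
    return f, argmin

if __name__ == "__main__":
    # Illustrative parameters only: alpha = 2, beta = 3, g_n = 0 for n >= 1.
    f, argmin = min_recurrence(2, 3, lambda n: 0, 12)
    for n, (value, k) in enumerate(zip(f, argmin)):
        print(n, value, k)
```

This direct tabulation takes quadratic time in `n_max`, which is adequate for experimentation.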
The value of the Fredman-Knuth paper is less in the precise results they obtain
for several recurrences of the type (9.73) than in the methods they develop, which 
allow one to analyze related problems. A crucial role in their approach is played by 
the observation that for the g, they consider, the minimum in (9.73) can be located 
rather precisely. The conditions for such localization are applicable to many other 
sequences as well. 

Further work on the recurrence (9.73) was done by Kapoor and Reingold (1985),
who obtained a complete solution under certain conditions. Their solution is
complicated, expressed in terms of the weighted external path length of a binary tree.
It is sufficiently explicit, though, to give a complete picture of the continuity,
convexity, and oscillation properties of $f_n$. In some cases their solution simplifies
dramatically.

Another class of quasi-linear recurrences involves the greatest-integer function.
Following Erdős et al. (1987), consider recurrences of the form

\[ a(0) = 1 , \quad a(n) = \sum_{i=1}^{s} r_i\,a(\lfloor n/m_i \rfloor) , \quad n \ge 1 , \tag{9.76} \]

where $r_i > 0$ for all $i$, and the $m_i$ are integers, $m_i \ge 2$ for all $i$. Let $\tau > 0$ be the
(unique) solution to

\[ \sum_{i=1}^{s} r_i m_i^{-\tau} = 1 . \tag{9.77} \]

If there is an integer $d$ and integers $u_i$ such that $m_i = d^{u_i}$ for $1 \le i \le s$, then
$\lim a(n)n^{-\tau}$ as $n \to \infty$ does not exist, but the limit of $a(d^k)d^{-k\tau}$ as $k \to \infty$ does
exist. If there is no such $d$, then the limit of $a(n)n^{-\tau}$ as $n \to \infty$ does exist, and can
readily be computed. For example, when

\[ a(n) = a(\lfloor n/2 \rfloor) + a(\lfloor n/3 \rfloor) + a(\lfloor n/6 \rfloor) \quad\text{for } n \ge 1 , \]

this limit is $12(\log 432)^{-1}$. Convergence to the limit is extremely slow, as is shown
in Erdős et al. (1987). The method of proof used in that paper is based on renewal
theory. Several other methods for dealing with recurrences of the type (9.76) are
mentioned there and in the references listed in that paper. There are connections
to other recurrences that are linear in two variables, such as

\[ b(m,n) = b(m,n-1) + b(m-1,n) + b(m-1,n-1) , \quad m, n \ge 1 . \tag{9.78} \]
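The slow convergence in the example $a(n) = a(\lfloor n/2 \rfloor) + a(\lfloor n/3 \rfloor) + a(\lfloor n/6 \rfloor)$ of the form (9.76) can be observed with a memoized recursion. The sketch below is illustrative and the function name is hypothetical. Only arguments of the form $\lfloor n/(2^i 3^j) \rfloor$ are ever visited, so even very large $n$ are cheap to evaluate.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def a(n):
    """a(n) = a(n//2) + a(n//3) + a(n//6) for n >= 1, with a(0) = 1."""
    if n == 0:
        return 1
    return a(n // 2) + a(n // 3) + a(n // 6)

if __name__ == "__main__":
    limit = 12 / math.log(432)
    for e in (3, 6, 12, 24, 48):
        n = 10 ** e
        print(n, a(n) / n, limit)
```

Here $\tau = 1$, since $1/2 + 1/3 + 1/6 = 1$, so the quantity compared with the limit is simply $a(n)/n$.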


Consider an infinite sequence of integers $2 \le a_1 \le a_2 \le \cdots$ such that

\[ \sum_{j=1}^{\infty} a_j^{-1}\log a_j < \infty , \]

and define $c(0) = 0$,

\[ c(n) = \sum_{j=1}^{\infty} c(\lfloor n/a_j \rfloor) + 1 , \quad n \ge 1 . \tag{9.79} \]

If $\rho$ is the (unique) positive solution to

\[ \sum_{j=1}^{\infty} a_j^{-\rho} = 1 , \]

then Erdős (1941) showed that

\[ c(n) \sim c\,n^{\rho} \quad\text{as } n \to \infty \tag{9.80} \]

for a positive constant $c$. Although the recurrence (9.79) is similar to that of
eq. (9.76), the results are different [no oscillations can occur for a recurrence given
by eq. (9.79)] and the methods are dissimilar.

Karp (1991) considers recurrences of the type $T(x) = a(x) + T(h(x))$, where $x$ is
a nonnegative real variable, $a(x) \ge 0$, and $h(x)$ is a random variable, $0 \le h(x) \le x$,
with $m(x)$ being the expectation of $h(x)$. Such recurrences arise frequently in the
analysis of algorithms, and Karp proves several theorems that bound the probability
that $T(x)$ is large. For example, he obtains the following result.

Theorem 9.9. Suppose that $a(x)$ is a nondecreasing continuous function that is
strictly increasing on $\{x : a(x) > 0\}$, and $m(x)$ is a continuous function. Then for
all $x \in \mathbb{R}^+$ and $k \in \mathbb{Z}^+$,

\[ \mathrm{Prob}\,(T(x) \ge u(x) + k\,a(x)) \le (m(x)/x)^k , \]

where $u(x)$ is the unique least nonnegative solution to the equation $u(x) = a(x) +
u(m(x))$.

Another result, proved in Greenberg et al. (1988), is the following estimate.


Theorem 9.10. Suppose that $r, a_1, \ldots, a_N \in \mathbb{R}^+$ and that $b > 0$. For $n > N$, define


D+ @n.1 + Ong +> + Ang 


= 9.81 
oe Les k+r Os!) 
Then 
a, ~(n/r)?  asn-+co. (9.82) 


Theorem 9.10 is proved by an involved induction on the behavior of the a,. 


10. Analytic generating functions 


Combinatorialists use recurrence, generating functions, and such trans- 
formations as the Vandermonde convolution; others, to my horror, use 
contour integrals, differential equations, and other resources of math-
ematical analysis. 


J. Riordan (1968) 


The use of analytic methods in combinatorics did horrify Riordan. They are 
widespread, though, because of their utility, which even Riordan could not deny. 
About half of this chapter is devoted to such methods, as they are extremely 
flexible and give very precise estimates. 


10.1. Introduction and general estimates 


This section serves as an introduction to most of the remaining sections of the 
paper, which are concerned largely with the use of methods of complex variables in 
asymptotics. Many of the results to be presented later can be used with little or no 
knowledge of analytic functions. However, even some slight knowledge of complex 
analysis is helpful in getting an understanding of the scope and limitations of the 
methods to be discussed. There are many textbooks on analytic functions, such as 
Henrici (1974-86) and Titchmarsh (1939). This chapter assumes that the reader 
has some knowledge of this field, but not a deep one. It reviews the concepts that 
are most relevant in asymptotic enumeration, and how they affect the estimates 
that can be obtained. It is not a general introduction to the subject of complex 
analysis, and the choices of topics, their ordering, and the decision of when to 
include proofs were all made with the goal of illustrating how to use complex 
analysis in asymptotics. 

There are several definitions of analytic functions, all equivalent. For our
purposes, it will be most convenient to call a function $f(z)$ of one complex variable
analytic in a connected open set $S \subseteq \mathbb{C}$ if in a small neighborhood of every point
$w \in S$, $f(z)$ has an expansion as a power series

\[ f(z) = \sum_{n=0}^{\infty} a_n (z-w)^n , \quad a_n = a_n(w) , \tag{10.1} \]

that converges. Practically all the functions encountered in asymptotic enumeration
that are analytic are analytic in a disk about the origin. A necessary and sufficient
condition for $f(z)$, defined by a power series (6.1), to be analytic in a neighborhood
of the origin is that $|a_n| \le C^n$ for some constant $C > 0$. Therefore there is an
effective dichotomy, with common generating functions either not converging near
0 and being only formal power series, or else converging and being analytic.

A function $f(z)$ is called meromorphic in $S$ if it is analytic in $S$ except at a
(countable isolated) subset $S' \subset S$, and in a small neighborhood of every $w \in S'$,
$f(z)$ has an expansion of the form

\[ f(z) = \sum_{n=-N(w)}^{\infty} a_n (z-w)^n , \quad a_n = a_n(w) . \tag{10.2} \]


Thus meromorphic functions can have poles, but nothing more. Alternatively, a 
function is meromorphic in S if and only if it is the quotient of two functions
analytic in S. In particular, z~* is meromorphic throughout the complex plane, 
but sin(1/z) is not. In general, functions given by nice expressions are analytic 
away from obvious pathological points, since addition, multiplication, division, and 
composition of analytic functions usually yield analytic or meromorphic functions 
in the proper domains. Thus sin(1/z) is analytic throughout C \ {0}, and so is 
z~>, while exp(1/(1 — z)) is analytic throughout C \ {1}, but is not meromorphic 
because of the essential singularity at $z = 1$. Not all functions that might seem
smooth are analytic, though, as neither $f(z) = \bar{z}$ ($\bar{z}$ denoting the complex conjugate
of $z$) nor $f(z) = |z|$ is analytic anywhere. The smoothness condition imposed by
(10.1) is very stringent.

Analytic continuation is an important concept. A function $f(z)$ may be defined
and analytic in $S$, but there may be another function $g(z)$ that is analytic in $S' \supset S$
and such that $g(z) = f(z)$ for $z \in S$. In that case we say that $g(z)$ provides an
analytic continuation of $f(z)$ to $S'$, and it is a theorem that this extension is unique.
A simple example is provided by

\[ \sum_{n=0}^{\infty} z^n = \frac{1}{1-z} . \tag{10.3} \]

The power series on the left side converges only for $|z| < 1$, and defines an analytic
function there. On the other hand, $(1-z)^{-1}$ is analytic throughout $\mathbb{C} \setminus \{1\}$, and
so provides an analytic continuation for the power series. This is a common
phenomenon in asymptotic enumeration. Typically a generating function will converge
in a disk $|z| < r$, will have a singularity at $r$, but will be continuable to a region of
the form

\[ \{z : |z| < r + \delta , \; |\mathrm{Arg}(z-r)| > \pi/2 - \epsilon\} \tag{10.4} \]

for $\delta, \epsilon > 0$. When this happens, it can be exploited to provide better or easier
estimates of the coefficients, as is shown in section 11.1. That section explains the
reasons why continuation to a region of the form (10.4) is so useful.

If $f(z)$ is analytic in $S$, $z$ is on the boundary of $S$, but $f(z)$ cannot be analytically
continued to a neighborhood of $z$, we say that $z$ is a singularity of $f(z)$. Isolated
singularities that are not poles are called essential, so that $z = 1$ is an essential
singularity of $\exp(1/(1-z))$, but not of $1/(1-z)$. (Note that $z = 1$ is an essential
singularity of $f(z) = (1-z)^{1/2}$ even though $f(1) = 0$.) Throughout the rest of this
chapter we will often refer to large singularities and small singularities. These are
not precise concepts, and are meant only to indicate how fast the function $f(z)$
grows as Z —> Zo, where zo is a singularity. If zo = 1, we say that (1 — z)'/?, log(1 — 
z), (1 ~z)~" have small singularities, since |f(z)| either decreases or grows at 
most like a negative power of |1 — z| as z — 1. On the other hand, exp(1/(1 — 
z)) or exp((1 — z)~/5) will be said to have large singularities. Note that for z = 
1+iy, y € R, exp(1/(1 — z)) is bounded, so the choice of path along which the 
singularity is approached is important. In determining the size of a singularity 
$z_0$, we will usually be concerned with real $z_0$ and generating functions $f(z)$ with
nonnegative coefficients, and then usually will need to look only at $z$ real, $z \to
z_0^-$. When the function $f(z)$ is entire (that is, analytic throughout $\mathbb{C}$), we will say
that $\infty$ is a singularity of $f(z)$ (unless $f(z)$ is a constant), and will use the large
vs. small singularity classification depending on how fast $f(z)$ grows as $|z| \to \infty$.
The distinction between small and large singularities is important in asymptotics
because different methods are used in the two cases.

A simple closed contour $\Gamma$ in the complex plane is given by a continuous mapping
$\gamma : [0,1] \to \mathbb{C}$ with the properties that $\gamma(0) = \gamma(1)$, and that $\gamma(s) \ne \gamma(t)$ whenever
$0 \le s < t \le 1$ and either $s \ne 0$ or $t \ne 1$. Intuitively, $\Gamma$ is a closed path in the complex
plane that does not intersect itself. For most applications that will be made in this
chapter, simple closed contours $\Gamma$ will consist of line segments and sections of
circles. For such contours it is easy to prove that the complex plane is divided
by the contour into two connected components, the inside and the outside of the
curve. This result is true for all simple closed curves by the Jordan curve theorem,
but this result is surprisingly hard to prove.

In asymptotic enumeration, the basic result about analytic functions is the
Cauchy integral formula for their coefficients.


Theorem 10.1. If $f(z)$ is analytic in an open set $S$ containing 0, and

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n \tag{10.5} \]

in a neighborhood of 0, then for any $n \ge 0$,

\[ a_n = [z^n]f(z) = (2\pi i)^{-1}\int_{\Gamma} f(z)\,z^{-n-1}\,dz , \tag{10.6} \]

where $\Gamma$ is any simple closed contour in $S$ that contains the origin inside it and is
positively oriented (i.e., traversed in counterclockwise direction).


An obvious question is why should one use the integral formula (10.6) to
determine the coefficient $a_n$ of $f(z)$. After all, the series (10.5) shows that

\[ n!\,a_n = \frac{d^n}{dz^n} f(z)\Big|_{z=0} . \tag{10.7} \]

Unfortunately the differentiation involved in (10.7) is hard to control. Derivatives 
involve taking limits, and so even small changes in a function can produce huge 
changes in derivatives, especially high-order ones. The special properties of analytic 
functions are not reflected in the formula (10.7), and for nonanalytic functions there 
is little that can be done. On the other hand, Cauchy’s integral formula (10.6) 
does use special properties of analytic functions, which allow the determination 
of the coefficients of f(z) from the values of f(z) along any closed path. This 
determination involves integration, so that even coarse information about the size 
of f(z) can be used with it. The analytic methods that will be outlined exploit 
the freedom of choice of the contour of integration to relate the behavior of the
coefficients to the behavior of the function near just one or sometimes a few points. 

If the power series (10.5) converges for |z| < R, and for the contour Γ we choose
a circle z = r exp(iθ), 0 ≤ θ ≤ 2π, 0 < r < R, then the validity of (10.6) is easily
checked by direct computation, since the power series converges absolutely and
uniformly so one can interchange integration and summation. The strength of
Cauchy’s formula is in the freedom to choose the contour Γ in different ways.
This freedom yields most of the powerful results to be discussed in the following
sections, and later in this section we will outline how this is achieved. First we
discuss some simple applications of Theorem 10.1 obtained by choosing Γ to be a
circle centered at the origin.
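
As a numerical illustration of (10.6) on a circle (a sketch of ours, not part of the
original argument), one can discretize the contour integral with the trapezoidal rule.
The function name cauchy_coefficient and the sample count are our own choices; the
test function exp(z), with coefficients 1/n!, is used only because its coefficients are
known.

```python
import cmath
import math


def cauchy_coefficient(f, n, r, samples=512):
    """Approximate [z^n] f(z) by the trapezoidal rule on the circle |z| = r."""
    total = 0.0 + 0.0j
    for m in range(samples):
        theta = 2.0 * math.pi * m / samples
        z = r * cmath.exp(1j * theta)
        # With z = r e^{i theta} we have dz = i z d(theta), so the integrand
        # f(z) z^{-n-1} dz / (2 pi i) becomes f(z) z^{-n} d(theta) / (2 pi).
        total += f(z) / z**n
    return total / samples


# Coefficients of exp(z) recovered from its values on the unit circle.
coeffs = [cauchy_coefficient(cmath.exp, n, r=1.0) for n in range(8)]
for n, c in enumerate(coeffs):
    print(n, c.real, 1.0 / math.factorial(n))
```

Any radius r inside the disk of convergence gives the same value, which is the freedom
of contour that the text emphasizes.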


Theorem 10.2. If f(z) is analytic in |z| < R, then for any r with 0 < r < R and any
n ∈ Z, n ≥ 0,

\[ |[z^n] f(z)| \le r^{-n} \max_{|z|=r} |f(z)| . \tag{10.8} \]


The choice of Γ in Theorem 10.1 to be the circle of radius r gives Theorem 10.2.
If f(z), defined by (10.5), has a_n ≥ 0 for all n, then

\[ |f(z)| \le \sum_{n=0}^{\infty} a_n |z|^n = f(|z|) \]

and therefore we obtain Lemma 8.1 as an easy corollary to Theorem 10.2. The
advantage of Theorem 10.2 over Lemma 8.1 is that there is no requirement that
a_n ≥ 0. The bound of Theorem 10.2 is usually weaker than the correct value by a
small multiplicative factor such as n^{1/2}.
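
To see the size of the loss in a concrete case, consider f(z) = exp(z), for which
[z^n]f(z) = 1/n! and max |f(z)| on |z| = r equals exp(r). The sketch below (our own
illustration; the name bound_ratio is hypothetical) takes the radius r = n, which
minimizes r^{-n} exp(r), and compares the bound (10.8) with the exact coefficient. By
Stirling's formula the ratio is close to (2πn)^{1/2}.

```python
import math


def bound_ratio(n):
    """Ratio of the bound (10.8) for exp(z), with r = n, to the exact value 1/n!."""
    # r^{-n} * exp(r) at r = n, written as exp(n - n log n) to stay in floating range.
    bound = math.exp(n - n * math.log(n))
    exact = 1.0 / math.factorial(n)
    return bound / exact


for n in (5, 20, 80):
    print(n, bound_ratio(n), math.sqrt(2.0 * math.pi * n))
```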

If f(z) is analytic in |z| < R, then for any δ > 0, f(z) is bounded in |z| < R − δ,
and so Theorem 10.2 shows that a_n = [z^n]f(z) satisfies |a_n| = O((R − δ)^{-n}). On the
other hand, if |a_n| = O(S^{-n}), then the power series (10.5) converges for |z| < S and
defines an analytic function in that disk. Thus we obtain the easy result that if f(z)
is analytic in a disk |z| < R but in no larger disk, then

\[ \limsup_{n \to \infty} |a_n|^{1/n} = R^{-1} . \tag{10.9} \]


Example 10.3 (Oscillating sequence). Consider the sequence, discussed already in
Example 9.5, given by

\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \frac{(-1)^k}{k!} , \qquad n = 0, 1, \ldots . \tag{10.10} \]

The maximal term in the sum (10.10) is of order roughly exp(c n^{1/2}), so a_n cannot be
much larger. However, the sum (10.10) does not show that a_n cannot be extremely
small. Could we have |a_n| < exp(−n) for all n, say? That this is impossible is
obvious from (9.39), though, by the argument above. The generating function
f(z), given by eq. (9.39), is analytic in |z| < 1, but has an essential singularity at
z = 1, so we immediately see that for any ε > 0, |a_n| < (1 + ε)^n for all sufficiently
large n, and that |a_n| > (1 − ε)^n for infinitely many n. (More powerful methods for
dealing with analytic generating functions, such as the saddle point method to be
discussed in section 12, can be used to obtain the asymptotic relation for a_n given
in Example 9.5.)


There is substantial literature dealing with the growth rate of coefficients of an- 
alytic functions. The book of Evgrafov (1961) is a good reference for these results. 
However, the estimates presented there are not too useful for us, since they apply 
to wide classes of often pathological functions. In combinatorial enumeration we 
usually encounter much tamer generating functions for which the crude bounds of
Evgrafov (1961) are obvious or easy to derive. Instead, we need to use the tractable 
nature of the functions we encounter to obtain much more delicate estimates. 

The basic result, derived earlier, is that the power-series coefficients a_n of a
generating function f(z), defined by (10.5), grow in absolute value roughly like
R^{-n}, if f(z) is analytic in |z| < R. A basic result about analytic functions says
that if the Taylor series (10.5) of f(z) converges for |z| < R but for every ε > 0
there is a z with |z| = R + ε such that the series (10.5) diverges at z, then f(z)
has a singularity z with |z| = R. Thus the exponential growth rate of the a_n is
determined by the distance from the origin of the nearest singularity of f(z),
with close singularities giving large coefficients. Sometimes it is not obvious what
R is. When the coefficients of f(z) are positive, as is common in combinatorial
enumeration and analysis of algorithms, there is a useful theorem of Pringsheim
(Titchmarsh 1939):


Theorem 10.4. Suppose that f(z) is defined by eq. (10.5) with a_n ≥ 0 for all n ≥ n_0,
and that the series (10.5) for f(z) converges for |z| < R but not for any |z| > R.
Then z = R is a singularity of f(z).


As we remarked above, the exponential growth rate of the a_n is determined by
the distance from the origin of the nearest singularity. Theorem 10.4 says that if
the coefficients a_n are nonnegative, it suffices to look along the positive real axis
to determine the radius of convergence R, which is also the desired distance to the
singularity. There can be other singularities at the same distance from the origin
(for example, f(z) = (1 − z^2)^{-1} has singularities at z = ±1), but Theorem 10.4
guarantees that none are closer to 0 than the positive real one.

Since the singularities of smallest absolute value of a generating function exert 
the dominant influence on the asymptotics of the corresponding sequence, they 
are called the dominant singularities. In the most common case there is just one 
dominant singularity, and it is almost always real. However, we will sometimes 
speak of a large set of singularities (such as the k first-order poles in Theorem 9.2,
which are at different distances from the origin) as dominant ones. This allows 
some dominant singularities to be more influential than others. 

Many techniques, including the elementary methods of section 8, obtain bounds
for summatory functions of coefficients even when they cannot estimate the
individual coefficients. These methods succeed largely because they create a dominant
singularity. If f(z) = Σ f_n z^n converges for |z| < 1, diverges for |z| > 1, and has
f_n ≥ 0, then the singularity at z = 1 is at least as large as any other. However,
there could be other singularities on |z| = 1 that are just as large. [This holds for
the functions f_2(z) and f_3(z) defined by (8.2) and (8.4).] When we consider the
generating function of Σ_{k≤n} f_k, though, we find that

\[ h(z) = \sum_{n=0}^{\infty} \Big( \sum_{k=0}^{n} f_k \Big) z^n = (1 - z)^{-1} f(z) , \tag{10.11} \]

so that h(z) has a singularity at z = 1 that is much larger than any other one. That
often provides enough of an extra boost to push through the necessary technical
details of the estimates.

Most generating functions f(z) have their coefficients a_n = [z^n]f(z) real. If f(z)
is analytic at 0, and has real coefficients, then f(z) satisfies the reflection principle,

\[ f(\bar{z}) = \overline{f(z)} . \tag{10.12} \]


This implies that zeros and singularities of f(z) come in complex-conjugate pairs. 

The success of analytic methods in asymptotics comes largely from the use of
Cauchy’s formula (10.6) to estimate accurately the coefficients a_n. At a more basic
level, this success comes because the behavior of an analytic function f(z) reflects
precisely the behavior of the coefficients a_n. In the discussion of elementary
methods in section 8, we pointed out that the behavior of a generating function for
real arguments does not distinguish between functions with different coefficients.
For example, the functions f_1(z) and f_3(z) defined by (8.1) and (8.4) are almost
indistinguishable for z ∈ R. However, they differ substantially in their behavior for
complex z. The function f_1(z) has only a first-order pole at z = 1 and no other
singularities, while f_3(z) has poles at z = 1, exp(2πi/3), and exp(4πi/3). The three
poles at the three cubic roots of unity reflect the modulo 3 periodicity of the
coefficients of f_3(z). This is a general phenomenon, and in the next section we
sketch the general principle that underlies it. (The degree to which coefficients of
an analytic function determine the behavior at the singularities is the subject of
Abelian theorems. We will not need to delve into this subject to its full depth. For
references, see Hardy (1949) and Titchmarsh (1939).)

Analytic methods are extremely powerful, and when they apply, they often yield 
estimates of unparalleled precision. However, there are tricky situations where 
analytic methods seem as if they ought to apply, but do not (at least not easily), 
whereas simpler approaches work. 


Example 10.5 (Set partitions with distinct block sizes). Let a_n be the number of
partitions of a set of n elements into blocks of distinct sizes. Then a_n = b_n · n!,
where b_n = [z^n]f(z), with

\[ f(z) = \prod_{k=1}^{\infty} \left( 1 + \frac{z^k}{k!} \right) . \tag{10.13} \]

The function f(z) is entire and has nonnegative coefficients, so it might appear
as an ideal candidate for an application of some of the methods for dealing with
large singularities (such as the saddle point technique) that will be presented later.
However, on circles |z| = (n + 1/2)/e, n ∈ Z^+, f(z) does not vary much, so there
are technical problems in applying these analytic methods. On the other hand,
combinatorial estimates can be used to show (Knopfmacher et al. 1995) that the
b_n behave in a “regularly irregular” way, so that, for example,

\[ b_{m(m+1)/2 - 1} \sim b_{m(m+1)/2} \quad \text{as } m \to \infty , \tag{10.14} \]

\[ b_{m(m+1)/2} \sim m \, b_{m(m+1)/2 + 1} \quad \text{as } m \to \infty . \tag{10.15} \]

These estimates are obtained by expanding the product in eq. (10.13) and noting
that

\[ b_n = \sum_{\substack{1 \le k_1 < k_2 < \cdots < k_r \\ \sum k_i = n}} \; \prod_{i=1}^{r} \frac{1}{k_i!} . \tag{10.16} \]

Since factorials grow rapidly, the only terms in the sum in (10.16) that are
significant are those with small k_i. The term b_n z^n for n = m(m + 1)/2, for example,
comes almost entirely from the product of z^k/k!, 1 ≤ k ≤ m, all other products
contributing an asymptotically negligible amount.
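
The expansion of the product (10.13) is easy to carry out exactly for moderate n. The
following sketch (ours, for illustration; the function name and the cutoff 60 are
arbitrary choices) computes the b_n as exact fractions and prints the two ratios that
(10.14) and (10.15) assert tend to 1. The convergence is slow, with relative error of
order 1/m, so the printed values are only roughly equal to 1.

```python
from fractions import Fraction
from math import factorial


def distinct_block_coeffs(n_max):
    """Return [b_0, ..., b_{n_max}] with b_n = [z^n] prod_{k>=1} (1 + z^k/k!)."""
    b = [Fraction(0)] * (n_max + 1)
    b[0] = Fraction(1)
    for k in range(1, n_max + 1):
        inv = Fraction(1, factorial(k))
        # Multiply by (1 + z^k/k!); descending n uses each factor at most once.
        for n in range(n_max, k - 1, -1):
            b[n] += b[n - k] * inv
    return b


b = distinct_block_coeffs(60)
a = [int(b[n] * factorial(n)) for n in range(8)]  # a_n = n! b_n
print(a)

for m in (4, 7, 10):
    t = m * (m + 1) // 2
    print(m, float(b[t - 1] / b[t]), float(b[t] / (m * b[t + 1])))
```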


10.2. Subtraction of singularities 


An important basic tool in asymptotics of coefficients of analytic functions is that of
subtraction of singularities. If we wish to estimate [z^n]f(z), and we know [z^n]g(z),
and the singularities of f(z) − g(z) are smaller than those of f(z), then we can
usually conclude that [z^n]f(z) ~ [z^n]g(z) as n → ∞. In practice, given a function
f(z), we find the dominant singularities of f(z) (usually poles), and construct a
simple function g(z) with those singularities. We illustrate this approach with several
examples. The basic theme will recur in other sections.


Example 10.6 (Bernoulli numbers). The Euler-Maclaurin summation formula,
introduced in section 5.3, involves the Bernoulli numbers B_n with exponential
generating function

\[ f(z) = \sum_{n=0}^{\infty} B_n \frac{z^n}{n!} = \frac{z}{e^z - 1} . \tag{10.17} \]

The denominator exp(z) − 1 has zeros at 0, ±2πi, ±4πi, .... The zero at 0 is
canceled by the zero of z, so f(z) is analytic for |z| < 2π, but has first-order poles
at z = ±2πi, ±4πi, .... Consider

\[ g(z) = 2\pi i \left( \frac{1}{z - 2\pi i} - \frac{1}{z + 2\pi i} \right) . \tag{10.18} \]

Then f(z) − g(z) is analytic for |z| < 4π, so

\[ |[z^n](f(z) - g(z))| = O((4\pi - \varepsilon)^{-n}) \quad \text{as } n \to \infty \tag{10.19} \]

for every ε > 0. On the other hand,

\[ [z^n] g(z) = -(2\pi i)^{-n} - (-2\pi i)^{-n} = \begin{cases} 0 , & n \text{ odd} , \\ -2 (2\pi i)^{-n} , & n \text{ even} . \end{cases} \tag{10.20} \]

This gives the leading term asymptotics of B_n. By taking more complicated g(z), we
can subtract more of the singularities of f(z) and obtain more accurate expansions
for B_n. It is even possible to obtain an exponentially rapidly convergent series for
B_n.
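
The quality of the leading term in Example 10.6 can be checked directly. The sketch
below is our own illustration: it computes exact Bernoulli numbers from the standard
recurrence (with the convention B_1 = −1/2 that matches z/(e^z − 1)) and compares
B_n/n! with the coefficient of z^n in the pole part 2πi(1/(z − 2πi) − 1/(z + 2πi)). The
helper names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli_numbers(n_max):
    """B_0..B_{n_max} from sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        s = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B


def pole_part_coefficient(n):
    """[z^n] of 2 pi i (1/(z - 2 pi i) - 1/(z + 2 pi i)), a real number."""
    value = -((2j * pi) ** (-n)) - ((-2j * pi) ** (-n))
    return value.real


B = bernoulli_numbers(20)
for n in (2, 4, 10, 20):
    exact = float(B[n] / factorial(n))
    print(n, exact, pole_part_coefficient(n), exact / pole_part_coefficient(n))
```

The printed ratio approaches 1 geometrically fast, in line with the O((4π − ε)^{-n})
error term.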


Example 10.7 (Rational function asymptotics). As another example of the
subtraction-of-singularities principle, we sketch a proof of Theorem 9.2. Suppose that the
hypotheses of that theorem are satisfied. Let

\[ u(z) = \sum_{j=1}^{k} \frac{g(\rho_j)}{h'(\rho_j) (z - \rho_j)} . \tag{10.21} \]

Then f(z) − u(z) has no singularities in |z| < R, and for |z| = R,

\[ |f(z) - u(z)| \le |f(z)| + |u(z)| \le W + \delta^{-1} \sum_{j=1}^{k} |g(\rho_j)/h'(\rho_j)| . \tag{10.22} \]

Hence, by Theorem 10.2,

\[ |[z^n](f(z) - u(z))| \le W R^{-n} + \delta^{-1} R^{-n} \sum_{j=1}^{k} |g(\rho_j)/h'(\rho_j)| . \tag{10.23} \]

On the other hand,

\[ [z^n] u(z) = - \sum_{j=1}^{k} \rho_j^{-n-1} g(\rho_j)/h'(\rho_j) . \tag{10.24} \]

The last two estimates yield Theorem 9.2.


The reader may have noticed that the proof of Theorem 9.2 presented above
does not depend on f(z) being rational. We have proved the following more
general result.


Theorem 10.8. Suppose that f(z) is meromorphic in an open set containing |z| ≤ R,
that it is analytic at z = 0 and on |z| = R, and that the only poles of f(z) in |z| < R
are at ρ_1, ..., ρ_k, each of multiplicity 1. Suppose further that

\[ \max_{|z|=R} |f(z)| \le W \tag{10.25} \]

and that R − |ρ_j| ≥ δ for some δ > 0 and 1 ≤ j ≤ k. Then

\[ \Big| [z^n] f(z) + \sum_{j=1}^{k} r_j \rho_j^{-n-1} \Big| \le W R^{-n} + \delta^{-1} R^{-n} \sum_{j=1}^{k} |r_j| , \tag{10.26} \]

where r_j is the residue of f(z) at ρ_j.
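
As a small check of Theorem 10.8 (an illustration of ours, with an example function
that does not appear in the text), take f(z) = 1/(1 − z − z^2), whose coefficients are
Fibonacci numbers. With R = 1.5 the only pole in |z| < R is ρ = (5^{1/2} − 1)/2, and
the residue there is 1/h'(ρ) for h(z) = 1 − z − z^2. The constant W is approximated
by sampling |f| on the circle, which is an assumption of the sketch rather than a
rigorous maximum.

```python
import cmath
import math

H = [1.0, -1.0, -1.0]  # h(z) = 1 - z - z^2, f(z) = 1/h(z)


def h(z):
    return H[0] + H[1] * z + H[2] * z * z


def series_of_reciprocal(n_max):
    """Power-series coefficients of 1/h(z) up to z^{n_max}."""
    c = [1.0 / H[0]]
    for n in range(1, n_max + 1):
        s = sum(H[j] * c[n - j] for j in range(1, min(n, 2) + 1))
        c.append(-s / H[0])
    return c


R = 1.5
rho = (math.sqrt(5.0) - 1.0) / 2.0
residue = 1.0 / (H[1] + 2.0 * H[2] * rho)  # 1 / h'(rho)
delta = R - rho
# Sampled approximation to max |f(z)| on |z| = R.
W = max(1.0 / abs(h(R * cmath.exp(2j * math.pi * m / 2048))) for m in range(2048))

coeffs = series_of_reciprocal(20)


def error(n):
    """Left-hand side of (10.26) for the single pole rho."""
    return abs(coeffs[n] + residue * rho ** (-n - 1))


def bound(n):
    """Right-hand side of (10.26)."""
    return (W + abs(residue) / delta) * R ** (-n)


for n in (0, 5, 10, 20):
    print(n, coeffs[n], error(n), bound(n))
```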


In the examples above, the dominant singularities were separated from other 
ones, so their contributions were larger than those of lower-order terms by an 
exponential factor. Sometimes the singularity that remains after subtraction of the 
dominant one is on the same circle, and only slightly smaller. Section 11 presents 
methods that deal with some cases of this type, at least when the singularity is not 
large. What makes those methods work is the subtraction-of-singularities principle. 
Next we illustrate another application of this principle where the singularity is 
large. (The generating function is entire, and so the singularity is at infinity.) 


Example 10.9 (Permutations without long increasing subsequences). Let u_k(n) be
the number of permutations of {1, 2, ..., n} that have no increasing subsequence
of length > k. Logan and Shepp (1977) and Vershik and Kerov (1977)
established by calculus of variations and combinatorics that the average value of the
longest increasing subsequence in a random permutation is asymptotic to 2n^{1/2}.
Frieze (1991) has proved recently, using probabilistic methods, a stronger result,
namely that almost all permutations have longest increasing subsequences of length
close to 2n^{1/2}. Here we consider asymptotics of u_k(n) for k fixed and n → ∞. The
Schensted correspondence and the hook formula express u_k(n) in terms of Young
diagrams with ≤ k columns. For k fixed, there are few diagrams and their
influence can be estimated explicitly using Stirling’s formula, although Selberg-type
integrals are involved and the analysis is complicated. This analysis was done by
Regev (1981), who proved more general results. Here we sketch another approach
to the asymptotics of u_k(n) for k fixed. It is based on a result of Gessel (1990). If


\[ U_k(z) = \sum_{n=0}^{\infty} \frac{u_k(n) z^{2n}}{(n!)^2} , \tag{10.27} \]

then

\[ U_k(z) = \det \big( I_{|i-j|}(2z) \big)_{1 \le i,j \le k} , \tag{10.28} \]

where the I_m(x) are Bessel functions (chapter 9 of NBS 1970). H. Wilf and the
author have noted that one can obtain the asymptotics of the u_k(n) by using known
asymptotic results about the I_m(x). Equation (9.7.1) of NBS 1970 states that for
every H ∈ Z^+,

\[ I_m(z) = (2\pi z)^{-1/2} e^z \Big( \sum_{h=0}^{H-1} c(m,h) z^{-h} + O(|z|^{-H}) \Big) , \tag{10.29} \]

where this expansion is valid for |z| → ∞ with |Arg(z)| ≤ 3π/8, say. The c(m, h)
are explicit constants with c(m, 0) = 1. Let us consider k = 4 to be concrete. Then,
taking H = 7 in (10.29) (higher values of H are needed for larger k) we find from
(10.28) that

\[ U_4(z) = e^{8z} \big( 3 (256 \pi^2 z^8)^{-1} + O(|z|^{-9}) \big) \quad \text{for } |z| > 1 . \tag{10.30} \]

It is also known that I_m(−z) = (−1)^m I_m(z) and I_m(z) is relatively small in the
angular region |π/2 − Arg(z)| < π/8. Therefore U_4(−z) = U_4(z), and one can show
that

\[ |U_4(z)| = O\big( |z|^{-1} U_4(|z|) \big) \tag{10.31} \]

for z away from the real axis.

To apply the subtraction-of-singularities principle, we need an entire function
f(z) that is even, is large only near the real axis, and such that for x ∈ R, x → ∞,

\[ f(x) \sim 3 (256 \pi^2 x^8)^{-1} \exp(8x) . \tag{10.32} \]

The function

\[ 3 (128 \pi^2 z^8)^{-1} \cosh(8z) \]

is even and has the desired asymptotic growth, but is not entire. We correct this
defect by subtracting the contribution of the pole at z = 0, and let

\[ f(z) = 3 (128 \pi^2 z^8)^{-1} \big( \cosh(8z) - 1 - 32 z^2 - 512 z^4/3 - 16384 z^6/45 - 131072 z^8/315 \big) . \tag{10.33} \]

(It is not necessary to know explicitly the first eight terms in the Taylor expansion
of cosh(8z) that we wrote down above, as they do not affect the final answer.)
With this definition

\[ |U_4(z) - f(z)| = O\big( |z|^{-1} f(|z|) \big) \tag{10.34} \]

uniformly for all z with |z| > 1, say, and so if we apply Cauchy’s theorem on the
circle |z| = n/4, say, we find that

\[ [z^{2n}] (U_4(z) - f(z)) = O\big( n^{-2n-9} e^{2n} 16^n \big) . \tag{10.35} \]

(The choice of |z| = n/4 is made to minimize the resulting estimate.) On the other
hand, by Stirling’s formula,

\[ [z^{2n}] f(z) = 3 (128 \pi^2)^{-1} \cdot \big( [z^{2n+8}] \cosh(8z) \big) = 3 (128 \pi^2)^{-1} \, 8^{2n+8} / (2n+8)! \sim 768 \, \pi^{-5/2} n^{-2n-17/2} e^{2n} 16^n \quad \text{as } n \to \infty . \tag{10.36} \]

Comparing (10.35) and (10.36), we see that

\[ u_4(n) = (n!)^2 [z^{2n}] U_4(z) \sim (n!)^2 \, 768 \, \pi^{-5/2} n^{-2n-17/2} e^{2n} 16^n \sim 1536 \, \pi^{-3/2} n^{-15/2} 16^n \quad \text{as } n \to \infty . \tag{10.37} \]
Other methods can be applied to Gessel’s generating function to obtain
asymptotics of u_k(n) for wider ranges of k (Odlyzko et al. 1995).
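
For small n the numbers u_k(n) of Example 10.9 can be computed exactly from the
hook formula mentioned there. The sketch below is our own illustration with
hypothetical helper names. It sums the squared numbers of standard Young tableaux
over diagrams with at most k rows (equivalent, by transposition, to at most k
columns), and compares the result with a brute-force count over permutations.

```python
from itertools import permutations
from math import factorial


def partitions(n, max_part, max_len):
    """Yield partitions of n with parts <= max_part and at most max_len parts."""
    if n == 0:
        yield ()
        return
    if max_len == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first, max_len - 1):
            yield (first,) + rest


def num_syt(shape):
    """Number of standard Young tableaux of a shape, by the hook formula."""
    n = sum(shape)
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])] if shape else []
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            # hook length = arm + leg + 1
            hooks *= (row - j) + (cols[j] - i) - 1
    return factorial(n) // hooks


def u(k, n):
    """Permutations of n letters with no increasing subsequence longer than k."""
    return sum(num_syt(s) ** 2 for s in partitions(n, n, k))


def longest_increasing(p):
    best = []
    for i, x in enumerate(p):
        best.append(1 + max((best[j] for j in range(i) if p[j] < x), default=0))
    return max(best, default=0)


def brute(k, n):
    return sum(1 for p in permutations(range(n)) if longest_increasing(p) <= k)


print([u(4, n) for n in range(1, 9)])
```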

The above example obtains a good estimate because the remainder term in
(10.30) is smaller than the main term by a factor of |z|^{-1}. Had it been smaller
only by a factor of |z|^{-1/2}, the resulting estimate would have been worthless, and it
would have been necessary to obtain a fuller asymptotic expansion of U_4(z) or else
use smoothness properties of the remainder term. This is due to the phenomenon,
mentioned before, that crude absolute-value estimates in either Cauchy’s theorem
or the elementary approaches of section 8 usually lose a factor of n^{1/2} when
estimating the nth coefficient.

The subtraction-of-singularities principle can be applied even when the gener- 
ating functions seem to be more complicated than those of Example 10.9. If we 
consider the problem of that example, but with k = 5, then we find that 


Us(z) = 3exp(10z)(5 - 2! - 5/2z25/?) “1 + O(jz|-')) (10.38) 
as |z| → ∞, with |Arg(z)| ≤ 3π/8, U_5(−z) = U_5(z), and U_5(z) is entire. We now
need an entire function with known coefficients that grows as exp(10z) z^{-25/2}. This
is not difficult to obtain, as

\[ I_0(10z) z^{-12} - \sum_{j=1}^{12} c_j z^{-j} \tag{10.39} \]

for suitable coefficients c_j has the desired properties.


10.3. The residue theorem and sums as integrals 


Sometimes sums that are not easily handled by other methods can be converted
to integrals that can be evaluated explicitly or estimated by the residue theorem.
If t(z) is a meromorphic function that has first-order poles at z = a, a + 1, ..., b,
with a ∈ Z, each with residue 1, then

\[ \sum_{n=a}^{b} f(n) = \frac{1}{2\pi i} \int_{\Gamma} f(z) t(z) \, dz , \tag{10.40} \]

where Γ is a simple closed contour enclosing a, a + 1, ..., b, provided f(z) is analytic
inside Γ and t(z) has no singularities inside Γ aside from the first-order poles at
a, a + 1, ..., b. If t(z) is chosen to have residue (−1)^n at z = n, then we obtain

\[ \sum_{n=a}^{b} (-1)^n f(n) = \frac{1}{2\pi i} \int_{\Gamma} f(z) t(z) \, dz . \tag{10.41} \]

A useful example is given by the formula

\[ \sum_{k=0}^{n} \binom{n}{k} (-1)^k f(k) = \frac{(-1)^n n!}{2\pi i} \int_{\Gamma} \frac{f(z) \, dz}{z (z-1) \cdots (z-n)} . \tag{10.42} \]
The advantage of (10.40) and (10.41) is that the integrals can often be
manipulated to give good estimates. This is especially valuable for alternating sums such
as (10.41). An analytic function f(z) is extremely regular, so a sum such as that in
(10.40) can often be estimated by methods such as the Euler-Maclaurin summation
formula (section 5.3). However, that formula cannot always be applied to
alternating sums such as that of (10.41), because the sign change destroys the regularity of
f(n). (However, as is noted in section 5.3, there are generalizations of the
Euler-Maclaurin formula that are sometimes useful.) It is hard to write down general
rules for applying this method, as most situations require appropriate choice of
t(z) and careful handling of the integral. For a detailed discussion of this method,
often referred to as Rice’s method, see section 4.9 of Henrici (1974-86). A pair of
popular functions to use as t(z) are

\[ t_1(z) = \pi / (\sin \pi z) , \qquad t_2(z) = \pi / (\tan \pi z) . \tag{10.43} \]


One can show (Theorem 4.9a of Henrici 1974-86) that if r(z) = p(z)/q(z) with
p(z) and q(z) polynomials such that deg q(z) ≥ deg p(z) + 2, and q(n) ≠ 0 for any
n ∈ Z, then

\[ \sum_{n=-\infty}^{\infty} r(n) = - \sum \mathrm{Res}(r(z) t_2(z)) , \tag{10.44} \]

\[ \sum_{n=-\infty}^{\infty} (-1)^n r(n) = - \sum \mathrm{Res}(r(z) t_1(z)) , \tag{10.45} \]

where the sums on the right-hand sides above are over the zeros of q(z). 
Examples of applications of these methods to asymptotics of data structures are 
given in Flajolet and Sedgewick (1986) and Szpankowski (1988). 
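
A quick numerical check of the two residue formulas of this section, with t_1(z) =
π/sin(πz) and t_2(z) = π/tan(πz), can be made with r(z) = 1/(z^2 + 1), whose
denominator has the simple zeros ±i. This example is our own; the closed forms
π coth π and π/sinh π that result are classical. At a simple zero z_0 of z^2 + 1
the residue of r(z)t(z) is t(z_0)/(2 z_0), which is what the sketch evaluates.

```python
import cmath
import math


def t1(z):
    return math.pi / cmath.sin(math.pi * z)


def t2(z):
    return math.pi / cmath.tan(math.pi * z)


POLES = [1j, -1j]  # zeros of q(z) = z^2 + 1


def residue_sum(t):
    """Sum over the zeros z0 of q of Res(r(z) t(z)), with r(z) = 1/(z^2 + 1)."""
    return sum(t(z0) / (2 * z0) for z0 in POLES)


plain = (-residue_sum(t2)).real        # sum over all integers n of 1/(n^2 + 1)
alternating = (-residue_sum(t1)).real  # sum over all integers n of (-1)^n/(n^2 + 1)


def partial_sum(N, sign):
    """Direct symmetric partial sum, sign = 1 or -1."""
    return sum((sign ** abs(n)) / (n * n + 1) for n in range(-N, N + 1))


print(plain, partial_sum(20000, 1))
print(alternating, partial_sum(20000, -1))
```

The plain sum converges slowly (the tail is about 2/N), which is one reason the
residue evaluation is preferable in practice.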


10.4. Location of singularities, Rouché’s theorem, and unimodality 


A recurrent but only implicit theme throughout the discussion in this section is 
that of isolation of zeros. For example, to apply Theorem 9.2 we need to know that 
the polynomial h(z) has only k zeros, each of multiplicity one, in |z| < R. Proofs
of such results can often be obtained with the help of Rouché’s theorem (Henrici 
1974-86, Titchmarsh 1939). 


Theorem 10.10. Suppose that f_1(z) and f_2(z) are functions that are analytic inside
and on the boundary of a simple closed contour Γ. If

\[ |f_2(z)| < |f_1(z)| \quad \text{for all } z \in \Gamma , \tag{10.46} \]

then f_1(z) and f_1(z) + f_2(z) have the same number of zeros (counted with
multiplicity) inside Γ.


Example 10.11 (Sequences with forbidden subblocks). We consider again the topic
of Examples 6.4, 6.8, and 9.3, and prove the results that have already been used
in Example 9.3. We again set

\[ h(z) = z^k + (1 - 2z) C_A(z) , \tag{10.47} \]

where the only fact about C_A(z) we will use is that it is a polynomial of degree
< k with coefficients 0 and 1, and C_A(0) = 1. We wish to show that h(z) has only
one zero in |z| ≤ 0.6 if k is large. Write

\[ C_A(z) = 1 + \frac{1}{2} \sum_{j=1}^{\infty} z^j + \frac{1}{2} \sum_{j=1}^{\infty} \varepsilon_j z^j , \tag{10.48} \]

where ε_j = ±1 for each j. Then

\[ C_A(z) = \frac{2 - z}{2 (1 - z)} + u(z) , \tag{10.49} \]

where

\[ u(z) = \frac{1}{2} \sum_{j=1}^{\infty} \varepsilon_j z^j . \]

For |z| = r < 1, we have |u(z)| ≤ r/(2(1 − r)). On the other hand, z → (2 − z)/(1 − z)
maps circles to circles, since it is a fractional linear transformation, so it takes
the circle |z| = r to the circle with center on the real axis that goes through the
two points (2 − r)/(1 − r) and (2 + r)/(1 + r). Therefore for |z| = r < 1,

\[ |C_A(z)| \ge \frac{2 + r}{2 (1 + r)} - \frac{r}{2 (1 - r)} = \frac{1 - r - r^2}{1 - r^2} , \tag{10.50} \]

and so |C_A(z)| ≥ 1/16 for |z| = r ≤ 0.6. Hence, if k ≥ 9, then on |z| = 0.6,

\[ |(1 - 2z) C_A(z)| \ge 1/80 > (0.6)^k , \tag{10.51} \]

and thus (1 − 2z)C_A(z) and h(z) have the same number of zeros in |z| ≤ 0.6. On
the other hand, C_A(z) has no zeros in |z| ≤ 0.6 by (10.50), while 1 − 2z has one, so
we obtain the desired result, at least for k ≥ 9. (A more careful analysis shows that
h(z) has only one root inside |z| = 0.6 even for 4 ≤ k < 9. For 1 ≤ k ≤ 3, there are
cases where there is no zero inside |z| ≤ 0.6.) Example 6.7 shows how to obtain
precise estimates of the single zero.

We note that (10.50) shows that for |z| = 0.55, k ≥ 9,

\[ |h(z)| \ge |1 - 1.1| \cdot 0.2 - (0.55)^k \ge 0.02 - 0.01 \ge 1/100 , \tag{10.52} \]

a result that was used in Example 9.3.
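
The conclusion of Example 10.11 can be checked numerically for particular choices of
C_A(z) by counting zeros with the argument principle: the number of zeros of h inside
|z| = 0.6 equals the winding number of h(z) around 0 as z traverses the circle. The
sketch below is our own; the three coefficient lists are hypothetical test cases that
satisfy the stated hypotheses (0/1 coefficients, degree < k, constant term 1) with
k = 10, and the fixed number of sample points is an assumption that is adequate here
because h stays well away from 0 on the circle.

```python
import cmath
import math


def count_zeros_in_disk(h, radius, samples=4096):
    """Count zeros of h in |z| < radius via the winding number of h(z) around 0."""
    total = 0.0
    prev = h(complex(radius, 0.0))
    for m in range(1, samples + 1):
        cur = h(radius * cmath.exp(2j * math.pi * m / samples))
        # Accumulate the small change of argument between consecutive samples.
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2.0 * math.pi))


def make_h(k, corr):
    """h(z) = z^k + (1 - 2z) C(z), with C given by its list of 0/1 coefficients."""
    def h(z):
        c = sum(bit * z**j for j, bit in enumerate(corr))
        return z**k + (1 - 2 * z) * c
    return h


k = 10
cases = [
    [1] + [0] * 9,                   # C(z) = 1
    [1] * 10,                        # C(z) = 1 + z + ... + z^9
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],  # C(z) = 1 + z^2 + ... + z^8
]
counts = [count_zeros_in_disk(make_h(k, corr), 0.6) for corr in cases]
print(counts)
```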


Example 10.12 (Coins in a fountain). An (n, k) fountain is an arrangement of n
coins in rows such that there are k coins in the bottom row, and such that each
coin in a higher row touches exactly two coins in the next lower row. Let a_{n,k} be
the number of (n, k) fountains, and a_n = Σ_k a_{n,k} the total number of fountains of
n coins. The values of a_n for 1 ≤ n ≤ 6 are 1, 1, 2, 3, 5, 9. If we let a_0 = 1 then it
can be shown (Odlyzko and Wilf 1988) that

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n = \cfrac{1}{1 - \cfrac{z}{1 - \cfrac{z^2}{1 - \cfrac{z^3}{1 - \cdots}}}} . \tag{10.53} \]

This is a famous continued fraction of Ramanujan. (Other combinatorial
interpretations of this continued fraction are also known, see the references in Odlyzko
and Wilf (1988). For related results, see Privman and Svrakic (1988, 1989).)
Although one can derive the asymptotics of the a_n from the expansion (10.53), it is
more convenient to work with another expansion, known from previous studies of
Ramanujan’s continued fraction:

\[ f(z) = \frac{p(z)}{q(z)} , \tag{10.54} \]

where

\[ p(z) = \sum_{k=0}^{\infty} (-1)^k \frac{z^{k(k+1)}}{(1 - z)(1 - z^2) \cdots (1 - z^k)} , \tag{10.55} \]

\[ q(z) = \sum_{k=0}^{\infty} (-1)^k \frac{z^{k^2}}{(1 - z)(1 - z^2) \cdots (1 - z^k)} . \tag{10.56} \]

Clearly both p(z) and q(z) are analytic in |z| < 1, so f(z) is meromorphic there.
We will show that q(z) has a simple real zero x_0, 0.57 < x_0 < 0.58, and no other
zeros in |z| < 0.62, while p(x_0) > 0. It will then follow from Theorem 10.8 that

\[ a_n = c \, x_0^{-n} + O((5/3)^n) \quad \text{as } n \to \infty , \tag{10.57} \]

where c = −p(x_0)/(x_0 q'(x_0)). Numerical computation shows that c = 0.31236...,
x_0 = 0.576148769... .


To establish the claim about x_0, let p_n(z) and q_n(z) denote the nth partial sums of
the series (10.55) and (10.56), respectively. Write a(z) = q_3(z)(1 − z)(1 − z^2)(1 − z^3),
so that

\[ a(z) = 1 - 2z - z^2 + z^3 + 3z^4 + z^5 - 2z^6 - z^7 - z^9 , \tag{10.58} \]

and consider

\[ b(z) = - \prod_{j=1}^{9} (z - z_j) , \]

where the z_j are 0.57577, −0.46997 ± i0.81792, 0.74833 ± i0.07523, −1.05926 ±
i0.36718, 0.49301 ± i1.58185, in that order. (The z_j are approximations to the zeros
of a(z), obtained from numerical library subroutines. How they were derived is not
important for the verification of our proof.) An easy hand or machine computation
shows that if a(z) = Σ_k a_k z^k, b(z) = Σ_k b_k z^k, then


SY lag — be] < 1.7 107%, 


and so fa(z) — b(z)} < 1.7 x 10~ for ali |z] < 1. Another computation shows that 
|b(z)| 2 8 x 107 for all |z] = 0.62. 
On the other hand, for 0 ≤ u ≤ 0.62 and |z| = u, we have for k ≥ 5

\[ \left| \frac{z^{2k-1}}{1 - z^k} \right| \le \frac{u^{2k-1}}{1 - u^k} \le \frac{u^9}{1 - u^5} . \tag{10.59} \]

Therefore

\[ \left| \sum_{k=4}^{\infty} \frac{(-1)^k z^{k^2}}{\prod_{j=4}^{k} (1 - z^j)} \right| \le \frac{u^{16}}{1 - u^4} \sum_{m=0}^{\infty} \left( \frac{u^9}{1 - u^5} \right)^m \le 6 \times 10^{-4} , \tag{10.60} \]

and so by Rouché’s theorem, q(z) and b(z) have the same number of zeros in
|z| ≤ 0.62, namely 1. Since q(z) has real coefficients, its zero is real. This establishes
the existence of x_0. An easy computation shows that q(0.57) > 0, q(0.58) < 0, so
0.57 < x_0 < 0.58.

To show that p(x_0) > 0, note that successive summands in (10.55) decrease in
absolute magnitude for each fixed real z > 0, and p(z) > 1 − z^2/(1 − z) > 0 for
0 < z < 0.6.
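
The quantities in Example 10.12 are easy to reproduce numerically. The sketch below
is our own illustration: it expands the continued fraction 1/(1 − z/(1 − z^2/(1 − ...)))
as a power series to obtain the a_n, evaluates the series for p(z) and q(z) (numerator
exponent k(k + 1) and k^2, respectively), locates x_0 by bisection on q, and forms
c = −p(x_0)/(x_0 q'(x_0)) with a central-difference derivative. The function names,
the number of series terms, and the step size are arbitrary choices of the sketch.

```python
def fountain_counts(n_max):
    """a_0..a_{n_max} from the continued fraction, by truncated power series."""
    # F_j = 1 / (1 - z^j F_{j+1}); replacing F_{n_max+1} by 1 does not change
    # the coefficients of z^0, ..., z^{n_max} of F_1.
    F = [1] + [0] * n_max
    for j in range(n_max, 0, -1):
        D = [1] + [0] * n_max          # D = 1 - z^j F, truncated
        for i in range(n_max + 1 - j):
            D[i + j] -= F[i]
        inv = [1] + [0] * n_max        # power-series inverse of D (D[0] = 1)
        for n in range(1, n_max + 1):
            inv[n] = -sum(D[i] * inv[n - i] for i in range(1, n + 1))
        F = inv
    return F


def series_value(z, extra, terms=30):
    """sum_k (-1)^k z^{k^2 + extra*k} / ((1-z)...(1-z^k)); extra=1 gives p, 0 gives q."""
    total, denom = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            denom *= 1.0 - z**k
        total += (-1) ** k * z ** (k * k + extra * k) / denom
    return total


def p(z):
    return series_value(z, 1)


def q(z):
    return series_value(z, 0)


lo, hi = 0.57, 0.58                    # q(lo) > 0 > q(hi)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if q(mid) > 0:
        lo = mid
    else:
        hi = mid
x0 = 0.5 * (lo + hi)

step = 1e-6
q_prime = (q(x0 + step) - q(x0 - step)) / (2 * step)
c = -p(x0) / (x0 * q_prime)

a = fountain_counts(30)
print(a[:10])
print(x0, c, a[30] * x0**30)
```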


The method used in the above example is widely applicable to generating func- 
tions given by continued fractions. Typically they are meromorphic in a disk cen- 
tered at the origin, with a single dominant pole of order 1. Usually there is no 
convenient representation of the form (10.54) with explicit p(z) and q(z), and one 
has to work harder to establish the necessary properties about location of poles. 

It was mentioned in section 6.4 that unimodality of a sequence is often deduced
from information about the zeros of the associated polynomial. If the zeros of the
polynomial

\[ A(z) = \sum_{k=0}^{n} a_k z^k \]

are real and ≤ 0, then the a_k are unimodal, and even the a_k \binom{n}{k}^{-1} are log-concave.
However, weaker properties follow from weaker assumptions on the zeros. If all
the zeros of A(z) are in the wedge-shaped region centered on the negative real axis
|Arg(−z)| ≤ π/4, and the a_k are real, then the a_k are log-concave, but the a_k \binom{n}{k}^{-1}
are not necessarily log-concave. (This follows by factoring A(z) into polynomials
with real coefficients that are either linear or quadratic, and noting that all have log- 
concave coefficients, so their product does too.) One can prove other results that 
allow zeros to lie in larger regions, but then it is necessary to impose restrictions 
on ratios of their distances from the origin. 


10.5. Implicit functions


Section 6.2 presented functions, such as f^{⟨−1⟩}(z), that are defined implicitly. In this
section we consider related problems that arise when a generating function f(z)
satisfies a functional equation f(z) = G(z, f(z)). Such equations arise frequently in
graphical enumeration, and there is a standard procedure invented by Pólya and
developed by Otter that is almost algorithmic (Harary and Palmer 1973, Harary
et al. 1975) and routinely leads to them. Typically G(z, w) is analytic in z and w
in a small neighborhood of (0, 0). Zeros of analytic functions in more than one
dimension are not isolated, and by the implicit function theorem G(z, w) = w is
solvable for w as a function of z, except for those points where

\[ G_w(z, w) = \frac{\partial}{\partial w} G(z, w) = 1 . \tag{10.61} \]

Usually for z in a small neighborhood of 0 the solution w of G(z, w) = w will not
satisfy (10.61), and so w will be analytic in that neighborhood. As we enlarge the
neighborhood under consideration, though, a simultaneous solution to G(z, w) = w
and (10.61) will eventually appear, and will usually be the dominant singularity of
f(z) = w(z). The following theorem covers many common enumeration problems.


Theorem 10.13. Suppose that

\[ f(z) = \sum_{n=1}^{\infty} f_n z^n \tag{10.62} \]

is analytic at z = 0, that f_n ≥ 0 for all n, and that f(z) = G(z, f(z)), where

\[ G(z, w) = \sum_{m,n \ge 0} g_{m,n} z^m w^n . \tag{10.63} \]

Suppose that there exist real numbers δ, r, s > 0 such that

(i) G(z, w) is analytic in |z| < r + δ and |w| < s + δ,

(ii) G(r, s) = s, G_w(r, s) = 1,

(iii) G_z(r, s) ≠ 0 and G_ww(r, s) ≠ 0.

Suppose that g_{m,n} ∈ R^+ ∪ {0} for all m and n, g_{0,0} = 0, g_{0,1} ≠ 1, and g_{m,n} > 0 for
some m and some n ≥ 2. Assume further that there exist h > j > i ≥ 1 such that
f_h f_i f_j ≠ 0 while the greatest common divisor of j − i and h − i is 1. Then f(z)
converges at z = r, f(r) = s, and

\[ f_n = [z^n] f(z) \sim \big( r G_z(r, s) / (2\pi G_{ww}(r, s)) \big)^{1/2} n^{-3/2} r^{-n} \quad \text{as } n \to \infty . \tag{10.64} \]


Example 10.14 (Rooted labeled trees). As was shown in Example 6.1, the
exponential generating function t(z) of rooted labeled trees satisfies t(z) = z exp(t(z)).
Thus we have G(z, w) = z exp(w), and Theorem 10.13 is easily seen to apply with
r = e^{-1}, s = 1. Therefore we obtain the asymptotic estimate

\[ t_n / n! = [z^n] t(z) \sim (2\pi)^{-1/2} n^{-3/2} e^n \quad \text{as } n \to \infty . \tag{10.65} \]

On the other hand, from Example 6.6 we know that t_n = n^{n-1}, a much more
satisfactory answer, so that the estimate (10.65) only provides us with another proof
of Stirling’s formula.
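
The estimate for rooted labeled trees can be compared with the exact value
n^{n-1}/n!. The sketch below is our own illustration: it evaluates the constant
(r G_z/(2π G_ww))^{1/2} of Theorem 10.13 for G(z, w) = z exp(w) at r = 1/e, s = 1,
and computes the ratio of the exact coefficient to the estimate using logarithms to
avoid overflow. The function name tree_ratio is hypothetical.

```python
import math

# Data of Theorem 10.13 for G(z, w) = z exp(w) at (r, s) = (1/e, 1).
r, s = math.exp(-1.0), 1.0
G_z = math.exp(s)         # dG/dz = exp(w)
G_ww = r * math.exp(s)    # d^2 G/dw^2 = z exp(w)
constant = math.sqrt(r * G_z / (2.0 * math.pi * G_ww))


def tree_ratio(n):
    """(n^{n-1}/n!) divided by constant * n^{-3/2} * r^{-n}."""
    log_exact = (n - 1) * math.log(n) - math.lgamma(n + 1)
    log_estimate = math.log(constant) - 1.5 * math.log(n) - n * math.log(r)
    return math.exp(log_exact - log_estimate)


for n in (10, 100, 1000):
    print(n, tree_ratio(n))
```

The ratios approach 1 with an error of about 1/(12n), consistent with a relative
error of order 1/n.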
The example above involves an extremely simple application of Theorem 10.13.
More complicated cases will be presented in section 15.1.

The statement of Theorem 10.13 is long, and the hypotheses stringent. All that
is really needed for the asymptotic relation (10.64) to hold is that f(z) should be
analytic on {z : |z| ≤ r, z ≠ r} and that

\[ f(z) = c (r - z)^{1/2} + o(|r - z|^{1/2}) \tag{10.66} \]

for |z − r| < ε, |Arg(r − z)| > π/2 − ε for some ε > 0. If these conditions are
satisfied, then (10.64) follows immediately from either the transfer theorems of
section 11.1 or (with stronger hypotheses) from Darboux’s method of section 11.2.
The purpose of Theorem 10.13 is to present a general theorem that guarantees 
(10.66) holds, is widely applicable, and is stated to the maximum extent possible 
in terms of conditions on the coefficients of f(z) and G(z, w). 

Theorem 10.13 is based on Theorem 5 of Bender (1974) and Theorem 1 
of Meir and Moon (1989). The hypotheses of Bender’s Theorem 5 are sim- 
pler than those of Theorem 10.13, but, as was pointed out by Canfield (1984), 
the proof is faulty and there are counterexamples to the claims of that the- 
orem. The difficulty is that Theorem 5 of Bender (1974) does not distinguish 
adequately between the different solutions w = w(z) of w = G(z,w), and the 
singularity of the combinatorially significant solution may not be the small- 
est among all singularities of all solutions. The result of Meir and Moon 
(1989) provides conditions that assure such pathological behavior does not oc- 
cur. [The statement of Theorem 10.13 incorporates some corrections to The- 
orem 1 of Meir and Moon (1989) provided by the authors of that paper.] It
would be desirable to prove results like (10.64) under a simpler set of condi- 
tions. 


In many problems the function G(z, w) is of the form

\[ G(z, w) = g(z) \phi(w) + h(z) , \tag{10.67} \]

where g(z), φ(w), and h(z) are analytic at 0. For this case Meir and Moon (1989)
have proved a useful result (their Theorem 2) that implies an asymptotic estimate
of the type (10.64). The hypotheses of that result are often easier to verify than
those of Theorem 10.13 above. [As was noted by Meir and Moon (1989), the last
part of their conditions (4.12a) has to be replaced by the condition that y_i > h_i,
y_j > h_j, and y_k > h_k for some k > j > i ≥ 1 with gcd(j − i, k − i) = 1.]

Whenever Theorem 10.13 applies, f_n = [z^n]f(z) equals the quantity on the
right-hand side of (10.64) to within a multiplicative factor of 1 + O(n^{-1}). One
can derive fuller expansions for the ratio when needed.


11. Small singularities of analytic functions


In most combinatorial enumeration applications, the generating function has a 
single dominant singularity. The methods used to extract asymptotic information
about coefficients split naturally into two main classes, depending on whether this
singularity is large or small. 

In some situations the same generating function can be said to have either a 
large or a small singularity, depending on the range of coefficients that we are 
interested in. This is illustrated by the following example. 


Example 11.1 (Partitions with bounded part sizes). Let p(n, m) be the number of
(unordered) partitions of an integer n into integers ≤ m. It is easy to see that

    P_m(z) = \sum_{n=0}^{\infty} p(n,m) z^n = \prod_{k=1}^{m} (1 - z^k)^{-1} .    (11.1)


The function P_m(z) is rational, but has to be treated in different ways depending
on the relationship of n and m. If n is large compared to m, it turns out to be
appropriate to say that P_m(z) has a small singularity, and use methods designed
for this type of problem. However, if n is not too large compared to m, then the
singularity of P_m(z) can be said to be large. [Since the largest part in a partition
of n is almost always O(n^{1/2} log n) (Erdős and Lehner 1941), p(n, m) ∼ p(n) if m
is much larger than n^{1/2} log n.]

Although P_m(z) has singularities at all the kth roots of unity for all k ≤ m, z = 1
is clearly the dominant singularity, as |P_m(r)| grows much faster as r → 1^- than
|P_m(z)| for z = r exp(iθ) for any θ ∈ (0, 2π). If m is fixed, then the partial fraction
decomposition can be used to obtain the asymptotics of p(n, m) as n → ∞. We
cannot use Theorem 9.2 directly, since the pole of P_m(z) at z = 1 has multiplicity
m. However, either by using the generalizations of Theorem 9.2 that are mentioned
in section 9.1, or by the subtraction-of-singularities principle, we can show that for
any fixed m,

    p(n,m) \sim [z^n] \left( \prod_{k=1}^{m} k \right)^{-1} (1 - z)^{-m}
           \sim \left( \prod_{k=1}^{m} k \right)^{-1} ((m-1)!)^{-1} n^{m-1}   as n \to \infty .    (11.2)

[See Ayoub (1963) for further details and estimates.] This approach can be
extended for m growing slowly with n, and it can be shown without much effort that
the estimate (11.2) holds for n → ∞, m ≤ log log n, say. However, for larger values
of m this approach becomes cumbersome, and other methods, such as those of
section 12, are necessary. ■
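
As a quick numerical check of (11.2), the following Python sketch counts p(n, m)
with the usual dynamic program for the product (11.1) and compares it with
n^{m-1}/(m! (m-1)!), which is the right-hand side of (11.2) with the product of
the k written as m!. The function names are illustrative and do not come from
the text.

```python
from math import factorial


def bounded_partitions(n, m):
    """Coefficient of z^n in prod_{k=1}^{m} (1 - z^k)^(-1), i.e. p(n, m)."""
    counts = [1] + [0] * n
    for part in range(1, m + 1):
        # Multiplying by (1 - z^part)^(-1) allows any number of parts of this size.
        for total in range(part, n + 1):
            counts[total] += counts[total - part]
    return counts[n]


def leading_term(n, m):
    """Right-hand side of (11.2): n^(m-1) / (m! (m-1)!)."""
    return n ** (m - 1) / (factorial(m) * factorial(m - 1))


if __name__ == "__main__":
    m = 3
    for n in (30, 300, 3000):
        exact = bounded_partitions(n, m)
        print(n, exact, exact / leading_term(n, m))
```

For fixed m the printed ratio should drift toward 1 as n grows, while for m
comparable to n the leading term is no longer a useful approximation.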


11.1. Transfer theorems 


This section presents some results, drawn from Flajolet and Odlyzko (1990b), that 
allow one to translate an asymptotic expansion of a generating function around its 
dominant singularity into an asymptotic expansion for the coefficients in a direct
way. These results are useful in combinatorial enumeration, since the conditions
for validity are frequently satisfied. The proofs, which we do not present here, are
based on the subtraction-of-singularities principle, but are more involved than the
cases treated in section 10.2.

We start out with an application of the results to be presented later in this
section.


Example 11.2 (2-regular graphs). The generating function for 2-regular graphs is
known (Comtet 1974) to be

    f(z) = (1 - z)^{-1/2} \exp\left( -\frac{z}{2} - \frac{z^2}{4} \right) .    (11.3)

[A simpler proof can be obtained from the exponential formula, cf. eq. (3.9.1) of
Wilf (1990).] We see that f(z) is analytic throughout the complex plane except for
the slit along the real axis from 1 to ∞, and that near z = 1 it has the asymptotic
expansion

    f(z) = e^{-3/4} \left\{ (1 - z)^{-1/2} + (1 - z)^{1/2} + \frac{1}{4} (1 - z)^{3/2} + \cdots \right\} .    (11.4)

Theorem 11.4 below then shows that as n → ∞,

    [z^n] f(z) \sim e^{-3/4} \left\{ \binom{n - 1/2}{n} + \binom{n - 3/2}{n} + \frac{1}{4} \binom{n - 5/2}{n} + \cdots \right\}
               \sim \frac{e^{-3/4}}{\sqrt{\pi n}} \left\{ 1 - \frac{5}{8n} + \frac{1}{128 n^2} + \cdots \right\} .    (11.5)

■
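
The expansion (11.5) can be compared with exact coefficients. The sketch below
is not the derivation used in the text: it assumes the recurrence
(n+1) f_{n+1} = n f_n + f_{n-2}/2, which follows from logarithmic differentiation
of (11.3) in the form (1 - z) f'(z) = (z^2/2) f(z). Function names are ours.

```python
from math import exp, pi, sqrt


def two_regular_coefficients(n_max):
    """Return [z^n] f(z), n = 0..n_max, for f(z) = (1-z)^(-1/2) exp(-z/2 - z^2/4).

    Uses (n+1) f_{n+1} = n f_n + f_{n-2} / 2 with f_0 = 1.
    """
    coeffs = [1.0]
    for n in range(n_max):
        earlier = coeffs[n - 2] if n >= 2 else 0.0
        coeffs.append((n * coeffs[n] + earlier / 2) / (n + 1))
    return coeffs


def asymptotic_estimate(n):
    """Three-term form of the second line of (11.5)."""
    return exp(-0.75) / sqrt(pi * n) * (1 - 5 / (8 * n) + 1 / (128 * n * n))


if __name__ == "__main__":
    coeffs = two_regular_coefficients(200)
    for n in (10, 50, 200):
        print(n, coeffs[n], asymptotic_estimate(n))
```

The first nonzero coefficients, 1/6 for n = 3 and 1/8 for n = 4, agree with one
triangle out of 3! labelings and three 4-cycles out of 4!.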


The basic transfer results will be presented for generating functions that have a
single dominant singularity, but can be extended substantially beyond their circle
of convergence. For r, η > 0, and 0 < φ < π/2, we define the closed domain Δ =
Δ(r, φ, η) by

    \Delta(r, \phi, \eta) = \{ z : |z| \le r + \eta, \ |\mathrm{Arg}(z - r)| \ge \phi \} .    (11.6)

In the main result below we will assume that a generating function is analytic
throughout Δ \ {r}. Later in this section we will mention some results that dispense
with this requirement. We will also explain why analyticity throughout Δ \ {r} is
helpful in obtaining results such as those of Theorem 11.4 below.

One advantage to using Cauchy’s theorem to recover information about coeffi- 
cients of generating functions is that it allows one to prove the intuitively obvious 
result that small smooth changes in the generating function correspond to small 
smooth changes in the coefficients. We will use the quantitative notion of a
function of slow variation at ∞ to describe those functions for which this notion can
be made precise. (With more effort one can prove that the same results hold with
a less restrictive definition than that below.)

Definition 11.3. A function L(u) is of slow variation at ∞ if
(i) There exist real numbers u_0 and φ_0 with u_0 > 0, 0 < φ_0 < π/2, such that
L(u) is analytic and ≠ 0 in the domain

    \{ u : |\mathrm{Arg}(u - u_0)| \le \pi - \phi_0 \} .    (11.7)

(ii) There exists a function ε(x), defined for x ≥ 0 with lim_{x→∞} ε(x) = 0, such
that for all θ ∈ [−(π − φ_0), π − φ_0] and u ≥ u_0, we have

    \left| \frac{L(u e^{i\theta})}{L(u)} - 1 \right| < \epsilon(u)    (11.8)

and

    \left| \frac{L(u \log^2 u)}{L(u)} - 1 \right| < \epsilon(u) .    (11.9)


Theorem 11.4. Assume that f(z) is analytic throughout the domain Δ \ {r}, where
Δ = Δ(r, φ, η), r, η > 0, 0 < φ < π/2, and that L(u) is a function of slow variation
at ∞. If α is any real number, then
(A) If

    f(z) = O\left( (r - z)^{\alpha} L\left( \frac{1}{r - z} \right) \right)

uniformly for z ∈ Δ \ {r}, then

    [z^n] f(z) = O( r^{-n} n^{-\alpha-1} L(n) )   as n \to \infty .

(B) If

    f(z) = o\left( (r - z)^{\alpha} L\left( \frac{1}{r - z} \right) \right)

uniformly as z → r for z ∈ Δ \ {r}, then

    [z^n] f(z) = o( r^{-n} n^{-\alpha-1} L(n) )   as n \to \infty .

(C) If α ∉ {0, 1, 2, ...} and

    f(z) \sim (r - z)^{\alpha} L\left( \frac{1}{r - z} \right)

uniformly as z → r for z ∈ Δ \ {r}, then

    [z^n] f(z) \sim \frac{r^{-n} n^{-\alpha-1}}{\Gamma(-\alpha)} L(n) .
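
A small illustration of part (C), under assumptions that are ours rather than
the text's: take r = 1, α = −1 and L(u) = log u (assumed here to qualify as a
function of slow variation), so that f(z) = (1 − z)^{-1} log(1/(1 − z)), whose
coefficients are the harmonic numbers. Part (C) then predicts [z^n]f(z) ∼ log n.
The helper names are hypothetical.

```python
from math import gamma, log


def harmonic(n):
    """[z^n] (1 - z)^(-1) log(1 / (1 - z)) = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def transfer_estimate(n, alpha, slowly_varying):
    """Right-hand side of part (C) of Theorem 11.4 with r = 1."""
    return n ** (-alpha - 1) / gamma(-alpha) * slowly_varying(n)


if __name__ == "__main__":
    for n in (10 ** 2, 10 ** 4, 10 ** 5):
        print(n, harmonic(n) / transfer_estimate(n, -1.0, log))
```

The ratio tends to 1 only at the rate 1/log n, which is typical when L(u) is
logarithmic; the fuller expansions quoted later in this section describe such
corrections.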

The restriction that there be only one singularity on the circle of convergence 
is easy to relax. If there are several (corresponding to oscillatory behavior of the 
coefficients), their contributions to the coefficients add. The crucial fact is that 
at each singularity the function f(z) should be continuous except for an angular
region similar to that of Δ(r, φ, η).

The requirement that the generating function f(z) be analytic in the interior of
Δ(r, φ, η) is in general harder to dispense with, at least by the methods of Flajolet
and Odlyzko (1990b). However, if the singularity at r is sufficiently large, one
can obtain the same results with weaker assumptions that only require analyticity
inside the disk |z| < r. The following result is implicit in Flajolet and Odlyzko
(1990b).

Theorem 11.5. Assume that f(z) is analytic in the domain {z : |z| ≤ r, z ≠ r} and
that L(u) is a function of slow variation at ∞. If α is any fixed real number with
α < −1, then the implications (A), (B), and (C) of Theorem 11.4 are valid.


Example 11.6 (Longest cycle in a random permutation). The average length of
the longest cycle in a permutation on n letters is [z^n]f(z), where

    f(z) = (1 - z)^{-1} \sum_{k \ge 1} \left[ 1 - \exp\left( - \sum_{j \ge k} j^{-1} z^j \right) \right] .

It is easy to see that f(z) is analytic in |z| < 1, and a double application of the
Euler-Maclaurin summation formula shows that f(z) ∼ G(1 − z)^{-2} as z → 1,
uniformly for |z| ≤ 1, z ≠ 1, where

    G = \int_0^{\infty} \left[ 1 - \exp\left( - \int_x^{\infty} t^{-1} e^{-t} \, dt \right) \right] dx = 0.624\ldots .    (11.10)

Therefore, by Theorem 11.5 with L(u) = 1,

    [z^n] f(z) \sim G n   as n \to \infty ,    (11.11)

a result first proved by Shepp and Lloyd (1966) using Poisson approximations and
Tauberian theorems. The derivation sketched above follows Flajolet and Odlyzko
(1990a,b). Flajolet and Odlyzko (1990a) contains many other applications of
transfer theorems to random mapping problems. Additional recent papers on the cycle
structure of random permutations are Arratia and Tavaré (1992a) and Hansen
(1994). They use probabilistic methods, not transfer theorems, and contain
extensive references to other recent works. ■
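
The estimate (11.11) can be checked against exact averages for moderate n. The
sketch below does not use the generating function of the example directly; it
assumes the elementary recurrence i p_k(i) = p_k(i−1) + ... + p_k(i−min(i,k)),
where p_k(i) is the probability that a random permutation of i letters has no
cycle longer than k (this comes from differentiating exp(\sum_{j \le k} z^j/j)),
and sums the tail probabilities. The function name is ours.

```python
def expected_longest_cycle(n):
    """Exact mean length of the longest cycle of a random permutation of n letters."""
    expected = 0.0
    for k in range(n):
        # p[i] = probability that all cycles of a random permutation of i
        # letters have length <= k.
        p = [1.0]
        window = 0.0  # running value of p[i-1] + ... + p[i-min(i,k)]
        for i in range(1, n + 1):
            window += p[i - 1]
            if i - 1 - k >= 0:
                window -= p[i - 1 - k]
            p.append(window / i)
        expected += 1.0 - p[n]  # probability that the longest cycle exceeds k
    return expected


if __name__ == "__main__":
    for n in (3, 20, 200):
        print(n, expected_longest_cycle(n) / n)
```

For n = 3 the function returns 13/6, and the printed ratios approach the
constant G = 0.624... of (11.10) from above.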


In applying transfer theorems, it is useful to have explicit expansions and esti- 
mates for the coefficients of some frequently occurring functions. We state several 
asymptotic series: 


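    [z^n] (1 - z)^{\alpha} \sim \frac{n^{-\alpha-1}}{\Gamma(-\alpha)} \left( 1 + \sum_{k \ge 1} e_k^{(\alpha)} n^{-k} \right) ,   \alpha \ne 0, 1, 2, \ldots ,    (11.12)

where

    e_k^{(\alpha)} = \sum_{j=k}^{2k} \lambda_{k,j} (\alpha+1)(\alpha+2) \cdots (\alpha+j) ,    (11.13)

and the λ_{k,j} are determined by

    e^{t} (1 + vt)^{-1-1/v} = \sum_{k,j \ge 0} \lambda_{k,j} v^k t^j .    (11.14)

In particular,

    e_1^{(\alpha)} = \alpha(\alpha+1)/2 ,

    e_2^{(\alpha)} = \alpha(\alpha+1)(\alpha+2)(3\alpha+1)/24 .

Also, for α, β ∉ {0, 1, 2, ...},

    [z^n] (1 - z)^{\alpha} \left( -z^{-1} \log(1 - z) \right)^{\beta}
        \sim \frac{n^{-\alpha-1}}{\Gamma(-\alpha)} (\log n)^{\beta} \left( 1 + \sum_{k \ge 1} e_k^{(\alpha,\beta)} (\log n)^{-k} \right) ,    (11.15)

where

    e_k^{(\alpha,\beta)} = (-1)^k \binom{\beta}{k} \Gamma(-\alpha) \frac{d^k}{ds^k} \left. \left( \Gamma(-s) \right)^{-1} \right|_{s=\alpha} .    (11.16)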

Further examples of asymptotic expansions are presented in Flajolet and Odlyzko 
(1990b). 
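
The first two correction terms of the expansion for [z^n](1 − z)^α can be tested
numerically. The sketch below takes e_1 = α(α+1)/2 and
e_2 = α(α+1)(α+2)(3α+1)/24 as given, computes the exact coefficient as the
product of (k − 1 − α)/k over k = 1, ..., n, and compares the two. Function names
are ours.

```python
from math import gamma


def exact_coefficient(n, alpha):
    """[z^n] (1 - z)^alpha, computed as prod_{k=1}^{n} (k - 1 - alpha) / k."""
    value = 1.0
    for k in range(1, n + 1):
        value *= (k - 1 - alpha) / k
    return value


def expansion(n, alpha, terms):
    """n^(-alpha-1)/Gamma(-alpha) times (1 + e_1/n + e_2/n^2), cut after `terms` corrections."""
    e = [alpha * (alpha + 1) / 2,
         alpha * (alpha + 1) * (alpha + 2) * (3 * alpha + 1) / 24]
    series = 1.0 + sum(e[k] / n ** (k + 1) for k in range(terms))
    return n ** (-alpha - 1) / gamma(-alpha) * series


if __name__ == "__main__":
    n, alpha = 1000, 0.5
    exact = exact_coefficient(n, alpha)
    for terms in range(3):
        print(terms, abs(expansion(n, alpha, terms) / exact - 1))
```

Each additional term should reduce the relative error by roughly a factor of n,
as expected for an asymptotic series in powers of 1/n.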


Why is the analyticity of a function f(z) throughout Δ(r, φ, η) \ {r} so important?
We explain this using as an example a function f(z) that satisfies

    f(z) = (1 + o(1)) (1 - z)^{1/2}    (11.17)

as z → 1 with z ∈ Δ = Δ(1, π/8, 1). We write

    f(z) = (1 - z)^{1/2} + g(z) ,    (11.18)

so that

    |g(z)| = o( |1 - z|^{1/2} ) .    (11.19)

Since [z^n](1 − z)^{1/2} grows like n^{-3/2}, we would like to show that

    |[z^n] g(z)| = o( n^{-3/2} )   as n \to \infty .    (11.20)

If g(z) were analytic in a disk of radius 1 + δ for some δ > 0, then we could
conclude that |[z^n]g(z)| < (1 + δ/2)^{-n} for large n, a conclusion much stronger than
(11.20). However, if all we know is that g(z) satisfies (11.19) in |z| < 1, then we can
only conclude from Cauchy’s theorem that [z^n]g(z) = O(1), since (11.19) implies
that |g(z)| ≤ C for all |z| < 1 and some C > 0. Then Theorem 10.2 gives

    |[z^n] g(z)| \le C r^{-n}    (11.21)

uniformly for all n ≥ 0 and all r < 1, and hence |[z^n]g(z)| ≤ C for all n, a result
that is far from what is required. If we know that g(z) can be continued to Δ \ {1}
and satisfies (11.19) there, we can do a lot better. We choose the contour Γ =
Γ_1 ∪ Γ_2 ∪ Γ_3 ∪ Γ_4, pictured in fig. 1, with

    \Gamma_1 = \{ z : |z - 1| = 1/n, \ |\mathrm{Arg}(z - 1)| \ge \pi/4 \} ,    (11.22)
    \Gamma_2 = \{ z : z = 1 + r \exp(\pi i/4), \ 1/n \le r \le \delta \} ,    (11.23)
    \Gamma_3 = \{ z : |z| = |1 + \delta \exp(\pi i/4)|, \ |\mathrm{Arg}(z - 1)| \ge \pi/4 \} ,    (11.24)
    \Gamma_4 = \{ z : z = 1 + r \exp(-\pi i/4), \ 1/n \le r \le \delta \} ,    (11.25)

where 0 < δ < 1/2.

Figure 1. Domain Δ(r, φ, η) of section 11.1 and the integration contour Γ.

We will show that the integrals

    g_j = \frac{1}{2\pi i} \int_{\Gamma_j} g(z) z^{-n-1} \, dz    (11.26)

on the Γ_j are small. On Γ_3, g(z) is bounded, so we trivially obtain the exponential
upper bound

    |g_3| = O( (1 + \delta/2)^{-n} ) .    (11.27)

On Γ_1, |g(z)| = o(n^{-1/2}), |z^{-n-1}| ≤ (1 − 1/n)^{-n-1} = O(1), and the length of Γ_1 is
≤ 2π/n, so

    |g_1| = o( n^{-3/2} )   as n \to \infty .    (11.28)

Next, on Γ_2, for z = 1 + r exp(πi/4),

    |z|^{-n} = |1 + r 2^{-1/2} + i r 2^{-1/2}|^{-n} = (1 + 2^{1/2} r + r^2)^{-n/2}
             \le (1 + r)^{-n/2} \le \exp(-nr/10)    (11.29)

for 0 ≤ r ≤ 1. Since g(z) satisfies (11.19), for any ε > 0 we have

    |g(1 + r \exp(\pi i/4))| \le \epsilon r^{1/2}    (11.30)

if 0 ≤ r ≤ η for some η = η(ε) ≤ δ. Therefore

    |g_2| \le \epsilon \int_0^{\eta} r^{1/2} \exp(-nr/10) \, dr + O\left( \int_{\eta}^{\infty} \exp(-nr/10) \, dr \right)
          \le \epsilon n^{-3/2} \int_0^{\infty} r^{1/2} \exp(-r/10) \, dr + O( \exp(-n\eta/10) ) ,    (11.31)

and so

    |g_2| = o( n^{-3/2} ) .    (11.32)

Since |g_4| = |g_2|, inequalities (11.27), (11.28), and (11.32) show that (11.20) holds.

The critical factor in the derivation of (11.20) was the bound (11.29) for |z|^{-n} on
the segment z = 1 + r exp(πi/4). Integrating on the circle |z| = 1 or even on the
line Re(z) = 1 does not give a bound for |z|^{-n} that is anywhere as small, and the
resulting bounds do not approach (11.20) in strength. The use of the circular arc
Γ_1 in the integral is only a minor technical device used to avoid the singularity at
z = 1.
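
The contrast between the slanted segment and the line Re(z) = 1 is easy to see
numerically. The following sketch (function names are ours) evaluates |z|^{-n} at
z = 1 + r exp(πi/4) and at z = 1 + ir, and compares the first with the bound
exp(−nr/10) of (11.29).

```python
import cmath
from math import exp, pi


def ray_decay(n, r):
    """|z|^(-n) at z = 1 + r exp(pi i / 4), a point of the segment in (11.29)."""
    return abs(1 + r * cmath.exp(1j * pi / 4)) ** (-n)


def vertical_line_decay(n, r):
    """|z|^(-n) at z = 1 + i r, a point on the line Re(z) = 1."""
    return abs(1 + 1j * r) ** (-n)


if __name__ == "__main__":
    n = 1000
    for r in (0.001, 0.01, 0.1):
        print(r, ray_decay(n, r), exp(-n * r / 10), vertical_line_decay(n, r))
```

At distance r = 10/n from the singularity the factor on the slanted segment is
already below 10^{-3}, while on the vertical line it is still close to 1.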

When one cannot continue a function to a region like Δ \ {1}, it is sometimes
possible to obtain good estimates for coefficients by working with the generating 
function exclusively in |z| < 1, provided some smoothness properties apply. This 
method is outlined in the next section. 


11.2. Darboux’s theorem and other methods 


A singularity of f(z) at z = w is called algebraic if f(z) can be written as the sum
of a function analytic in a neighborhood of w and a finite number of terms of the
form

    (1 - z/w)^{\alpha} g(z) ,    (11.33)

where g(z) is analytic near w, g(w) ≠ 0, and α ∉ {0, 1, 2, ...}. Darboux’s theorem
(Darboux 1878) gives asymptotic expansions for functions with algebraic
singularities on the circle of convergence. We state one form of Darboux’s result,
derived from Theorem 8.4 of Szegő (1959).


Theorem 11.7. Suppose that f(z) is analytic for |z| < r, r > 0, and has only
algebraic singularities on |z| = r. Let a be the minimum of Re(α) for the terms of the
form (11.33) at the singularities of f(z) on |z| = r, and let w_j, α_j, and g_j(z) be the
w, α, and g(z) for those terms of the form (11.33) for which Re(α) = a. Then, as
n → ∞,

    [z^n] f(z) = \sum_j \frac{g_j(w_j) n^{-\alpha_j - 1}}{\Gamma(-\alpha_j) w_j^{n}} + o( r^{-n} n^{-a-1} ) .    (11.34)


Jungen (1931) has extended Darboux’s theorem to functions that have a single
dominant singularity which is of a mixed algebraic and logarithmic form. His
method can be applied also to functions that have several such singularities on
their circle of convergence.

We do not devote much attention to Darboux’s and Jungen’s theorems because
they can be obtained from the transfer theorems of section 11.1. The only reason
for stating Theorem 11.7 is that it occurs frequently in the literature.

Some functions, such as

    f(z) = \prod_{k=1}^{\infty} (1 + z^k/k^2) ,    (11.35)

are analytic in |z| < 1, cannot be continued outside the unit circle, yet are nicely
behaved on |z| = 1. Therefore there is no dominant singularity that can be studied
to determine the asymptotics of [z^n]f(z). To minimize the size of the integrand, it
is natural to move the contour of integration in Cauchy’s formula to the unit circle.
Once that is done, it is possible to exploit smoothness properties of f(z) to bound
the coefficients. The Riemann-Lebesgue lemma implies that if f(z) is integrable
on the unit circle, then as n → ∞,

    [z^n] f(z) = (2\pi)^{-1} \int_{-\pi}^{\pi} f(e^{i\theta}) \exp(-ni\theta) \, d\theta = o(1) .    (11.36)

More can be said if the derivative of f(z) exists on the unit circle. When we apply
integration by parts to the integral in (11.36), we find

    [z^n] f(z) = (2\pi n)^{-1} \int_{-\pi}^{\pi} f'(e^{i\theta}) \exp(-(n-1)i\theta) \, d\theta ,    (11.37)

and so |[z^n]f(z)| = o(n^{-1}) if f'(z) exists and is integrable on the unit circle.
Existence of higher derivatives leads to even better estimates. We do not attempt to
state a general theorem, but illustrate an application of this method with an
example. The same technique can be used in other situations, for example in obtaining
better error terms in Darboux’s theorem (Darboux 1878).
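
To see the decay predicted by this smoothness argument, one can compute the
coefficients of the product \prod_{k \ge 1} (1 + z^k/k^2) directly. The sketch below
(function name ours) multiplies out the factors with k ≤ n_max, which determines
the coefficients up to z^{n_max} exactly apart from rounding, and prints n [z^n]f(z).

```python
def product_coefficients(n_max):
    """Coefficients of prod_{k>=1} (1 + z^k / k^2) up to z^n_max.

    Factors with k > n_max cannot contribute to these coefficients.
    """
    coeffs = [1.0] + [0.0] * n_max
    for k in range(1, n_max + 1):
        weight = 1.0 / (k * k)
        # Run downwards so that the factor (1 + z^k / k^2) is used at most once.
        for total in range(n_max, k - 1, -1):
            coeffs[total] += weight * coeffs[total - k]
    return coeffs


if __name__ == "__main__":
    coeffs = product_coefficients(400)
    for n in (25, 100, 400):
        print(n, coeffs[n], n * coeffs[n])
```

The products n [z^n]f(z) shrink as n grows, consistent with the o(n^{-1}) bound
obtained from one integration by parts.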


Example 11.8 (Permutations with distinct cycle lengths). Example 8.10 showed that
for the function f(z) defined by eq. (8.58), [z^n]f(z) ∼ exp(−γ) as n → ∞. This
coefficient is the probability that a random permutation on n letters has distinct cycle
lengths. The more precise estimate (8.59) was derived by Greene and Knuth (1982)
by working with recurrences for the coefficients of f(z) and auxiliary functions.
Another approach to deriving fuller asymptotic expansions for [z^n]f(z) is to use
the method outlined above. It suffices to show that the function g(z) defined by
eq. (8.62) has a nice expansion in the closed disk |z| ≤ 1. Since

    g(z) = -z + \sum_{m=2}^{\infty} \frac{(-1)^{m-1}}{m} \{ \mathrm{Li}_m(z^m) - z^m \} ,    (11.38)

where the Li_m(w) are the polylogarithm functions (Lewin 1981), one can use the
theory of the Li_m(w). A simpler way to proceed is to note, for example, that

    \sum_{k=2}^{\infty} \frac{z^{2k}}{k^2} = \sum_{k=2}^{\infty} \frac{z^{2k}}{k(k-1)} + r(z) ,    (11.39)

where

    r(z) = - \sum_{k=2}^{\infty} \frac{z^{2k}}{k^2 (k-1)} ,    (11.40)

and so r'(z) is bounded and continuous for |z| ≤ 1, as are the terms in (8.62) with
m ≥ 3. On the other hand,

    \sum_{k=2}^{\infty} \frac{z^{2k}}{k(k-1)} = z^2 + (1 - z^2) \log(1 - z^2) ,    (11.41)

so we can write g(z) = g_1(z) + g_2(z), where g_1(z) is an explicit function [given by
eq. (11.41)] such that the coefficients of exp(g_1(z)) can be estimated asymptotically
using transfer methods or other techniques, and g_2(z) has the property that g_2'(z) is
bounded and continuous in |z| ≤ 1. Continuing this process, we can find, for every
K, an expansion for the coefficients of f(z) that has error term O(n^{-K}). To do this,
we write g(z) = G_1(z) + G_2(z). In this expansion G_1(z) will be explicitly given and
analytic inside |z| < 1 and analytically continuable to some region that extends
beyond the unit disk with the exception of cuts from a finite number of points on
the unit circle out to infinity. Further, G_2(z) will have the property that G_2^{(K)}(z) is
bounded and continuous in |z| ≤ 1. This will then give the desired expansion for
the coefficients of f(z). ■
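
The limit exp(−γ) can be observed numerically. Eq. (8.58) is not reproduced in
this section, so the sketch below assumes the standard exponential generating
function for permutations with distinct cycle lengths, \prod_{k \ge 1} (1 + z^k/k),
and computes its coefficients by multiplying out the factors. Names are ours.

```python
from math import exp

EULER_GAMMA = 0.5772156649015329


def distinct_cycle_probabilities(n_max):
    """[z^n] prod_{k>=1} (1 + z^k / k) for n = 0..n_max."""
    coeffs = [1.0] + [0.0] * n_max
    for k in range(1, n_max + 1):
        # Run downwards so that a cycle length k is used at most once.
        for total in range(n_max, k - 1, -1):
            coeffs[total] += coeffs[total - k] / k
    return coeffs


if __name__ == "__main__":
    probs = distinct_cycle_probabilities(400)
    for n in (3, 50, 400):
        print(n, probs[n], exp(-EULER_GAMMA))
```

For n = 3 the value is 5/6 (every permutation except the identity), and for
larger n the probabilities settle near exp(−γ) = 0.5614....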


12. Large singularities of analytic functions


This section presents methods for asymptotic estimation of coefficients of gener- 
ating functions whose dominant singularities are large. 


12.1. The saddle point method 


The saddle point method, also referred to as the method of steepest descent, is 
by far the most useful method for obtaining asymptotic information about rapidly 
growing functions. It is extremely flexible and has been applied to a tremendous 
variety of problems. It is also complicated, and there is no simple categorization 
of situations where it can be applied, much less of the results it produces. Given 
the purpose and limitations on the length of this chapter, we do not present a 
full discussion of it. For a complete and insightful introduction to this technique, 
the reader is referred to de Bruijn (1958). Many other books, such as Evgrafov 
(1961), Fedoryuk (1989a), Olver (1974), and Wong (1989) also have extensive 
presentations. What this section does is to outline the method and show when and 
how it can be applied and what kinds of estimates it produces. Examples of proper 
and improper applications of the method are presented. Later subsections are 
then devoted to general results obtained through applications of the saddle point 
method. These results give asymptotic expansions for wide classes of functions 
without forcing the reader to go through the details of the saddle point method.

The saddle point method is based on the freedom to shift contours of integration
when estimating integrals of analytic functions. The same principle underlies other
techniques, such as the transfer method of section 11.1, but the way it is applied
here is different. When dealing with functions of slow growth near their principal
singularity, as happens for transfer methods, one attempts to push the contour of
integration up to and in some ways even beyond the singularity. The saddle point
method is usually applied when the singularity is large, and it keeps the path of
integration close to the singularity.

In the remainder of this section we will assume that f(z) is analytic in |z| < R ≤ ∞.
We will also make the assumption that for some R_0, if R_0 < r < R, then

    \max_{|z| = r} |f(z)| = f(r) .    (12.1)

This assumption is clearly satisfied by all functions with real nonnegative
coefficients, which are the most common ones in combinatorial enumeration. Further, we
will suppose that z = r is the unique point with |z| = r where the maximum value
in (12.1) is assumed. When this assumption is not satisfied, we are almost always
dealing with some periodicity in the asymptotics of the coefficients, and we can
then usually reduce to the standard case by either changing variables or rewriting
the generating function as a sum of several others, as was discussed in section 10.
[Such a reduction cannot be applied to the function of eq. (9.39), though.]

The first step in estimating [z^n]f(z) by the saddle point method is to find the
saddle point. Under our assumptions, that will be a point r ∈ (R_0, R) which
minimizes r^{-n} f(r). We have encountered this condition before, in section 8.1. The
minimizing r = r_0 will usually be unique, at least for large n. (If there are several
r ∈ (R_0, R) for which r^{-n} f(r) achieves its minimum value, then f(z) is pathological,
and the standard saddle point method will not be applicable. For functions f(z)
with nonnegative coefficients, it is easy to show uniqueness of the minimizing r, as
has already been discussed in section 8.1.) Cauchy’s formula (10.6) is then applied
with the contour |z| = r_0. The reason for this choice is that for many functions,
on this contour the integrand is large only near z = r_0, the contributions from
the region near z = r_0 do not cancel each other, and remaining regions contribute
little. This is in contrast to the behavior of the integrand on other contours. By
Cauchy’s theorem, any simple closed contour enclosing the origin gives the correct
answer. However, on most of them the integrand is large, and there is so much
cancellation that it is hard to derive any estimates. The circle going through the
saddle point, on the other hand, yields an integral that can be controlled well
by techniques related to Laplace’s method and the method of stationary phase
that were mentioned in section 5.5. We illustrate with an example, which is a
totally self-contained application of the saddle point method to an extremely simple
situation.


Example 12.1 (Stirling’s formula). We estimate (n!)^{-1} = [z^n] exp(z). The saddle
point, according to our definition above, is that r ∈ R^+ that minimizes r^{-n} exp(r),
which is clearly r = n. Consider the contour |z| = n, and set z = n exp(iθ),
−π < θ ≤ π. Then

    [z^n] \exp(z) = \frac{1}{2\pi i} \int_{|z| = n} \frac{\exp(z)}{z^{n+1}} \, dz
                  = \frac{1}{2\pi} \int_{-\pi}^{\pi} n^{-n} \exp(n e^{i\theta} - ni\theta) \, d\theta .    (12.2)

Since |exp(z)| = exp(Re(z)), the absolute value of the integrand in (12.2) is
n^{-n} exp(n cos θ), which is maximized for θ = 0. Now

    e^{i\theta} = \cos\theta + i \sin\theta = 1 - \theta^2/2 + i\theta + O(|\theta|^3) ,

so for any θ_0 ∈ (0, π),

    \int_{-\theta_0}^{\theta_0} n^{-n} \exp(n e^{i\theta} - ni\theta) \, d\theta
        = \int_{-\theta_0}^{\theta_0} n^{-n} \exp(n - n\theta^2/2 + O(n|\theta|^3)) \, d\theta .    (12.3)

(It is the cancellation of the niθ term coming from n e^{iθ} and the −niθ term that
came from change of variables in z^{-n} that is primarily responsible for the success
of the saddle point method.) The O(n|θ|^3) term in (12.3) could cause problems if
it became too large, so we will select θ_0 = n^{-2/5}, so that n|θ|^3 ≤ n^{-1/5} for |θ| ≤ θ_0,
and therefore

    \exp(n - n\theta^2/2 + O(n|\theta|^3)) = \exp(n - n\theta^2/2) (1 + O(n^{-1/5})) .    (12.4)

Hence

    \int_{-\theta_0}^{\theta_0} n^{-n} \exp(n e^{i\theta} - ni\theta) \, d\theta
        = (1 + O(n^{-1/5})) n^{-n} e^{n} \int_{-\theta_0}^{\theta_0} \exp(-n\theta^2/2) \, d\theta .

But

    \int_{-\theta_0}^{\theta_0} \exp(-n\theta^2/2) \, d\theta
        = \int_{-\infty}^{\infty} \exp(-n\theta^2/2) \, d\theta - 2 \int_{\theta_0}^{\infty} \exp(-n\theta^2/2) \, d\theta
        = (2\pi/n)^{1/2} - O( \exp(-n^{1/5}/2) ) ,

so

    \int_{-\theta_0}^{\theta_0} n^{-n} \exp(n e^{i\theta} - ni\theta) \, d\theta
        = (1 + O(n^{-1/5})) (2\pi/n)^{1/2} n^{-n} e^{n} .    (12.5)

On the other hand, for θ_0 < |θ| ≤ π,

    \cos\theta \le \cos\theta_0 = 1 - \theta_0^2/2 + O(\theta_0^4) ,

so

    n \cos\theta \le n - n^{1/5}/2 + O(n^{-3/5}) ,

and therefore for large n

    \left| \int_{\theta_0}^{\pi} n^{-n} \exp(n e^{i\theta} - ni\theta) \, d\theta \right| \le n^{-n} \exp(n - n^{1/5}/3) ,

and similarly for the integral from −π to −θ_0. Combining all these estimates we
therefore find that

    (n!)^{-1} = [z^n] \exp(z) = (1 + O(n^{-1/5})) (2\pi n)^{-1/2} n^{-n} e^{n} ,    (12.6)

which is a weak form of Stirling’s formula (4.3). (The full formula can be derived
by using more precise expansions for the integrand.)


Suppose we try to push through a similar argument using the contour |z| = 2n.
This time, instead of eq. (12.2), we find

    [z^n] \exp(z) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2^{-n} n^{-n} \exp(2n e^{i\theta} - ni\theta) \, d\theta .    (12.7)

At θ = 0, the integrand is 2^{-n} n^{-n} exp(2n), which is (e/2)^n times as large as the
value of the integrand in (12.2) at θ = 0. Since the two integrals do produce the same
answer, and from the analysis above we see that this answer is close to n^{-n} exp(n)
in value, the integral in (12.7) must involve tremendous cancellation. That is indeed
what we see in the neighborhood of θ = 0. We find that

    \exp(2n e^{i\theta} - ni\theta) = \exp(2n - n\theta^2 + ni\theta + O(n|\theta|^3)) ,    (12.8)

and the exp(niθ) term produces wild oscillations of the integrand even over small
ranges of θ. Trying to work with the integral (12.7) and proving that it equals
something exponentially smaller than the maximal value of its integrand is not a
promising approach. By contrast, the saddle point contour used to produce eq. (12.2) gives
nice behavior of the integrand, so that it can be evaluated. ■
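
The difference between the two contours can be made concrete with a short
numerical experiment. The sketch below (names and the choice of 4096 sample
points are ours) applies the trapezoidal rule to Cauchy's integral for
[z^n] exp(z) on a circle of given radius, with the integrand multiplied by
n^n e^{-n} to avoid overflow, and also reports the largest absolute value of the
scaled integrand.

```python
import cmath
from math import exp, lgamma, log, pi, sqrt


def scaled_coefficient(n, radius, points=4096):
    """Trapezoidal value of n^n e^(-n) [z^n] exp(z) over the circle |z| = radius.

    Returns the approximation and the largest absolute value of the scaled
    integrand; for radius = n this is the integral of (12.2) after rescaling.
    """
    shift = n * log(n / radius) - n
    total = 0.0
    peak = 0.0
    for j in range(points):
        theta = 2 * pi * j / points
        term = cmath.exp(radius * cmath.exp(1j * theta) - 1j * n * theta + shift)
        total += term.real  # imaginary parts cancel by symmetry
        peak = max(peak, abs(term))
    return total / points, peak


if __name__ == "__main__":
    n = 50
    exact = exp(n * log(n) - n - lgamma(n + 1))
    for radius in (n, 2 * n):
        value, peak = scaled_coefficient(n, radius)
        print(radius, value, exact, peak, 1 / sqrt(2 * pi * n))
```

On |z| = n the integrand never exceeds 1 in this scaling and the result is
close to (2πn)^{-1/2}; on |z| = 2n the same answer emerges only after
cancellation among terms of size about (e/2)^n.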


The estimates for n! obtained in Example 12.1 came from a simple application
of the saddle point method. The motivation for the choice of the contour |z| = n
is provided by the discussion at the end of the example; other choices lead to
oscillating integrands that cannot be approximated by a Gaussian, nor by any
other nice function. The example above treated only the exponential function,
but it is easy to see that this phenomenon is general; a rapidly oscillating term
exp(niα) for α ≠ 0 is present unless the contour passes through the saddle point.
When we do use this contour, and the Gaussian approximation is valid, we find
that for functions f(z) satisfying our assumptions we have the following estimate.

Saddle point approximation.

    [z^n] f(z) \sim (2\pi b(r_0))^{-1/2} f(r_0) r_0^{-n}   as n \to \infty ,    (12.9)

where r_0 is the saddle point (where r^{-n} f(r) is minimized, so that r_0 f'(r_0)/f(r_0) = n)
and

    b(r) = r \frac{f'(r)}{f(r)} + r^2 \frac{f''(r)}{f(r)} - r^2 \left( \frac{f'(r)}{f(r)} \right)^2
         = r \left( r \frac{f'(r)}{f(r)} \right)' .    (12.10)

Example 12.2 (Bell numbers). Example 5.4 showed how to estimate the Bell
number B_n by elementary methods, starting with the representation (5.38). The
exponential generating function

    B(z) = \sum_{n=0}^{\infty} B_n \frac{z^n}{n!}    (12.11)

satisfies

    B(z) = \exp(\exp(z) - 1) ,

as can be seen from (5.38) or by other methods (cf. Comtet 1974). The saddle
point occurs at that r_0 > 0 that satisfies

    r_0 \exp(r_0) = n ,    (12.12)

and

    b(r_0) = r_0 (1 + r_0) \exp(r_0) ,    (12.13)

so the saddle point approximation says that as n → ∞,

    B_n \sim n! \, (2\pi r_0^2 \exp(r_0))^{-1/2} \exp(\exp(r_0) - 1) \, r_0^{-n} .    (12.14)

The saddle point approximation can be justified even more easily than for the
Stirling estimate of n!. ■
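
A numerical comparison for the Bell numbers is sketched below. It evaluates the
saddle point approximation with f(z) = exp(exp(z) − 1), using
b(r_0) = r_0 (1 + r_0) exp(r_0) rather than the simplified r_0^2 exp(r_0), which is
asymptotically equivalent but converges more slowly for moderate n. The exact
values come from the Bell triangle; Newton's method for r_0 exp(r_0) = n and all
function names are our choices.

```python
from math import exp, lgamma, log, pi


def bell_number(n):
    """Exact Bell number B_n via the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


def saddle_point(n):
    """Solve r exp(r) = n by Newton's method, starting to the right of the root."""
    r = log(n + 1.0)
    for _ in range(50):
        r -= (r * exp(r) - n) / ((1 + r) * exp(r))
    return r


def log_bell_estimate(n):
    """log of n! (2 pi b(r0))^(-1/2) exp(exp(r0) - 1) r0^(-n)."""
    r0 = saddle_point(n)
    b = r0 * (1 + r0) * exp(r0)
    return lgamma(n + 1) - 0.5 * log(2 * pi * b) + exp(r0) - 1 - n * log(r0)


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, exp(log_bell_estimate(n) - log(bell_number(n))))
```

The printed ratios of estimate to exact value should approach 1 as n grows.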


The above approximation is widely applicable and extremely useful, but care
has to be exercised in applying it. This is shown by the next example.


Example 12.3 (Invalid application of the saddle point method). Consider the
trivial example f(z) = (1 − z)^{-1}, so that [z^n]f(z) = 1 for all n ≥ 0. Then f'(r)/f(r) =
(1 − r)^{-1}, and so the saddle point is r_0 = n/(n + 1), and b(r_0) = r_0/(1 − r_0)^2 =
n(n + 1). Therefore if the approximation (12.9) were valid, it would give

    [z^n] f(z) \sim (2\pi n(n+1))^{-1/2} (n+1) \left( 1 + \frac{1}{n} \right)^{n}
               \sim (2\pi)^{-1/2} e   as n \to \infty .    (12.15)

Since (2π)^{-1/2} e = 1.0844... ≠ 1 = [z^n]f(z), something is wrong, and the estimate
(12.9) does not apply to this function. ■
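
The failure is visible numerically as well. The short sketch below (the function
name is ours) evaluates the first expression in (12.15) for increasing n; the
true coefficient is 1 for every n.

```python
from math import e, pi, sqrt


def naive_saddle_estimate(n):
    """Value that (12.9) would assign to [z^n] (1 - z)^(-1), as in (12.15)."""
    b = n * (n + 1)  # b(r0) at the saddle point r0 = n / (n + 1)
    return (2 * pi * b) ** -0.5 * (n + 1) * (1 + 1 / n) ** n


if __name__ == "__main__":
    for n in (10, 1000, 10 ** 6):
        print(n, naive_saddle_estimate(n), e / sqrt(2 * pi))
```

The values converge to (2π)^{-1/2} e = 1.0844... and never approach 1.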


The estimate (12.9) gave the wrong result in Example 12.3 because the Gaussian
approximation on the saddle-point-method contour used so effectively in
Example 12.1 (and in almost all cases where the saddle point method applies) does not
hold over a sufficiently large region for f(z) = (1 − z)^{-1}. In Example 12.1 we used
without detailed explanation the choice θ_0 = n^{-2/5}, which gave the approximation
(12.5) for |θ| ≤ θ_0, and yet led to an estimate for the integral over θ_0 < |θ| ≤ π
that was negligible. This was possible because the third-order term (i.e., n|θ|^3) in
eq. (12.5) was small. When we try to imitate this approach for f(z) = (1 − z)^{-1},
we fail, because the third-order term is too large. Instead of n e^{iθ} − niθ, we now
have

    -\log(1 - r_0 e^{i\theta}) - ni\theta = -\log(1 - r_0) - \frac{1}{2} n(n+1) \theta^2 - \frac{i}{6} n(n+1)(2n+1) \theta^3 + \cdots .    (12.16)

More fundamentally, the saddle point method fails here because the function
f(z) = (1 − z)^{-1} does not have a large enough singularity at z = 1, so that when
one traverses the saddle point contour |z| = r_0, the integrand does not drop off
rapidly enough for a small region near the real axis to provide the dominant
contribution.
When can one apply the saddle point approximation (12.9)? Perhaps the
simplest, yet still general, set of sufficient conditions for the validity of (12.9) is
provided by requiring that the function f(z) be Hayman-admissible. Hayman
admissibility is described in Definition 12.4, in the following subsection. Generally
speaking, though, for the saddle point method to apply we need the function f(z)
to have a large dominant singularity at R, so that f(r) grows at least as fast as
exp((log(R − r))^2) as r → R^- for R < ∞, and as fast as exp((log r)^2) as r → ∞ for
R = ∞. The faster the growth rate, the easier it usually is to apply the method, so
that exp(1/(1 − z)) or exp(exp(1/(1 − z))) can be treated easily.

In our application of the saddle point method to exp(z) in Example 12.1 we
were content to obtain a poor error term, 1 + O(n^{-1/5}), in Stirling’s formula for
n!. This was done to simplify the presentation and concentrate only on the main
factors that make the saddle point method successful. With more care devoted to
the integral one can obtain the full asymptotic expansion of n!. (Only the range
|θ| ≤ θ_0 has to be considered carefully.) This is usually true when the saddle point
method is applicable.

This section provided a sketchy introduction to the saddle point method. For 
a much more thorough presentation, including a discussion of the topographical 
view of the integrand and the “hill-climbing” interpretation of the contour of 
integration, see de Bruijn (1958). 


12.2. Admissible functions 


The saddle point method is a powerful and flexible tool, but in its full generality 
it is often cumbersome to apply. In many situations it is possible to apply general 
theorems derived using the saddle point method that give asymptotic approxima- 
tions that are not the sharpest possible, but which allow one to avoid the drudgery 
of applying the method step by step. The general theorems that we present were 
proved by Hayman (1956) and by Harris and Schoenfeld (1968). We next describe 
the classes of functions to which these theorems apply, and then present the esti- 
mates one obtains for them. It is not always easy to verify that these definitions 
hold, but it is almost always easier to do this than to apply the saddle point method 
from scratch. It is worth mentioning, furthermore, that for many generating
functions, there are conditions that guarantee that they satisfy the hypotheses of the
Hayman and the Harris-Schoenfeld theorems. These conditions are discussed later
in this section.

The definition below is stated somewhat differently than the original one in 
Hayman (1956), but can be shown to be equivalent to it. 


Definition 12.4. A function

    f(z) = \sum_{n=0}^{\infty} f_n z^n    (12.17)

is admissible in the sense of Hayman (or H-admissible) if
(i) f(z) is analytic in |z| < R for some 0 < R ≤ ∞,
(ii) f(z) is real for z real, |z| < R,
(iii) for R_0 < r < R,

    \max_{|z| = r} |f(z)| = f(r) ,    (12.18)

(iv) for

    a(r) = r \frac{f'(r)}{f(r)} ,    (12.19)

    b(r) = r a'(r) = r \frac{f'(r)}{f(r)} + r^2 \frac{f''(r)}{f(r)} - r^2 \left( \frac{f'(r)}{f(r)} \right)^2 ,    (12.20)

and for some function δ(r), defined in the range R_0 < r < R to satisfy 0 < δ(r) < π,
the following three conditions hold:

    (a)  f(r e^{i\theta}) \sim f(r) \exp( i\theta a(r) - \theta^2 b(r)/2 )
         as r → R uniformly for |θ| < δ(r),    (12.21)

    (b)  f(r e^{i\theta}) = o( f(r) b(r)^{-1/2} )
         as r → R uniformly for δ(r) ≤ |θ| ≤ π,    (12.22)

    (c)  b(r) \to \infty   as r → R.    (12.23)


For H-admissible functions, Hayman (1956) proved a basic result that gives the 
asymptotics of the coefficients. 


Theorem 12.5. If $f(z)$, defined by eq. (12.17), is H-admissible in $|z| < R$, then

\[ f_n = (2\pi b(r))^{-1/2} f(r) r^{-n} \left\{ \exp\left( -\frac{(a(r)-n)^2}{2b(r)} \right) + o(1) \right\} \tag{12.24} \]

as $r \to R$, with the $o(1)$ term uniform in $n$.


If we choose $r = r_n$ to be a solution to $a(r_n) = n$, then we obtain from Theorem 12.5
a simpler result. (The uniqueness of $r_n$ follows from a result of Hayman (1956) which
shows that $a(r)$ is positive increasing in some range $R_1 < r < R$, $R_1 > R_0$.)


Corollary 12.6. If $f(z)$, defined by eq. (12.17), is H-admissible in $|z| < R$, then

\[ f_n \sim (2\pi b(r_n))^{-1/2} f(r_n) r_n^{-n} \quad \text{as } n \to \infty, \tag{12.25} \]

where $r_n$ is defined uniquely for large $n$ by $a(r_n) = n$, $R_0 < r_n < R$.


Corollary 12.6 is adequate for most situations. The advantage of Theorem 12.5
is that it gives a uniform estimate over the approximate range $|a(r) - n| \le b(r)^{1/2}$.
[Note that the estimate (12.24) is vacuous for $|a(r) - n|\, b(r)^{-1/2} \to \infty$.] Theorem 12.5
shows that the $f_n r^n$ are approximately Gaussian in the central region.

There are many direct applications of the above results. 


Example 12.7 (Stirling’s formula). Let $f(z) = \exp(z)$. Then $f(z)$ is H-admissible
for $R = \infty$; conditions (i)-(iii) of Definition 12.4 are trivially satisfied, while $a(r) = r$,
$b(r) = r$, so (iv) also holds for $R_0 = 0$, $\delta(r) = r^{-1/3}$, say. Corollary 12.6 then
shows that

\[ \frac{1}{n!} \sim (2\pi n)^{-1/2} e^n n^{-n} \quad \text{as } n \to \infty, \tag{12.26} \]

since $r_n = n$, which gives a weak form of Stirling’s approximation to $n!$.
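
The estimate (12.26) is easy to check numerically. The following is a minimal sketch, not part of the original text; the function name `stirling_ratio` is our own, and logarithms are used only to avoid overflow for large n.

```python
import math

def stirling_ratio(n: int) -> float:
    """Ratio of the Hayman estimate (2*pi*n)^(-1/2) e^n n^(-n) to the exact value 1/n!."""
    log_estimate = -0.5 * math.log(2 * math.pi * n) + n - n * math.log(n)
    log_exact = -math.lgamma(n + 1)      # log(1/n!)
    return math.exp(log_estimate - log_exact)

for n in (1, 10, 100, 1000):
    print(n, stirling_ratio(n))          # the ratio tends to 1 as n grows
```

The ratio approaches 1 roughly like 1 + 1/(12n), which is the first correction term in the full Stirling series.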


In many situations the conditions of H-admissibility are much harder to verify 
than for f(z) = exp(z), and even in that case there is a little work to be done to 
verify that condition (iv) holds. However, many of the generating functions one en- 
counters are built up from other, simpler generating functions, and Hayman (1956) 
has shown that often the resulting functions are guaranteed to be H-admissible. 
We summarize some of Hayman’s results in the following theorem. 


Theorem 12.8. Let $f(z)$ and $g(z)$ be H-admissible for $|z| < R \le \infty$. Let $h(z)$ be
analytic in $|z| < R$ and real for real $z$. Let $p(z)$ be a polynomial with real coefficients.
(i) If the coefficients $a_n$ of the Taylor series of $\exp(p(z))$ are positive for all
sufficiently large $n$, then $\exp(p(z))$ is H-admissible in $|z| < \infty$.
(ii) $\exp(f(z))$ and $f(z)g(z)$ are H-admissible in $|z| < R$.
(iii) If, for some $\gamma > 0$, and $R_1 < r < R$,

\[ \max_{|z|=r} |h(z)| = O(f(r)^{1-\gamma}), \tag{12.27} \]

then $f(z) + h(z)$ is H-admissible in $|z| < R$. In particular, $f(z) + p(z)$ is H-admissible
in $|z| < R$ and, if the leading coefficient of $p(z)$ is positive, $p(f(z))$ is H-admissible
in $|z| < R$.


Example 12.9 (H-admissible functions). (a) By part (i) of Theorem 12.8, exp(z) is
H-admissible, so we immediately obtain the estimate (12.26), which yields Stirling’s
formula. (b) Since exp(z) is H-admissible, part (iii) of Theorem 12.8 shows that
exp(z) - 1 is H-admissible. (c) Applying part (ii) of Theorem 12.8, we next find that
exp(exp(z) - 1) is H-admissible, which yields the asymptotics of the Bell numbers.
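
As an illustration, here is a sketch of how Corollary 12.6 can be evaluated for f(z) = exp(exp(z) - 1), whose coefficients are B_n/n! with B_n the Bell numbers. For this f one has a(r) = r e^r and b(r) = r(1 + r)e^r. The helper names (`bell_numbers`, `saddle_point`, `hayman_bell_estimate`) and the choice of bisection are our own assumptions, not taken from the text.

```python
import math

def bell_numbers(n_max):
    """Exact Bell numbers B_0..B_n_max computed with the Bell triangle."""
    bells, row = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]              # each row starts with the last entry of the previous row
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells

def saddle_point(n):
    """Solve a(r) = r*exp(r) = n for r > 0 by bisection (valid for n >= 1)."""
    lo, hi = 0.0, math.log(n + 1.0) + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def hayman_bell_estimate(n):
    """n! times the right-hand side of (12.25) for f(z) = exp(exp(z) - 1)."""
    r = saddle_point(n)
    b = r * (1.0 + r) * math.exp(r)
    log_coeff = -0.5 * math.log(2 * math.pi * b) + math.exp(r) - 1.0 - n * math.log(r)
    return math.exp(log_coeff + math.lgamma(n + 1))

exact = bell_numbers(100)
for n in (10, 50, 100):
    print(n, hayman_bell_estimate(n) / exact[n])   # ratios drift slowly toward 1
```

The convergence is slow because the relative error decays only like a power of 1/log n times 1/n, which is one motivation for the full expansions discussed next.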

Hayman’s results give only first-order approximations for the coefficients of
H-admissible functions. In some circumstances it is desirable to obtain full asymptotic
expansions. This is possible if we impose additional restrictions on the generating 
function. We next state some results of Harris and Schoenfeld (1968). 


Definition 12.10. A function $f(z)$ defined by eq. (12.17) is HS-admissible provided
it is analytic in $|z| < R$, $0 < R \le \infty$, is real for real $z$, and satisfies the following
conditions:

(A) There is an $R_0$, $0 < R_0 < R$, and a function $d(r)$ defined for $r \in (R_0, R)$ such
that

\[ 0 < d(r) < 1, \qquad r\{1 + d(r)\} < R, \tag{12.28} \]

and such that $f(z) \ne 0$ for $|z - r| \le r d(r)$.
(B) If we define, for $k \ge 1$,

\[ A(z) = \frac{f'(z)}{f(z)}, \qquad B_k(z) = \frac{z^k}{k!} A^{(k-1)}(z), \qquad B(z) = \frac{z}{2} B_1'(z), \tag{12.29} \]

then we have $B(r) > 0$ for $R_0 < r < R$ and $B_1(r) \to \infty$ as $r \to R$.

(C) For sufficiently large $R_1$ and $n$, there is a unique solution $r = u_n$ to

\[ B_1(r) = n + 1, \qquad R_1 < r < R. \tag{12.30} \]

Let

\[ C_j(z, r) = -\frac{1}{B(r)} \left\{ B_{j+2}(z) + \frac{(-1)^j}{j+2} B_1(r) \right\}. \tag{12.31} \]

There exist nonnegative $D_n$, $E_n$, and $n_0$ such that for $n \ge n_0$,

\[ |C_j(u_n, u_n)| \le E_n D_n^j, \qquad j = 1, 2, \ldots. \tag{12.32} \]

(D) As $n \to \infty$,

\[ B(u_n) d(u_n)^2 \to \infty, \qquad D_n E_n B(u_n) d(u_n)^3 \to 0, \qquad D_n d(u_n) \to 0. \tag{12.33} \]


For HS-admissible functions, Harris and Schoenfeld obtain complete asymptotic 
expansions. 


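Theorem 12.11. If $f(z)$, defined by (12.17), is HS-admissible, then for any $N \ge 0$,

\[ f_n = \frac{f(u_n)\, u_n^{-n}}{2 (\pi \beta_n)^{1/2}} \left\{ 1 + \sum_{k=1}^{N} F_k(n) \beta_n^{-k} + O(\phi_N(n; d)) \right\} \quad \text{as } n \to \infty, \tag{12.34} \]

where

\[ \beta_n = B(u_n), \tag{12.35} \]

\[ F_k(n) = \frac{(-1)^k}{\sqrt{\pi}} \sum_{m=1}^{2k} \frac{\Gamma(m + k + \frac{1}{2})}{m!} \sum_{\substack{j_1 + \cdots + j_m = 2k \\ j_1, \ldots, j_m \ge 1}} \gamma_{j_1}(n) \cdots \gamma_{j_m}(n), \tag{12.36} \]

\[ \gamma_j(n) = C_j(u_n, u_n), \tag{12.37} \]

and

\[ \phi_N(n; d) = \max\left\{ \mu(u_n; d),\; E_n' (D_n E_n'' \beta_n^{-1/2})^{2N+2} \right\}, \]

with

\[ E_n' = \min(1, E_n), \qquad E_n'' = \max(1, E_n), \tag{12.38} \]

\[ \mu(r; d) = \max\left\{ \lambda(r; d) B(r)^{1/2},\; \frac{\exp(-B(r) d(r)^2)}{d(r) B(r)^{1/2}} \right\}, \tag{12.39} \]

where $\lambda(r; d)$ is the maximum value of $|f'(z)/f(z)|$ for $z$ on the oriented path $Q(r)$
consisting of the line segment from $r + i r d(r)$ to $r(1 - d(r)^2)^{1/2} + i r d(r)$ and of the
circular arc from the last point to $ir$ to $-r$.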


The conditions for HS-admissibility are often hard to verify. However, there is a 
theorem (Odlyzko and Richmond 1985c) which guarantees that they do hold for 


a large class of interesting functions. 


Theorem 12.12. If $g(z)$ is H-admissible, then $f(z) = \exp(g(z))$ is HS-admissible.
Furthermore, the error term $\phi_N(n; d)$ of Theorem 12.11 is then $o(\beta_n^{-N})$ as $n \to \infty$
for every fixed $N \ge 0$.


Example 12.13 (Bell numbers and HS-admissibility). Since exp(z) - 1 is H-admissible,
as we saw in Example 12.9, we find that exp(exp(z) - 1) is HS-admissible,
and Theorem 12.11 yields a complete asymptotic expansion of the Bell numbers.


Theorem 12.12 does not apply when g(z) is a polynomial. As is pointed out 
by Schmutz (1989), for g(z) = z+ ~ z3 +z? the function f(z) = exp(g(z)) is HS- 
admissible, but Theorem 12.11 does not give an asymptotic expansion because 
the error term $\phi_N(n; d)$ is too large. Schmutz (1989) has obtained necessary and
sufficient conditions for Theorem 12.11 to give an asymptotic expansion for the 
coefficients of f(z) = exp(g(z)) when g(z) is a polynomial. 


12.3. Other saddle point applications 


Section 12.1 presented the basic saddle point method and discussed its range of
applicability. Section 12.2 was devoted to results derived using this method that are
general and yet can be applied in a cookbook style, without a deep understanding
of the saddle point technique. Such a cookbook approach is satisfactory in many
situations. However, often one encounters asymptotic estimation problems that
are not covered by any of the general results mentioned in section 12.2, but can be
solved using the saddle point method. This section mentions several results of
this type that illustrate the range of problems to which this method is applicable.
Additional applications will be presented in section 15, where other techniques are
combined with the saddle point method.


Example 12.14 (Stirling numbers). The Stirling numbers of the first kind, $s(n,k)$,
satisfy (6.5) as well as (Comtet 1974):

\[ \sum_{k=0}^{n} s(n,k) z^k = z(z-1) \cdots (z-n+1). \tag{12.40} \]

Since $(-1)^{n+k} s(n,k) > 0$ [which is reflected in the behavior of the generating function
(12.40), which grows faster along the negative real axis than along the positive
one], we rewrite it as

\[ \sum_{k=0}^{n} (-1)^{n+k} s(n,k) z^k = z(z+1) \cdots (z+n-1). \tag{12.41} \]

The function on the right-hand side behaves like a good candidate for an application
of the saddle point method. For details, see Moser and Wyman (1958a,b).
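
For small n the coefficients in (12.41) can be generated directly by multiplying out the product, which is handy for checking asymptotic estimates. A minimal sketch (the function name is our own):

```python
def unsigned_stirling_first(n):
    """Coefficients c[k] of z(z+1)...(z+n-1), so that c[k] = (-1)^(n+k) s(n, k)."""
    coeffs = [1]                           # the empty product
    for j in range(n):                     # multiply the current polynomial by (z + j)
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k + 1] += c                # contribution of c * z
            new[k] += c * j                # contribution of c * j
        coeffs = new
    return coeffs

print(unsigned_stirling_first(4))          # z^4 + 6z^3 + 11z^2 + 6z
```

Setting z = 1 in (12.41) shows that the coefficients sum to n!, which gives a quick sanity check on the output.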


The estimates mentioned in Example 12.14 are far from best possible in either 
the size of the error term or (more important) in the range of validity. References 
for the best currently known results about Stirling numbers of both the first and 
second kind are given in Temme (1993). Some of the results in the literature are 
not rigorous. For example, Temme (1993) presents elegant and uniform estimates 
based on an application of the saddle point method. They are likely to be correct, 
but the necessary rigorous error analysis has not been performed yet, although it 
seems that this should be doable. Other results, like those of Knessl and Keller
(1991), are obtained by methods that seem to offer no hope of being made
rigorous in the near future. Some of the results, though, such as the original ones
of Moser and Wyman (1958a,b), and the more recent one of Wilf (1993), are fully 
proved. 

The saddle point method can be used to obtain full asymptotic expansions.
These expansions are usually in powers of $n^{-1/2}$ when estimating $[z^n]f(z)$, and
they hardly ever converge, but are asymptotic expansions as defined by Poincaré
[as in eq. (2.2)]. The usual forms of the saddle point method are incapable of
providing expansions similar to the Hardy-Ramanujan-Rademacher convergent
series for the partition function $p(n)$ [eq. (3.1)]. However, the saddle point method
can be applied to estimate $p(n)$. There are technical difficulties, since the generating
function

\[ f(z) = \sum_{n=0}^{\infty} p(n) z^n = \prod_{k=1}^{\infty} (1 - z^k)^{-1} \tag{12.42} \]

has a large singularity at $z = 1$, but in addition has singularities at all other roots of
unity. The contribution of the integral for $z$ away from 1 can be crudely estimated to
be $O(n^{-1} \exp(Cn^{1/2}/2))$ [the last term in eq. (1.5)]. A simple estimate of the integral
near $z = 1$ yields the asymptotic expansion of eq. (1.6). A more careful treatment of
the integral, but one that follows the conventional saddle point technique, replaces
the $1 + O(n^{-1/2})$ term in eq. (1.6) by an asymptotic (in the sense of Poincaré, so
nonconvergent) series $\sum c_k n^{-k/2}$. To obtain eq. (1.5), one needs to choose the
contour of integration near $z = 1$ carefully and use precise estimates of $f(z)$ near
$z = 1$.

De Bruijn (1958) also discusses applications of the saddle point method when 
the saddle point is not on the real axis, and especially when there are several 
saddle points that contribute comparable amounts. This usually occurs when there 
are oscillations in the coefficients. When the oscillations are irregular, the tricks 
mentioned in section 10 of changing variables do not work, and the contributions 
of the multiple saddle points have to be evaluated. 


Example 12.15 (Oscillating sequence). Consider the sequence $a_n$ of Examples 9.5
and 10.3. As is shown in Example 9.5, its ordinary generating function is given by
(9.39). It has an essential singularity at $z = 1$, but is analytic everywhere else. This
function is not covered by our earlier discussion. For example, its maximal value is
in general not taken on the positive real axis. It can be shown that the Cauchy integral
has two saddle points, at approximately $z = 1 - (2n)^{-1} \pm i n^{-1/2}(1 - (4n)^{-1})^{1/2}$.
Evaluating $[z^n]f(z)$ by using Cauchy’s theorem with the contour chosen to pass
through the two points in the correct way yields the estimate (9.38).


In applying the saddle point method, a general principle is that multiplying a
generating function $f(z)$ with dominant singularity at $R$ by another function $g(z)$
which is analytic in $|z| < R$ and has much lower growth rate near $z = R$ yields
a function $f(z)g(z)$ whose saddle point is close to that of $f(z)$. Usually one can
obtain a relation of the form

\[ [z^n](f(z)g(z)) \sim g(r_0)([z^n]f(z)), \tag{12.43} \]

where $r_0$ is the saddle point for $f(z)$. This principle (which is related to the one
behind Theorem 7.1) is useful, but has to be applied with caution, and proofs have 
to be provided for each case. For fuller exposition of this principle and general 
results, see Gardy (1995). The advantage of this approach is that often f(z) is 
easy to manipulate, so the determination of a saddle point for it is easy, whereas 
multiplying it by g(z) produces a messy function, and the exact saddle point for 
f(z)g(z) is difficult to determine.


Example 12.16 (Boolean lattice of subsets of {1,...,n}). The number $a_n$ of Boolean
sublattices of the Boolean lattice of subsets of {1,...,n} has the exponential
generating function (Getu et al. 1992):

\[ A(z) = \sum_{n=0}^{\infty} a_n \frac{z^n}{n!} = \exp(2z + \exp(z) - 1). \tag{12.44} \]

We can write $A(z) = \exp(2z)B(z)$, where $B(z)$ is the exponential generating function
for the Bell numbers (Example 12.2). Since $B(z)$ grows much faster than $\exp(2z)$,
it is easy to show that (12.43) applies, and so

\[ a_n \sim \exp(2r_0) B_n \quad \text{as } n \to \infty, \tag{12.45} \]

where $r_0$ is the saddle point for $B(z)$. Using the approximation (12.12) of Example 12.2,
we find that

\[ a_n \sim (n/\log n)^2 B_n \quad \text{as } n \to \infty. \tag{12.46} \]
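
The relation a_n ~ exp(2 r_0) B_n can be tested numerically. The sketch below is our own illustration and rests on two assumptions that should be stated plainly: the a_n are obtained from A(z) = exp(2z)B(z) by the binomial convolution a_n = sum_k C(n,k) 2^(n-k) B_k, and r_0 is taken to be the solution of r e^r = n.

```python
import math

def bell_numbers(n_max):
    """Exact Bell numbers B_0..B_n_max via the Bell triangle."""
    bells, row = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells

def sublattice_counts(n_max):
    """a_n with exponential generating function exp(2z) * B(z)."""
    bells = bell_numbers(n_max)
    return [sum(math.comb(n, k) * 2 ** (n - k) * bells[k] for k in range(n + 1))
            for n in range(n_max + 1)]

def saddle_point(n):
    """Solve r * exp(r) = n by bisection (n >= 1)."""
    lo, hi = 0.0, math.log(n + 1.0) + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mid * math.exp(mid) < n else (lo, mid)
    return 0.5 * (lo + hi)

def ratio(n):
    """a_n divided by exp(2 r_0) B_n; expected to tend to 1."""
    a_n = sublattice_counts(n)[n]
    return (a_n / bell_numbers(n)[n]) / math.exp(2 * saddle_point(n))

for n in (50, 100, 200):
    print(n, ratio(n))
```

The cruder form with (n/log n)^2 in place of exp(2 r_0) converges far more slowly, since r_0 differs from log n by a term of order log log n.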


The insensitivity of the saddle point approximation to slight perturbations is
reflected in slightly different definitions of a saddle point that are used. The saddle
point approximation (12.9) for $[z^n]f(z)$ is stated in terms of $r_0$, the point that
minimizes $f(r)r^{-n}$. The discussion of the saddle point emphasized minimization of the
peak value of the integrand in Cauchy’s formula, which is the same as minimizing
$f(r)r^{-n-1}$, since the contour integral (10.6) involves $f(z)z^{-n-1}$. Some sources call
the point minimizing $f(r)r^{-n-1}$ the saddle point. It is not important which definition
is adopted. The asymptotic-series coefficients look slightly different in the
two cases, but the final asymptotic series, when expressed in terms of $n$, are the
same. The reason for slightly preferring the definition that minimizes $f(r)r^{-n}$ is
that when the change of variable $z = r \exp(i\theta)$ is made in Cauchy’s integral, there
is no linear term in $\theta$, and the integrand involves $\exp(-cn\theta^2 + O(|\theta|^3))$. If we
minimized $f(r)r^{-n-1}$, we would have to deal with $\exp(-c'i\theta - c''n\theta^2 + O(|\theta|^3))$, which
is not much more difficult to handle but is less elegant.

The same principle can be applied when the exact saddle point is hard to de- 
termine, and it is awkward to work with an implicit definition of this point. When 
that happens, there is often a point near the saddle point that is easy to handle, 
and for which the saddle point approximation holds. We refer to Gardy (1995) for 
examples and discussion of this phenomenon. 


12.4. The circle method and other techniques 


As we mentioned in section 12.3, the saddle point method is a powerful method
that estimates the contribution of the neighborhood of only a single point, or at
most a few points. The convergent series of eq. (1.3) for the partition function $p(n)$
(as well as the earlier nonconvergent but asymptotic and very accurate expansion
of Hardy and Ramanujan) is obtained by evaluating the contribution of the other
singularities of $f(z)$ to the integral. The mth term in eq. (1.3) comes from the
primitive mth roots of unity. To obtain this expansion one needs to use a special
contour of integration and detailed knowledge of the behavior of $f(z)$. The details
of this technique, called the circle method, can be found in Andrews (1976) and
Ayoub (1963).

Convergent series can be obtained from the circle method only when the generating
function is of a special form. For results and references, see Almkvist (1991)
and Almkvist and Andrews (1991).

Nonconvergent but accurate asymptotic expansions can be derived from the
circle method in a much wider variety of applications. It is especially useful when
there is no single dominant singularity. For the partition function $p(n)$, all the
singularities away from $z = 1$ contribute little, and it is $z = 1$ that creates the
dominant term and yields eq. (1.6). For other functions this is often false. For
example, when dealing with additive problems of Waring’s type, where one studies
$N_{k,m}(n)$, the number of representations of a nonnegative integer $n$ as

\[ n = \sum_{j=1}^{m} x_j^k, \qquad x_j \in \mathbb{Z}^+ \cup \{0\} \ \text{for all } j, \tag{12.47} \]

the natural generating function to study is

\[ \sum_{n=0}^{\infty} N_{k,m}(n) z^n = g(z)^m, \tag{12.48} \]

where

\[ g(z) = \sum_{h=0}^{\infty} z^{h^k}. \tag{12.49} \]

The function $g(z)$ has a natural boundary at $|z| = 1$, but it again grows fastest as
$z$ approaches a root of unity from within $|z| < 1$, so it is natural to speak of $g(z)$
having singularities at the roots of unity. The singularity at $z = 1$ is still the largest,
but not by much, as other roots of unity contribute comparable amounts, with
the contribution of other roots of unity $\zeta$ diminishing as the order of $\zeta$ increases.
All the contributions can be estimated, and one can obtain solutions to Waring’s
problem (which was to show that for every $k$, there is an integer $m$ such that
$N_{k,m}(n) > 0$ for all $n$) and other additive problems. For details of this method see
Ayoub (1963). We mention here that for technical reasons, one normally works
with generating functions of the form $G_n(z)^m$, where

\[ G_n(z) = \sum_{h=0}^{\lfloor n^{1/k} \rfloor} z^{h^k} \tag{12.50} \]

(so that the generating function depends on $n$), and analyzes them for $|z| = 1$ (since
they are now polynomials), but the basic explanation above of why this process
works still applies.
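
For small parameters the numbers N_{k,m}(n) can be computed exactly by raising the truncated polynomial of (12.50) to the mth power. The sketch below is our own illustration of that bookkeeping (it is not the circle method itself); the function name and the truncation at degree `n_max` are assumptions made for the example.

```python
def waring_counts(k, m, n_max):
    """counts[n] = number of ordered m-tuples of nonnegative integers whose k-th powers sum to n."""
    # Coefficients of the truncated series: 1 at every exponent h^k <= n_max.
    base = [0] * (n_max + 1)
    h = 0
    while h ** k <= n_max:
        base[h ** k] = 1
        h += 1
    counts = [1] + [0] * n_max             # the constant polynomial 1
    for _ in range(m):                     # multiply by base, discarding degrees above n_max
        new = [0] * (n_max + 1)
        for i, c in enumerate(counts):
            if c:
                for j in range(n_max + 1 - i):
                    if base[j]:
                        new[i + j] += c
        counts = new
    return counts

squares4 = waring_counts(2, 4, 200)
print(min(squares4))                       # positive: every n <= 200 is a sum of four squares
print(waring_counts(2, 3, 30)[7])          # 0: the integer 7 is not a sum of three squares
```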


13. Multivariate generating functions


A major difficulty in estimating the coefficients of multivariate generating functions 
is that the geometry of the problem is far more difficult. It is harder to see what are 
the critical regions where the behavior of the function determines the asymptotics 
of the coefficients, and those regions are more complicated. Singularities and zeros 
are no longer isolated, as in the one-dimensional case, but instead form (k — 1)- 
dimensional manifolds in k variables. Even rational multivariate functions are not 
easy to deal with. 

One basic tool in one-dimensional complex analysis is the residue theorem, 
which allows one to move a contour of integration past a pole of the integrand. 
(We derived a form of the residue theorem in section 10, in the discussion of 
poles of generating functions.) There is an impressive generalization by Leray 
(Aizenberg and Yuzhakov 1983, Leray 1959) of this theory to several dimensions. 
Unfortunately, it is complicated, and with few exceptions [such as that of Lichtin
(1991), see also Bertozzi and McKenna (1993)] so far it has not been applied
successfully to enumeration problems. On the other hand, there are some much
simpler tools that can frequently be used to good effect.

An important tool in asymptotics of multivariate generating functions is the
multidimensional saddle point method.

Example 13.1 (Alternating sums of powers of binomial coefficients). Consider

\[ S(s,n) = \sum_{k=0}^{2n} (-1)^{k+n} \binom{2n}{k}^s, \tag{13.1} \]

where $s$ and $n$ are positive integers. It has been known for a long time that $S(1,n) = 0$,
$S(2,n) = (2n)!(n!)^{-2}$, $S(3,n) = (3n)!(n!)^{-3}$. However, no formula of this type has
been known for $s > 3$. De Bruijn (1958, chapter 4) showed that $S(s,n)$ for integer
$s > 3$ cannot be expressed as a ratio of products of factorials. Although his proof
is not presented as an application of the multidimensional saddle point method, it
is easy to translate it into those terms. $S(s,n)$ is easily seen to equal the constant
term in

\[ F(z_1,\ldots,z_{s-1}) = (-1)^n (1+z_1)^{2n} \cdots (1+z_{s-1})^{2n} \left( 1 - (z_1 \cdots z_{s-1})^{-1} \right)^{2n}, \tag{13.2} \]

and so

\[ S(s,n) = (2\pi i)^{-s+1} \int \cdots \int F(z_1,\ldots,z_{s-1})\, z_1^{-1} \cdots z_{s-1}^{-1}\, dz_1 \cdots dz_{s-1}, \tag{13.3} \]

where the integral is taken with each $z_j$ traversing a circle, say. De Bruijn’s proof
in effect shows that for $s$ fixed and $n \to \infty$, there are two saddle points at
$z_1 = \cdots = z_{s-1} = \exp(2i\alpha)$, with $\alpha = \pm\pi(2s)^{-1}$, and this leads to the estimate

\[ S(s,n) \sim \left\{ 2\cos\left( \frac{\pi}{2s} \right) \right\}^{2ns+s-1} 2^{2-s} (\pi n)^{(1-s)/2} s^{-1/2} \quad \text{as } n \to \infty, \tag{13.4} \]

valid for any fixed integer $s \ge 2$. Since $\cos(\pi(2s)^{-1})$ is algebraic but irrational for
$s \ge 4$, the asymptotic estimate (13.4) shows that $S(s,n)$ cannot be expressed as a
ratio of finite products of $(a_j n)!$ for any fixed finite set of integers $a_j$.

De Bruijn (1958, chapter 6) derives the asymptotics of $S(s,n)$ as $n \to \infty$ for
general real $s$. The approach sketched above no longer applies, and de Bruijn uses
the integral representation

\[ S(s,n) = \int_C \left( \frac{\Gamma(2n+1)}{\Gamma(n+z+1)\Gamma(n-z+1)} \right)^s \frac{dz}{2i \sin \pi z}, \]

where $C$ is a simple closed curve that contains the points $-n, -n+1, \ldots, -1, 0,
1, \ldots, n$ in its interior and has no other integer points on the real axis in its closure.
A complicated combination of analytic techniques, including the one-dimensional
saddle point method, then leads to the final asymptotic estimate of $S(s,n)$.
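
The closed forms for s = 1, 2, 3 and the estimate (13.4) can both be checked with exact integer arithmetic. In the following sketch, written for illustration only, the function names are our own and the asymptotic formula is the one we read off from (13.4), namely {2cos(pi/(2s))}^(2ns+s-1) 2^(2-s) (pi n)^((1-s)/2) s^(-1/2).

```python
import math

def alternating_sum(s, n):
    """S(s, n) = sum over k of (-1)^(k+n) * C(2n, k)^s, computed exactly."""
    return sum((-1) ** (k + n) * math.comb(2 * n, k) ** s for k in range(2 * n + 1))

def de_bruijn_estimate(s, n):
    """Right-hand side of (13.4) for integer s >= 2, evaluated through logarithms."""
    log_val = ((2 * n * s + s - 1) * math.log(2 * math.cos(math.pi / (2 * s)))
               + (2 - s) * math.log(2)
               + 0.5 * (1 - s) * math.log(math.pi * n)
               - 0.5 * math.log(s))
    return math.exp(log_val)

print(alternating_sum(1, 5))                                  # vanishes
print(alternating_sum(3, 4), math.factorial(12) // math.factorial(4) ** 3)
for s in (2, 3, 4):
    print(s, alternating_sum(s, 50) / de_bruijn_estimate(s, 50))
```

For s = 2 and s = 3 the estimate reduces to the familiar Stirling asymptotics of (2n)!/(n!)^2 and (3n)!/(n!)^3, which is a useful consistency check on the constants.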


The multidimensional saddle point method works best when applied to large 
singularities. Just as for the basic one-dimensional method, it does not work when 
applied to small singularities, such as those of rational functions. Fortunately, there
is a trick that often succeeds in converting a small singularity in n dimensions
into a large one in n - 1 dimensions. The main idea is to expand the generating
function with respect to one of the variables through partial-fraction expansions or
other methods. It is hard to write down a general theorem, but the next example
illustrates this technique.


Example 13.2 (Alignments of k sequences). Let $f(k,n)$ denote the number of $k \times m$
matrices of 0’s and 1’s such that each column sum is $\ge 1$ and each row sum is
exactly $n$. (The number of columns, $m$, can vary, although obviously $n \le m \le kn$.)
We consider $k$ fixed, $n \to \infty$ (Griggs et al. 1990). If we let $N(r_1,\ldots,r_k)$ denote the
number of 0,1 matrices with $k$ rows, no columns of all 0’s, and row sums $r_1,\ldots,r_k$,
then it is easy to see (Griggs et al. 1990) that

\[ F(z_1,\ldots,z_k) = \sum_{r_1,\ldots,r_k \ge 0} N(r_1,\ldots,r_k)\, z_1^{r_1} \cdots z_k^{r_k} = \left( 2 - \prod_{i=1}^{k} (1+z_i) \right)^{-1}. \tag{13.5} \]

We have $f(k,n) = N(n,\ldots,n)$, and so we need the diagonal terms of
$F = F(z_1,\ldots,z_k)$. The function $F$ is rational, so its singularity is small. Moreover, the
singularities of $F$ are difficult to visualize. However, in any single variable $F$ is
simple. We take advantage of this feature. Let

\[ A(z) = \prod_{j=1}^{k-1} (1+z_j), \tag{13.6} \]

where $z$ stands for $(z_1,\ldots,z_{k-1}) \in \mathbb{C}^{k-1}$, and expand

\[ \left( 2 - \prod_{i=1}^{k} (1+z_i) \right)^{-1} = \left( 2 - A(z)(1+z_k) \right)^{-1} = \sum_{m=0}^{\infty} \frac{A(z)^m z_k^m}{(2 - A(z))^{m+1}}. \tag{13.7} \]

Therefore

\[ N(r_1,\ldots,r_{k-1},m) = \frac{1}{(2\pi i)^{k-1}} \int \cdots \int \frac{A(z)^m}{(2 - A(z))^{m+1}} \frac{dz_1 \cdots dz_{k-1}}{z_1^{r_1+1} \cdots z_{k-1}^{r_{k-1}+1}}. \tag{13.8} \]

The function whose coefficients we are trying to extract is now $A(z)^m/(2 - A(z))^{m+1}$,
which is still rational. However, the interesting case for us is $m \to \infty$,
which transforms the singularity into a large one. We are interested in the case
$r_1 = r_2 = \cdots = r_{k-1} = m = n$. Then the integral in (13.8) can be shown to have a
saddle point at $z_j = \rho$, $1 \le j \le k-1$, where $\rho = 2^{1/k} - 1$, and one obtains the
estimate (Griggs et al. 1990)

\[ f(k,n) = \rho^{-kn} n^{-(k-1)/2} \left\{ (\rho\, \pi^{(k-1)/2} k^{1/2})^{-1}\, 2^{-(k^2-1)/(2k)} + O(n^{-1}) \right\} \quad \text{as } n \to \infty. \tag{13.9} \]
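
Exact values of f(k,n) are easy to obtain for moderate n, which allows a numerical check of the leading term. The sketch below is our own and makes two assumptions that are not in the text: the counts are computed by inclusion-exclusion over the set of all-zero columns, and the leading term is taken to be rho^(-kn) n^(-(k-1)/2) 2^(-(k^2-1)/(2k)) / (rho pi^((k-1)/2) k^(1/2)) with rho = 2^(1/k) - 1.

```python
import math

def alignments(k, n):
    """f(k, n): k x m 0/1 matrices (any m) with every row sum n and no all-zero column."""
    total = 0
    for m in range(n, k * n + 1):
        # Inclusion-exclusion: j is the number of columns allowed to be nonzero.
        for j in range(n, m + 1):
            total += (-1) ** (m - j) * math.comb(m, j) * math.comb(j, n) ** k
    return total

def leading_term(k, n):
    """Assumed leading term of the asymptotic estimate, for n >= 1."""
    rho = 2 ** (1 / k) - 1
    log_val = (-k * n * math.log(rho) - 0.5 * (k - 1) * math.log(n)
               - math.log(rho) - 0.5 * (k - 1) * math.log(math.pi)
               - 0.5 * math.log(k) - (k * k - 1) / (2 * k) * math.log(2))
    return math.exp(log_val)

print([alignments(2, n) for n in range(5)])        # central Delannoy numbers
for n in (10, 20, 40):
    print(n, alignments(2, n) / leading_term(2, n), alignments(3, n) / leading_term(3, n))
```

For k = 2 the generating function (13.5) reduces to 1/(1 - x - y - xy), so f(2,n) is the nth central Delannoy number, whose known asymptotics agree with the leading term used here.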


The examples above of applications of the multidimensional saddle point method
all dealt with problems in a fixed dimension as various other parameters increase.
A much more challenging problem is to apply this method when the dimension
varies. A noteworthy case where this has been done successfully is the asymptotic
enumeration of graphs with a given degree sequence by McKay and Wormald
(1990).


Example 13.3 (Simple labeled graphs of high degree). Let $G(n; d_1,\ldots,d_n)$ be the
number of labeled simple graphs on $n$ vertices with degree sequence $d_1, d_2, \ldots, d_n$.
Then $G(n; d_1,\ldots,d_n)$ is the coefficient of $z_1^{d_1} z_2^{d_2} \cdots z_n^{d_n}$ in

\[ F = \prod_{\substack{j,k=1 \\ j<k}}^{n} (1 + z_j z_k), \tag{13.10} \]

and so by Cauchy’s theorem

\[ G(n; d_1,\ldots,d_n) = (2\pi i)^{-n} \int \cdots \int F\, z_1^{-d_1-1} \cdots z_n^{-d_n-1}\, dz_1 \cdots dz_n, \tag{13.11} \]

where each integral is on a circle centered at the origin. Let all the radii be equal to
some $r > 0$. The integrand takes on its maximum absolute value on the product of
these circles at precisely the two points $z_1 = z_2 = \cdots = z_n = r$ and
$z_1 = z_2 = \cdots = z_n = -r$. If $d_1 = d_2 = \cdots = d_n = d$, so that we consider only regular graphs, McKay
and Wormald (1990) show that for an appropriate choice of the radius $r$, these
two points are saddle points of the integrand, and succeed through careful analysis
in proving that if $dn$ is even, and $\min(d, n-d-1) > cn(\log n)^{-1}$ for some $c > 2/3$,
then

\[ G(n; d, d, \ldots, d) = 2^{1/2} \left( 2\pi n \lambda^{d+1} (1-\lambda)^{n-d} \right)^{-n/2} \exp\left( \frac{-1 + 10\lambda - 10\lambda^2}{12\lambda(1-\lambda)} + O(n^{-\zeta}) \right) \tag{13.12} \]

as $n \to \infty$ for any $\zeta < \min(1/4, 1/2 - 1/(3c))$, where $\lambda = d/(n-1)$.
McKay and Wormald (1990) also succeed in estimating the number of irregular
graphs, provided that all the degrees $d_j$ are close to a fixed $d$ that satisfies conditions
similar to those above. The proof is more challenging because different radii are
used for different variables and the result is complicated to state.


The McKay-Wormald estimate of Example 13.3 is a true tour de force. The
problem is that the number of variables is $n$ and so grows rapidly, whereas the
integrand grows only like $\exp(cn^2)$ at its peak. More precisely, after transformations
that remove obvious symmetries are applied, the integrand near the saddle point
drops off like $\exp(-n \sum \theta_j^2)$. This is just barely enough to allow the saddle point method
to work, and the symmetries in the problem are exploited to push the estimates
through. This approach can be applied to other problems (cf. McKay 1990), but
it is hard to do. On the other hand, when the number of variables grows more
slowly, multidimensional saddle-point contributions can be estimated without much
trouble.

So far this section has been devoted primarily to multivariate functions with large 
singularities. However, there is also an extensive literature on small singularities. 
The main thread connecting most of these works is that of central and local limit
theorems. Bender (1973) initiated this development in the setting of two-variable 
problems. We present some of his results, since they are simpler than the later and 
more general ones that will be mentioned at the end of this section. 

Consider a double sequence of numbers $a_{n,k} \ge 0$. (Usually the $a_{n,k}$ are $\ne 0$ only
for $0 \le k \le n$.) We will assume that

\[ A_n = \sum_k a_{n,k} < \infty \tag{13.13} \]

for all $n$, and define the normalized double sequence

\[ p_n(k) = a_{n,k}/A_n. \tag{13.14} \]

We will say that $a_{n,k}$ satisfies a central limit theorem if there exist functions $\sigma_n$ and
$\mu_n$ such that

\[ \lim_{n \to \infty} \sup_x \left| \sum_{k \le \sigma_n x + \mu_n} p_n(k) - (2\pi)^{-1/2} \int_{-\infty}^{x} \exp(-t^2/2)\, dt \right| = 0. \tag{13.15} \]

Equivalently, $p_n(k)$ is asymptotically normal with mean $\mu_n$ and variance $\sigma_n^2$.


Theorem 13.4 (Bender 1973). Let $a_{n,k} \ge 0$, and set

\[ f(z,w) = \sum_{n,k \ge 0} a_{n,k} z^n w^k. \tag{13.16} \]

Suppose that there are (i) a function $g(s)$ that is continuous and $\ne 0$ near $s = 0$,
(ii) a function $r(s)$ with bounded third derivative near $s = 0$, (iii) an integer $m \ge 0$,
and (iv) $\epsilon, \delta > 0$ such that

\[ \left( 1 - \frac{z}{r(s)} \right)^m f(z, e^s) - \frac{g(s)}{1 - z/r(s)} \tag{13.17} \]

is analytic and bounded for

\[ |s| < \epsilon, \qquad |z| < |r(0)| + \delta. \tag{13.18} \]

Let

\[ \mu = -r'(0)/r(0), \qquad \sigma^2 = \mu^2 - r''(0)/r(0). \tag{13.19} \]

If $\sigma \ne 0$, then (13.15) holds with $\mu_n = n\mu$ and $\sigma_n^2 = n\sigma^2$.
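
A simple test case, chosen by us for illustration and not taken from the text, is the binomial coefficients a_{n,k} = C(n,k), for which f(z,w) = 1/(1 - z(1+w)) and the pole in z sits at r(s) = 1/(1 + e^s). The recipe mu = -r'(0)/r(0), sigma^2 = mu^2 - r''(0)/r(0) should then reproduce the familiar mean n/2 and variance n/4. The sketch below evaluates the derivatives by central differences, which is an assumption of convenience rather than part of the theorem.

```python
import math

def r(s):
    """Pole location for the assumed example f(z, w) = 1/(1 - z(1 + w)) with w = e^s."""
    return 1.0 / (1.0 + math.exp(s))

def clt_parameters(r_func, h=1e-4):
    """mu and sigma^2 from r(s), using central finite differences at s = 0."""
    r0 = r_func(0.0)
    d1 = (r_func(h) - r_func(-h)) / (2 * h)
    d2 = (r_func(h) - 2 * r0 + r_func(-h)) / (h * h)
    mu = -d1 / r0
    return mu, mu * mu - d2 / r0

def cdf_gap(n):
    """Largest gap between the binomial CDF at integers and the limiting normal CDF."""
    mu, sigma2 = clt_parameters(r)
    mu_n, sigma_n = n * mu, math.sqrt(n * sigma2)
    running, gap = 0, 0.0
    for k in range(n + 1):
        running += math.comb(n, k)
        phi = 0.5 * (1 + math.erf((k - mu_n) / (sigma_n * math.sqrt(2))))
        gap = max(gap, abs(running / 2 ** n - phi))
    return gap

print(clt_parameters(r))                   # close to (0.5, 0.25)
print(cdf_gap(100), cdf_gap(400))          # the gap shrinks as n grows
```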

A central limit theorem is useful, but it only gives information about the cumulative
sums of the $a_{n,k}$. It is much better to have estimates for the individual $a_{n,k}$.
We say that $p_n(k)$ (and $a_{n,k}$) satisfy a local limit theorem if

\[ \lim_{n \to \infty} \sup_x \left| \sigma_n p_n(\lfloor \sigma_n x + \mu_n \rfloor) - (2\pi)^{-1/2} \exp(-x^2/2) \right| = 0. \tag{13.20} \]

In general, we cannot derive (13.20) from (13.15) without some additional conditions
on the $a_{n,k}$, such as unimodality (see Bender 1973). The other approach one
can take is to derive (13.20) from conditions on the generating function $f(z,w)$.


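Theorem 13.5 (Bender 1973). Suppose that $a_{n,k} \ge 0$, and let $f(z,w)$ be defined by
(13.16). Let $-\infty < a < b < \infty$. Define

\[ R(\epsilon) = \{ z : a \le \mathrm{Re}(z) \le b,\ |\mathrm{Im}(z)| \le \epsilon \}. \tag{13.21} \]

Suppose there exist $\epsilon > 0$, $\delta > 0$, an integer $m \ge 0$, and functions $g(s)$ and $r(s)$ such
that

(i) $g(s)$ is continuous and $\ne 0$ for $s \in R(\epsilon)$,
(ii) $r(s) \ne 0$ and has a bounded third derivative for $s \in R(\epsilon)$,
(iii) for $s \in R(\epsilon)$ and $|z| \le |r(s)|(1 + \delta)$, the function defined by (13.17) is analytic
and bounded,
(iv) for $a \le \alpha \le b$,

\[ \left( \frac{r'(\alpha)}{r(\alpha)} \right)^2 - \frac{r''(\alpha)}{r(\alpha)} \ne 0, \tag{13.22} \]

(v) $f(z, e^s)$ is analytic and bounded for

\[ |z| \le |r(\mathrm{Re}(s))|(1 + \delta) \quad \text{and} \quad \epsilon \le |\mathrm{Im}(s)| \le \pi. \]

Then

\[ a_{n,k} \sim \frac{n^m e^{-\alpha k} g(\alpha)}{m!\, r(\alpha)^n \sigma_\alpha (2\pi n)^{1/2}} \quad \text{as } n \to \infty \tag{13.23} \]

uniformly for $a \le \alpha \le b$, where

\[ \frac{k}{n} = -\frac{r'(\alpha)}{r(\alpha)}, \tag{13.24} \]

\[ \sigma_\alpha^2 = \left( \frac{k}{n} \right)^2 - \frac{r''(\alpha)}{r(\alpha)}. \tag{13.25} \]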

There have been many further developments of central and local limit theorems 
for asymptotic enumeration since Bender’s original work (1973). Currently the 
most powerful and general results are those of Gao and Richmond (1992). They 
apply to general multivariate problems, not only two-variable ones. Other papers 
that deal with central and local limit theorems or other multivariate problems 
with small singularities are Bender and Richmond (1983), Bender et al. (1983),
Canfield (1977), Drmota (1994), Flajolet and Soria (1990, 1993), Gutjahr (1992),
and Kirschenhofer (1987).


14. Mellin and other integral transforms 


When the best generating function that one can obtain is an infinite sum, integral 
transforms can sometimes help. There is a large variety of integral transforms, 
such as those of Fourier and Laplace. The one that is most commonly used in 
asymptotic enumeration and analysis of algorithms is the Mellin transform, and it 
is the only one we will discuss extensively below. The other transforms do occur, 
though. For example, if $f(x) = \sum a_n x^n/n!$ is an exponential generating function of
the sequence $a_n$, then the ordinary generating function of $a_n$ can be derived from
it using the Laplace transform

\[ \int_0^\infty f(xy) \exp(-x)\, dx = \sum_n a_n \frac{y^n}{n!} \int_0^\infty x^n \exp(-x)\, dx = \sum_n a_n y^n. \tag{14.1} \]


1192 A.M. Odlyzko 


(This assumes that the $a_n$ are small enough to assure the integrals above converge
and the interchange of summation and integration is valid.) Related integral transforms
can be used to transform generating functions into other forms. For example,
to transform an ordinary generating function $F(u) = \sum a_n u^n$ into an exponential
one, we can use

\[ \sum_n a_n \frac{w^n}{n!} = \frac{1}{2\pi i} \int F(u) \exp(w/u)\, u^{-1}\, du. \tag{14.2} \]


The basic references for asymptotics of integral transforms are Davies (1978), 
Doetsch (1955), Oberhettinger (1974), and Sneddon (1972). This section will only 
highlight some of the main properties of Mellin transforms and illustrate how 
they are used. For a more detailed survey, especially of applications to the analysis
of algorithms, see Flajolet et al. (1985).

Let $f(t)$ be a measurable function defined for real $t \ge 0$. The Mellin transform
$f^*(z)$ of $f(t)$ is a function of the complex variable $z$ defined by

\[ f^*(z) = \int_0^\infty f(t) t^{z-1}\, dt. \tag{14.3} \]

If $f(t) = O(t^\alpha)$ as $t \to 0^+$ and $f(t) = O(t^\beta)$ as $t \to \infty$, then the integral in (14.3)
converges and defines $f^*(z)$ to be an analytic function inside the “fundamental
domain” $-\alpha < \mathrm{Re}(z) < -\beta$. As an example, for $f(t) = \exp(-t)$, we have $f^*(z) = \Gamma(z)$
and $\alpha = 0$, $\beta = -\infty$. There is an inversion formula for Mellin transforms
which states that

\[ f(t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} f^*(z) t^{-z}\, dz, \tag{14.4} \]

and the integral is over the vertical line with $\mathrm{Re}(z) = c$. The inversion formula
(14.4) is valid for $-\alpha < c < -\beta$, but much of its strength in applications comes
from the ability to shift the contour of integration into wider domains to which
$f^*(z)$ can be analytically continued.

The advantage of the Mellin transform is due largely to a simple property, namely
that if $g(t) = a f(bt)$ for $b$ real, $b > 0$, then

\[ g^*(z) = a b^{-z} f^*(z). \tag{14.5} \]

This readily extends to show that if

\[ F(t) = \sum_k \lambda_k f(\mu_k t) \tag{14.6} \]

(where the $\lambda_k$ and $\mu_k > 0$ are such that the sum converges and $F(t)$ is well behaved),
then

\[ F^*(z) = \left( \sum_k \lambda_k \mu_k^{-z} \right) f^*(z). \tag{14.7} \]

In particular, if

\[ F(t) = \sum_{k=1}^{\infty} f(kt), \tag{14.8} \]

then

\[ F^*(z) = \left( \sum_{k=1}^{\infty} k^{-z} \right) f^*(z) = \zeta(z) f^*(z), \tag{14.9} \]

where $\zeta(z)$ is the Riemann zeta function.
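
A quick numerical illustration of this last identity, chosen by us: with f(t) = exp(-t) the harmonic sum is F(t) = sum over k >= 1 of exp(-kt) = 1/(e^t - 1), so F*(z) should equal zeta(z) Gamma(z); at z = 2 this is pi^2/6. The sketch below uses a plain midpoint rule on a truncated interval, which is an assumption of convenience; the function names are hypothetical.

```python
import math

def mellin(func, z, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of func(t) * t^(z-1) over (0, upper), real z."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += func(t) * t ** (z - 1)
    return total * h

def harmonic_sum(t):
    """F(t) = sum_{k>=1} exp(-k t) = 1/(e^t - 1)."""
    return 1.0 / math.expm1(t)

print(mellin(harmonic_sum, 2.0), math.pi ** 2 / 6)        # zeta(2) * Gamma(2)
print(mellin(harmonic_sum, 4.0), math.pi ** 4 / 15)       # zeta(4) * Gamma(4)
```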


Example 14.1 (Runs of heads in coin tosses). What is $R_n$, the expected length of 
the longest run of heads in n tosses of a fair coin? Let p(n, k) be the probability 
that there is no run of k heads in n coin tosses. Then 

$$R_n = \sum_{k=1}^{n} k\, \bigl(p(n, k+1) - p(n, k)\bigr) . \tag{14.10}$$

We now apply the estimates of Example 9.3. To determine p(n, k), we take A = 
00···0, and then $C_A(z) = z^{k-1} + z^{k-2} + \cdots + z + 1$, so $C_A(1/2) = 2 - 2^{1-k}$. Hence 
(9.19) shows easily that in the important ranges where k is of order log n, we have 

$$p(n, k) \approx \exp(-n 2^{-k-1}) , \tag{14.11}$$

and so $R_n$ is approximated well by 

$$r(n) = \sum_{k=0}^{\infty} k\, \bigl(\exp(-n 2^{-k-2}) - \exp(-n 2^{-k-1})\bigr) . \tag{14.12}$$


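The function r(t) is of the form (14.6) with 

$$\lambda_k = k , \quad \mu_k = 2^{-k-1} , \quad f(t) = \exp(-t/2) - \exp(-t) , \tag{14.13}$$

is easily seen to be well behaved, and so for −1 < Re(z) < 0, 

$$r^*(z) = \Bigl(\sum_{k=0}^{\infty} k\, 2^{(k+1)z}\Bigr) f^*(z) = 2^{2z} (1 - 2^z)^{-2} f^*(z) . \tag{14.14}$$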


Next, to determine $f^*(z)$, we note that for Re(z) > 0 we have 

$$f^*(z) = \int_0^\infty f(t)\, t^{z-1}\, dt = \int_0^\infty e^{-t/2} t^{z-1}\, dt - \int_0^\infty e^{-t} t^{z-1}\, dt = (2^z - 1)\Gamma(z) . \tag{14.15}$$

By analytic continuation this relation holds for −1 < Re(z), and we find that for 
−1 < Re(z) < 0, 

$$r^*(z) = 2^{2z} (2^z - 1)^{-1} \Gamma(z) . \tag{14.16}$$
We now apply the inversion formula to obtain 

$$r(t) = \frac{1}{2\pi i} \int_{-1/2 - i\infty}^{-1/2 + i\infty} 2^{2z} (2^z - 1)^{-1} \Gamma(z)\, t^{-z}\, dz . \tag{14.17}$$

The integrand is a meromorphic function in the whole complex plane that drops off 
rapidly on any vertical line. We move the contour of integration to the line Re(z) = 1. 
The new integral is $O(t^{-1})$, and the residues at the poles (all on Re(z) = 0) will 
give the main contribution to r(t). There are first order poles at $z = 2\pi i m/\log 2$ 
for $m \in \mathbb{Z} \setminus \{0\}$ coming from $2^z = 1$, and a single second order pole at z = 0, since 
$\Gamma(z)$ has a first order pole there as well. A short computation of the residues gives 

$$r(t) = \log_2 t + \frac{\gamma}{\log 2} - \frac{3}{2} - \sum_{h \neq 0} (\log 2)^{-1}\, \Gamma\bigl(-2\pi i h (\log 2)^{-1}\bigr) \exp(2\pi i h \log_2 t) + O(t^{-1}) , \tag{14.18}$$

where the sum is over all nonzero integers h, $\gamma$ is Euler's constant, and the constant 
term $\gamma/\log 2 - 3/2$ is the part of the residue at the second order pole z = 0 that 
does not depend on t. 
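
The expansion of r(t) can be checked numerically. The sketch below sums the series (14.12) directly and compares it with the non-oscillating part log_2 t + γ/log 2 − 3/2 obtained from the second order pole at z = 0; the oscillating terms have amplitude of order 10^{-6}, so the two should agree to about five decimal places for large t. The function names and the truncation at 200 terms are illustrative assumptions, and Euler's constant is hard-coded because the standard `math` module does not provide it.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant (hard-coded)

def r_series(t, terms=200):
    """Direct evaluation of r(t) = sum_k k (exp(-t 2^(-k-2)) - exp(-t 2^(-k-1)))."""
    total = 0.0
    for k in range(terms):
        a = t * 2.0 ** (-k - 1)
        # expm1 avoids cancellation when a is tiny
        total += k * (math.expm1(-a / 2) - math.expm1(-a))
    return total

def smooth_part(t):
    """Non-oscillating part of the expansion: log_2 t + gamma/log 2 - 3/2."""
    return math.log2(t) + EULER_GAMMA / math.log(2) - 1.5
```

For example, `r_series(10**6) - smooth_part(10**6)` is a number of size roughly 10^{-6}, which is the periodic fluctuation.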


There are other ways to obtain the same expansion (14.18) for r(t) (cf. Guibas 
and Odlyzko 1980). The periodic oscillating component in r(t) is common in prob- 
lems involving recurrences over powers of 2. This happens, for example, in studies 
of register allocation and digital trees (Flajolet et al. 1979, Flajolet and Rich- 
mond 1992, Flajolet and Sedgewick 1986). The periodic function is almost always 
the same as the one in eq. (14.18), even when the combinatorics of the problem 
varies. Technically this is easy to explain, because of the closely related recurrences 
leading to similar Mellin transforms for the generating functions. 

Mellin transforms are useful in dealing with problems that combine combinatorial 
and arithmetic aspects. For example, if S(n) denotes the total number of 1’s 
in the binary representations of 1, 2, ..., n − 1, then it was shown by Delange that 

$$S(n) = \tfrac{1}{2}\, n \log_2 n + n\, u(\log_2 n) + o(n) \quad \text{as } n \to \infty , \tag{14.19}$$


where u(x) is a continuous, nowhere-differentiable function that satisfies u(x) = 
u(x + 1). The Fourier coefficients of u(x) are known explicitly. Perhaps the best 
way to obtain these results is by using Mellin transforms. See Flajolet et al. (1994) 
and Stolarsky (1977) for further information and references. 
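
The structure of (14.19) is easy to observe by brute force. The sketch below computes S(n) directly and the normalized remainder (S(n) − ½ n log_2 n)/n; because S(2n) = 2S(n) + n, this remainder takes the same value at n and 2n, which is the periodicity u(x) = u(x + 1) seen along powers of 2. The function names are illustrative.

```python
import math

def ones_below(n):
    """S(n): total number of 1 bits in the binary expansions of 1, 2, ..., n - 1."""
    return sum(bin(m).count("1") for m in range(1, n))

def fluctuation(n):
    """(S(n) - (1/2) n log2 n) / n, which (14.19) describes as u(log2 n) + o(1)."""
    return (ones_below(n) - 0.5 * n * math.log2(n)) / n
```

At n = 2^m the count is exactly m 2^{m−1}, so the fluctuation vanishes there.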

Mellin transforms are often combined with other techniques. For example, sums 
of the form $s_n = \sum_k a_k \binom{n}{k}$ with oscillating $a_k$ lead to generating functions 

$$s(z) = \sum_k a_k\, w(z)^k . \tag{14.20}$$


The asymptotic behavior of s(z) near its dominant singularity can sometimes be 
determined by applying Mellin transforms. For a detailed explanation of the ap- 
proach, see Flajolet et al. (1985). Examples of the application of this technique 
can be found in Andrews (1976) and Meinardus (1954). 
15. Functional equations, recurrences, and combinations of methods 


Most asymptotic enumeration results are obtained from combinations of techniques 
presented in the previous sections. However, it is only rarely that the basic 
asymptotic techniques can be applied directly. This section describes a variety of 
methods and results that are not easy to categorize. They use combinations of 
methods that have been presented before, and sometimes develop them further. 
In most of the examples that will be presented, some relations for generating 
functions are available, but no simple closed-form formulas, and the problem is 
to deduce where the singularities lie and how the generating functions behave in 
their neighborhoods. Once that task is done, the previous methods can be applied 
to obtain asymptotics of the coefficients. 


15.1. Implicit functions, graphical enumeration, and related topics 


Example 15.1 (Rooted unlabeled trees). We sketch a proof that $T_n$, the number of 
rooted unlabeled trees with n vertices, satisfies the asymptotic relation (1.9). The 
functional equation (1.8) holds with T(z) regarded as a formal power series. The 
first step is to show that T(z) is analytic in a neighborhood of 0. This can be done 
by working exclusively with eq. (1.8). [There is an argument of this type in Harary 
and Palmer (1973, section 9.5).] Another way to prove analyticity of T(z) is to 
use combinatorics to obtain crude upper bounds for $T_n$. We use a combination of 
these approaches. If a tree with n ≥ 2 vertices has at least two subtrees at the root, 
we can decompose it into two trees, the first consisting of one subtree at the root, 
the other of the root and the remaining subtrees. This shows that 

$$T_n \le T_{n-1} + \sum_{k=1}^{n-1} T_k T_{n-k} , \quad n \ge 2 . \tag{15.1}$$


Therefore, if we define $a_1 = 1$, and 

$$a_n = a_{n-1} + \sum_{k=1}^{n-1} a_k a_{n-k} , \quad n \ge 2 , \tag{15.2}$$

then we have $T_n \le a_n$. Now if 

$$A(z) = \sum_{n=1}^{\infty} a_n z^n ,$$

then the defining relation (15.2) yields the functional equation 

$$A(z) - z = zA(z) + A(z)^2 , \tag{15.3}$$

so that 

$$A(z) = \bigl(1 - z - (1 - 6z + z^2)^{1/2}\bigr)/2 . \tag{15.4}$$

Since A(z) is analytic in $|z| < 3 - 2\sqrt{2} = 0.17157\ldots$, we have 

$$0 < T_n \le a_n = O(6^n) . \tag{15.5}$$
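
The bound (15.5) can be confirmed for small n by direct computation. The sketch below generates the $a_n$ from (15.2) and the $T_n$ from the standard divisor-sum recurrence $n T_{n+1} = \sum_{k=1}^{n} \bigl(\sum_{d \mid k} d\, T_d\bigr) T_{n-k+1}$, which is equivalent to (1.8) but is not stated in the text above, so it is an assumption of this example. The function names are illustrative.

```python
def upper_bound_sequence(N):
    """a_1..a_N from (15.2): a_1 = 1, a_n = a_{n-1} + sum_{k=1}^{n-1} a_k a_{n-k}."""
    a = [0, 1]
    for n in range(2, N + 1):
        a.append(a[n - 1] + sum(a[k] * a[n - k] for k in range(1, n)))
    return a

def rooted_trees(N):
    """T_1..T_N (rooted unlabeled trees) via the divisor-sum recurrence."""
    T = [0, 1]
    for n in range(1, N):
        total = 0
        for k in range(1, n + 1):
            divisor_sum = sum(d * T[d] for d in range(1, k + 1) if k % d == 0)
            total += divisor_sum * T[n - k + 1]
        T.append(total // n)   # the division is exact
    return T
```

The first values are T = 1, 1, 2, 4, 9, 20, ... and a = 1, 2, 6, 22, 90, ..., so the crude bound is already far from tight at n = 5.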


It will also be convenient to have an exponential lower bound for $T_n$. Let $b_n$ 
be the number of rooted unlabeled trees in which every internal vertex has ≤ 2 
subtrees. Then $b_1 = 1$, $b_2 = 1$, and 

$$b_n \ge \sum_{k=1}^{\lfloor (n-1)/2 \rfloor} b_k b_{n-k-1} \quad \text{for } n \ge 3 . \tag{15.6}$$

We use this to show that $b_n \ge (6/5)^n$ for n ≥ 7. Direct computation establishes this 
lower bound for 7 ≤ n ≤ 14, and for n ≥ 15 we use induction and $b_n \ge b_k b_{n-k-1}$ 
with $k = \lfloor (n-1)/2 \rfloor$. 

Since $T_n \ge b_n \ge (6/5)^n$, T(z) converges only in |z| < r for some r with r < 1. 
Since T(0) = 0, $|T(z)| \le C_\delta |z|$ in $|z| \le r - \delta$ for every $\delta > 0$, and therefore 

$$u(z) = \sum_{k=2}^{\infty} T(z^k)/k \tag{15.7}$$

is analytic in $|z| < r^{1/2}$, and in particular at z = r. Therefore, although we know 
little about r and u(z), we see that T(z) satisfies G(z, T(z)) = T(z), where 

$$G(z, w) = z \exp(w + u(z)) \tag{15.8}$$

is analytic in z and w for all w and for $|z| < r^{1/2}$. 


We will apply Theorem 10.13. First, though, we need to establish additional 
properties of T(z). We have 


$$T(z) \exp(-T(z)) = z \exp(u(z)) \to r \exp(u(r)) \quad \text{as } z \to r^- , \tag{15.9}$$

and $0 < r \exp(u(r)) < \infty$. Since T(z) is positive and increasing for 0 < z < r, T(r), 
the limit of T(z) as $z \to r^-$, must exist and be finite. 
We next show that T(r) = 1. We have 

$$\frac{\partial}{\partial w} G(z, w) = G(z, w) . \tag{15.10}$$

We know that G(z, T(z)) = T(z) for |z| < r, and in particular for some z arbitrarily 
close to r. If T(r) ≠ 1, then by (15.10) 

$$\frac{\partial}{\partial w} \bigl(G(z, w) - w\bigr) \Big|_{w = T(z)} \neq 0 \tag{15.11}$$

in a neighborhood of z = r, and therefore T(z) could be continued analytically to 
a neighborhood of z = r. This is impossible, since r is the radius of convergence 
of T(z), and $T_n \ge 0$ implies by Theorem 10.4 that T(z) has a singularity at z = r. 
Therefore we must have T(r) = 1, and $G_w(r, T(r)) = 1$. 

We have now shown that conditions (i) and (ii) of Theorem 10.13 hold with the r 
of that theorem the same as the r we have defined and s = T(r) = 1, $\delta = r^{1/2} - r$. 
Condition (iii) is easy to verify. Finally, the conditions on the coefficients of T(z) 
and G(z, w) are clearly satisfied. 

Since Theorem 10.13 applies, we do obtain an asymptotic expansion for $T_n$ of the 
form (1.9), with C given by the formula (10.64). It still remains to determine r and 
C. No closed-form expressions are known for these constants. They are conjectured 
to be transcendental and algebraically independent of standard constants such as π 
and e, but no proof is available. Numerically, however, they are simple to compute. 
Note that 

$$G_z(r, 1) = \exp(1 + u(r))\,(1 + r u'(r)) = r^{-1} + u'(r) , \tag{15.12}$$

$$G_{ww}(r, 1) = 1 , \tag{15.13}$$

so we only need to compute r and u'(r). These quantities can be computed along 
with u(r) in the same procedure. The basic numerical procedure is to determine r as 
the positive solution to T(r) = 1. To determine T(x) for any positive x, we take any 
approximation to the $T(x^k)$, k ≥ 1 (starting initially with $x^k$ as an approximation 
to $T(x^k)$, say), and combine it with (1.8) (applied with $z = x^m$, m ≥ 1) to obtain 
improved approximations. This procedure can be made rigorous. Upper bounds 
for r, u(r), and u'(r) are especially easy. Since $T_1 = 1$, T(x) ≥ x for 0 < x < 1, 
and therefore $T(x^k) \ge x^k$ for k ≥ 1. Suppose that we start with a fixed value of x 
and derive some lower bounds of the form $T(x^k) \ge u_k^{(1)} > 0$ for k ≥ 1. Then the 
functional equation (1.8) implies 

$$T(x^m) \ge u_m^{(2)} = x^m \exp\Bigl(\sum_{k=1}^{\infty} u_{km}^{(1)}/k\Bigr) , \quad m \ge 1 . \tag{15.14}$$

This process can be iterated several more times, and to keep the computation 
manageable, we can always set $u_k^{(j)} = 0$ for $k \ge k_0$. If we ever find a lower bound 
T(x) > 1 by this process, then we know that r < x, since T(r) = 1. Lower bounds 
for r are slightly more complicated. □ 
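
A minimal non-rigorous sketch of this computation follows. It iterates (1.8) at the points $x^m$, m ≥ 2, starting from $x^m$ as in the text, to approximate u(x) from (15.7). It then locates r by bisection on the condition r exp(1 + u(r)) = 1, which is (15.9) with T(r) = 1. Using this reformulation rather than iterating T(x) itself near its singularity is a choice made here for numerical convenience; the function names, the truncation `kmax`, and the bracketing interval [0.3, 0.4] are likewise assumptions of the example.

```python
import math

def u_of(x, kmax=50, sweeps=60):
    """Approximate u(x) = sum_{k>=2} T(x^k)/k by iterating (1.8) at the points x^m, m >= 2.

    t[m] approximates T(x^m); values with index above kmax are treated as 0,
    in the spirit of setting u_k = 0 for k >= k_0 in the text.
    """
    t = [0.0] + [x ** m for m in range(1, kmax + 1)]
    for _ in range(sweeps):
        new = t[:]
        for m in range(2, kmax + 1):
            s = sum(t[k * m] / k for k in range(1, kmax // m + 1))
            new[m] = x ** m * math.exp(s)
        t = new
    return sum(t[k] / k for k in range(2, kmax + 1))

def find_r(lo=0.3, hi=0.4, steps=40):
    """Bisection for the root of x * exp(1 + u(x)) = 1, i.e. T(x) = 1."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * math.exp(1 + u_of(mid)) > 1:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2
```

Calling `find_r()` should reproduce the leading digits of the radius of convergence quoted later in this section.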


We mention here that if $U_n$ denotes the number of unlabeled trees, then the 
ordinary generating function $U(z) = \sum U_n z^n$ satisfies 

$$U(z) = T(z) - T(z)^2/2 + T(z^2)/2 . \tag{15.15}$$

Using the results from Example 15.1 about the analytic behavior of T(z), it can 
be shown that 

$$U_n \sim C' r^{-n} n^{-5/2} , \tag{15.16}$$

where r = 0.3383219... is the same as before, while C' = 0.53494852... 


Example 15.2 (Leftist trees). Let $a_n$ denote the number of leftist trees of size n 
[i.e., rooted planar trees with n leaves, such that in any subtree S, the leaf nearest 
to the root of S is in the right subtree of S (Knuth 1973b)]. Then $a_1 = a_2 = a_3 = 1$, 
$a_4 = 2$, $a_5 = 4$. No explicit formula for $a_n$ is known. Even the recurrences for the 
$a_n$ are complicated, and involve auxiliary sequences. If 

$$f(z) = \sum_{n=1}^{\infty} a_n z^n \tag{15.17}$$

denotes the ordinary generating function of $a_n$, then the combinatorially derived 
recurrences for the $a_n$ show that (Kemp 1987) 

$$f(z) = z + \tfrac{1}{2} f(z)^2 + \tfrac{1}{2} \sum_{m=1}^{\infty} g_m(z)^2 , \tag{15.18}$$

where the auxiliary generating functions $g_m(z)$ (which enumerate leftist trees with 
the leftmost leaf at distance m − 1 from the root) satisfy 

$$g_1(z) = z , \quad g_2(z) = z f(z) , \quad g_{m+1}(z) = g_m(z) \Bigl[ f(z) - \sum_{j=1}^{m-1} g_j(z) \Bigr] , \quad m \ge 2 , \tag{15.19}$$

and 

$$f(z) = \sum_{m=1}^{\infty} g_m(z) . \tag{15.20}$$


These generating-function relations might not seem promising. If r is the smallest 
singularity of f(z), then $\sum g_m(z)^2$ is not analytic at r, so we cannot apply 
Theorem 10.13 in the way it was used in Example 15.1. However, Kemp (1987) 
has sketched a proof that the analytic behavior of f(z) is of the same type as 
that involved in functions covered by Theorem 10.13, so that it has a dominant 
square-root singularity, and therefore 

$$a_n = \alpha c^n n^{-3/2} + O(c^n n^{-5/2}) , \tag{15.21}$$

where 

$$\alpha = 0.250363429\ldots , \quad c = 2.749487902\ldots . \tag{15.22}$$

The constants α and c are not known explicitly in terms of other standard numbers 
such as π or e, but they can be computed efficiently. The $\alpha c^n n^{-3/2}$ term in (15.21) 
gives an approximation to $a_n$ that is accurate to within 4% for n = 10, and within 
0.4% for n = 100. Thus asymptotic methods yield an approximation to $a_n$ which is 
satisfactory for many applications. Further results about leftist trees can be found 
in Kemp (1990). □ 
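
The relations (15.19) and (15.20) determine the $a_n$ even though they look circular, because the coefficient of $z^n$ in each $g_m$ with m ≥ 2 depends only on lower-order coefficients of f. The sketch below exploits this with truncated power series, repeating the substitution n times so that all coefficients up to $z^n$ are fixed. The helper names are illustrative, and this is only one straightforward way to organize the computation.

```python
def mul(p, q, n):
    """Product of two coefficient lists, truncated to degree n."""
    r = [0] * (n + 1)
    for i, pi in enumerate(p):
        if pi:
            for j, qj in enumerate(q):
                if i + j > n:
                    break
                r[i + j] += pi * qj
    return r

def leftist_counts(n):
    """Coefficients a_0..a_n of f(z), obtained from (15.19) and (15.20); requires n >= 1."""
    z = [0, 1] + [0] * (n - 1)
    f = z[:]                                   # crude start: f = z
    for _ in range(n):                         # each pass fixes at least one more coefficient
        g_m = mul(z, f, n)                     # g_2 = z f
        below = z[:]                           # sum of g_j for j < m, with m = 2
        total = [a + b for a, b in zip(z, g_m)]
        while any(g_m):
            g_next = mul(g_m, [a - b for a, b in zip(f, below)], n)
            below = [a + b for a, b in zip(below, g_m)]
            g_m = g_next
            total = [a + b for a, b in zip(total, g_m)]
        f = total
    return f
```

The computed values start 1, 1, 1, 2, 4, 8, matching the initial terms quoted in the example, and the value at n = 10 can be compared with $\alpha c^{10} 10^{-3/2}$.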
15.2. Nonlinear iteration and tree parameters 


Example 15.3 (Heights of binary trees). A binary tree (Knuth 1973b) is a rooted 
tree with unlabeled nodes, in which each node has 0 or 2 successors, and left 
and right successors are distinguished. The size of a binary tree is the number of 
internal nodes, i.e., the number of nodes with two successors. We let $B_n$ denote the 
number of binary trees of size n, so that $B_0 = 1$ (by convention), $B_1 = 1$, $B_2 = 2$, 
$B_3 = 5$, .... Let 

$$B(z) = \sum_{n=0}^{\infty} B_n z^n . \tag{15.23}$$

Since each nonempty binary tree consists of the root and two binary trees (the left 
and right subtrees), we obtain the functional equation 

$$B(z) = 1 + zB(z)^2 . \tag{15.24}$$

This implies that 

$$B(z) = \frac{1 - (1 - 4z)^{1/2}}{2z} , \tag{15.25}$$

so that 

$$B_n = \frac{1}{n+1} \binom{2n}{n} , \tag{15.26}$$

and the $B_n$ are the Catalan numbers. The formula (4.4) [easily derivable from 
Stirling’s formula (4.1)] shows that 

$$B_n \sim \pi^{-1/2} n^{-3/2} 4^n \quad \text{as } n \to \infty . \tag{15.27}$$

The height of a binary tree is the number of nodes along the longest path from 
the root to a leaf. The distribution of heights in binary trees of a given size does 
not have exact formulas like that of (15.26) for the number of binary trees of a 
given size. There are several problems on heights that have been answered only 
asymptotically, and with varying degrees of success. The most versatile approach 
is through recurrences on generating functions. Let $B_{h,n}$ be the number of binary 
trees of size n and height ≤ h, and let 

$$b_h(z) = \sum_{n=0}^{\infty} B_{h,n} z^n . \tag{15.28}$$

Then 

$$b_0(z) = 0 , \quad b_1(z) = 1 , \tag{15.29}$$

and an extension of the argument that led to the relation (15.24) yields 

$$b_{h+1}(z) = 1 + z\, b_h(z)^2 , \quad h \ge 0 . \tag{15.30}$$

The $b_h(z)$ are polynomials in z of degree $2^{h-1} - 1$ for h ≥ 1. Unfortunately there 
is no simple formula for them like eq. (15.25) for B(z), and one has to work with 
the recurrence (15.30) to obtain many of the results about heights of binary trees. 
Different problems involve study of the recurrence in different ranges of values of 
z, and the behavior of the recurrence varies drastically. 
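
The recurrence (15.30) is simple to iterate exactly for small h. The following sketch builds the polynomials $b_h(z)$ as coefficient lists, which makes it possible to confirm the stated degree and to see that the coefficient of $z^n$ already equals the Catalan number $B_n$ once h ≥ n + 1 (a tree of size n has height at most n + 1). The function name is an illustrative choice.

```python
from math import comb

def height_bounded_polys(H):
    """Return [b_0, b_1, ..., b_H] as coefficient lists, using b_0 = 0 and (15.30)."""
    b = [0]
    polys = [b]
    for _ in range(H):
        square = [0] * (2 * len(b) - 1)
        for i, bi in enumerate(b):
            for j, bj in enumerate(b):
                square[i + j] += bi * bj
        b = [1] + square                 # 1 + z * b_h(z)^2
        while len(b) > 1 and b[-1] == 0: # drop trailing zeros so len(b) - 1 is the degree
            b.pop()
        polys.append(b)
    return polys
```

For instance, the third polynomial comes out as 1 + z + 2z^2 + z^3: of the five trees of size 3, only the balanced one has height at most 3.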

For any fixed z with |z| ≤ 1/4, $b_h(z) \to B(z)$ as $h \to \infty$. For |z| > 1/4 the 
behavior of $b_h(z)$ is more complicated, and is a subject of nonlinear dynamics 
(Devaney 1989). (It is closely related to the study of the Mandelbrot set.) For any 
real z with z > 1/4, $b_h(z) \to \infty$ as $h \to \infty$. To study the distribution of the $B_{h,n}$ 
as n varies for h fixed, but large, it is necessary to investigate this range of rapid 
growth. It can be shown (Flajolet and Odlyzko 1984) that for any $\lambda_1$ and $\lambda_2$ with 
$0 < \lambda_1 < \lambda_2 < 1/2$, 

$$B_{h,n} = \frac{\exp\bigl(2^{h-1} (\beta(r) - r\beta'(r) \log r)\bigr)}{r\, 2^{(h-1)/2} \bigl(2\pi (r^2 \beta''(r) + r\beta'(r))\bigr)^{1/2}} \bigl(1 + O(2^{-h/2})\bigr) \tag{15.31}$$

uniformly as $h, n \to \infty$ with 

$$\lambda_1 < n/2^h < \lambda_2 , \tag{15.32}$$

where the function β(x) is defined for 1/4 < x < ∞ by 

$$\beta(x) = \log x + \sum_{j=1}^{\infty} 2^{-j} \log\Bigl(1 + \frac{1}{b_{j+1}(x) - 1}\Bigr) , \tag{15.33}$$

and r is the unique solution in (1/4, ∞) to 

$$r\beta'(r) = n 2^{-h+1} . \tag{15.34}$$

The formula (15.31) might appear circular, in that it describes the behavior 
of the coefficients $B_{h,n}$ of the polynomial $b_h(z)$ in terms of the function β(z), 
which is defined by $b_h(z)$ and all the other $b_j(z)$. However, the series (15.33) for 
β(z) converges rapidly, so that only the first few of the $b_j(z)$ matter in obtaining 
approximate answers, and computation using (15.33) is efficient. The function β(z) 
is analytic in a region containing the real half-line x > 1/4, so the behavior of the 
$B_{h,n}$ is smooth. It is also known (Flajolet and Odlyzko 1984) that the behavior 
of $B_{h,n}$ as a function of n is Gaussian near the peak, which occurs at $n \sim 2^{h-1} \cdot 0.628968\ldots$. 
The distribution of $B_{h,n}$ is not Gaussian throughout the range (15.32), 
though. 
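
The constant 0.628968... can be recovered from the recurrence alone, under the following reading (an interpretation made for this example, not a statement from the text): the peak corresponds to the saddle point r = 1, so that by (15.34) the constant is β'(1). Differentiating the logarithm of the approximation $b_h(z) \approx \exp(2^{h-1}\beta(z) - \log z)$ at z = 1 suggests $b_h'(1)/b_h(1) \approx 2^{h-1}\beta'(1) - 1$, and both $b_h(1)$ and $b_h'(1)$ follow exactly from (15.30). The function name is illustrative.

```python
def peak_constant(h):
    """Estimate beta'(1) as (b_h'(1) / b_h(1) + 1) / 2^(h-1), using exact integers.

    Uses b_{h+1}(1) = 1 + b_h(1)^2 and b_{h+1}'(1) = b_h(1)^2 + 2 b_h(1) b_h'(1).
    """
    b, d = 1, 0                                   # b_1(1) = 1, b_1'(1) = 0
    for _ in range(h - 1):
        b, d = 1 + b * b, b * b + 2 * b * d       # right-hand side uses the old b and d
    return (d / b + 1) / 2 ** (h - 1)
```

The estimate stabilizes very quickly: already at h = 6 it agrees with the quoted constant to six decimal places, which also illustrates how fast the series for β converges.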

The proof of the estimate (15.31) is derived from the estimate 

$$b_h(z) = \exp\bigl(2^{h-1} \beta(z) - \log z\bigr) \bigl(1 + O(\exp(-\epsilon 2^h))\bigr) , \tag{15.35}$$

valid in a region along the half-axis x > 1/4. The estimates for the coefficients 
$B_{h,n}$ are obtained by applying the saddle point method. Because of the doubly-exponential 
rate of growth of $b_h(z)$ for z close to the real axis, it is easy to show 
that on the circle of integration, the region away from the real axis contributes 
a negligible amount to $B_{h,n}$. The relation (15.35) is sufficient, together with the 
smoothness properties of β(z), to estimate the contribution of the integral near 
the real axis. To prove (15.35), one proceeds as in Example 9.8. However, greater 
care is required because of the complex variables that occur and the need for 
estimates that are uniform in the variables. The basic recurrence (15.30) shows 
that 

$$\log b_{h+1}(z) = 2 \log b_h(z) + \log z + \log\Bigl(1 + \frac{1}{z\, b_h(z)^2}\Bigr) = 2 \log b_h(z) + \log z + \log\Bigl(1 + \frac{1}{b_{h+1}(z) - 1}\Bigr) . \tag{15.36}$$

Iterating this relation, we find that for h ≥ 1, 

$$\log b_{h+1}(z) = 2^{h} \log b_1(z) + (2^{h} - 1) \log z + \sum_{k=0}^{h-1} 2^{k} \log\Bigl(1 + \frac{1}{b_{h+1-k}(z) - 1}\Bigr) = 2^{h} \Bigl(\log z + \sum_{j=1}^{h} 2^{-j} \log\Bigl(1 + \frac{1}{b_{j+1}(z) - 1}\Bigr)\Bigr) - \log z . \tag{15.37}$$


The basic equation (15.35) then follows. The technical difficulty is in establishing 
rigorous bounds for the error terms in the approximations. Details are presented 
in Flajolet and Odlyzko (1984). 

Most of the binary trees of a given height h are large, with about $0.3 \cdot 2^h$ internal 
nodes. This might give the misleading impression that most binary trees are close 
to the full binary tree of a similar size. However, if we consider all binary trees of a 
given size n, the average height is on the order of $n^{1/2}$, so that they are far from the 
full balanced binary trees. The methods that are used to study the average height 
are different from those used for trees of a fixed height. The basic approach of 
Flajolet and Odlyzko (1984) is to let 

$$H_n = \sum_{T,\ |T| = n} \mathrm{ht}(T) ,$$

where the sum is over the binary trees T of size n, and ht(T) is the height of T. 
Then the average height is just $H_n/B_n$. 
The generating function for the $H_n$ is 

$$H(z) = \sum_{n \ge 0} H_n z^n = \sum_{h \ge 0} \bigl(B(z) - b_h(z)\bigr) , \tag{15.38}$$


and the analysis of Flajolet and Odlyzko (1984) proceeds by investigating the 
behavior of H(z) in a wedge-shaped region of the type encountered in section 11.1. 
If we let 

$$\epsilon(z) = (1 - 4z)^{1/2} , \tag{15.39}$$

$$e_h(z) = \bigl(B(z) - b_h(z)\bigr)/(2B(z)) , \tag{15.40}$$

then the recurrence (15.30) yields 

$$e_{h+1}(z) = (1 - \epsilon(z))\, e_h(z) (1 - e_h(z)) , \quad e_0(z) = 1/2 . \tag{15.41}$$

Extensive analysis of this relation yields an approximation to $e_h(z)$ of the form 

$$e_h(z) \approx \frac{\epsilon(z) (1 - \epsilon(z))^h}{1 - (1 - \epsilon(z))^h} , \tag{15.42}$$

valid for $|\epsilon(z)|$ sufficiently small, $|\mathrm{Arg}\, \epsilon(z)| < \pi/4 + \delta$ for a fixed δ > 0. [The precise 
error terms in this approximation are complicated, and are given in Flajolet 
and Odlyzko (1984).] This then leads to an expansion for H(z) in a sector 
$|z - 1/4| < \alpha$, $\pi/2 - \beta < |\mathrm{Arg}(z - 1/4)| < \pi/2 + \beta$ of the form 

$$H(z) = -2 \log(1 - 4z) + K + O(|1 - 4z|^{v}) , \tag{15.43}$$

where v is any constant, v < 1/4, and K is a fixed constant. Transfer theorems of 
section 11.1 now yield the asymptotic estimate 

$$H_n \sim 2 n^{-1} 4^n \quad \text{as } n \to \infty . \tag{15.44}$$

When we combine (15.44) with (15.27), we obtain the desired result that the average 
height of a binary tree of size n is $\sim 2(\pi n)^{1/2}$ as $n \to \infty$. 
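
For moderate n the average height can be computed exactly from the sum over h of $B(z) - b_h(z)$, which gives a feel for how slowly the asymptotic value $2(\pi n)^{1/2}$ is approached. The sketch below works with power series truncated at degree N and stops once $b_h$ agrees with B up to that degree, which happens at h = N + 1. The function name is illustrative.

```python
import math

def average_heights(N):
    """Return [H_n / B_n for n = 0..N], with H_n summed from the coefficients of B - b_h."""
    B = [math.comb(2 * n, n) // (n + 1) for n in range(N + 1)]
    H = [0] * (N + 1)
    b = [0] * (N + 1)                       # b_0(z) = 0, truncated at degree N
    while b != B:
        for n in range(N + 1):
            H[n] += B[n] - b[n]             # trees of size n with height > h
        square = [0] * N                    # coefficients of b^2 up to degree N - 1
        for i in range(N):
            if b[i]:
                for j in range(N - i):
                    square[i + j] += b[i] * b[j]
        b = [1] + square                    # recurrence (15.30), truncated
    return [H[n] / B[n] for n in range(N + 1)]
```

The first values are 1, 2, 3 and 19/5; dividing `average_heights(100)[100]` by `2 * math.sqrt(100 * math.pi)` shows how far n = 100 still is from the limiting behavior.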

Distribution results about heights of binary trees can be obtained by investigating 
the generating functions 

$$\sum_{h \ge 0} h^r \bigl(B(z) - b_h(z)\bigr) . \tag{15.45}$$

This procedure, carried out in Flajolet and Odlyzko (1984) by using modifications 
of the approach sketched above for the average height, obtains asymptotics of 
the moments of heights. The method mentioned in section 6.5 then leads to a 
determination of the distribution. However, the resulting estimates do not say much 
about heights far away from the mean. A more careful analysis of the behavior of 
$e_h(z)$ can be used (Flajolet et al. 1993a) to show that if $x = h/(2n^{1/2})$, then 

$$\frac{B_{h,n} - B_{h-1,n}}{B_n} \sim 2x n^{-1/2} \sum_{m=1}^{\infty} m^2 (2m^2 x^2 - 3) e^{-m^2 x^2} \tag{15.46}$$

as $n, h \to \infty$, uniformly for $x = o((\log n)^{1/2})$, $x^{-1} = o((\log n)^{1/2})$. 

For extremely small and large heights, different methods are used. It follows 
from Flajolet et al. (1993a) that 

$$\frac{B_{h,n} - B_{h-1,n}}{B_n} = O\bigl(\exp(-c(h^2/n + n/h^2))\bigr) \tag{15.47}$$

for a constant c > 0, which shows that extreme heights are infrequent. [The estimates 
in Flajolet et al. (1993a) are more precise than (15.47).] Bounds of the 
above form for small heights are obtained in Flajolet et al. (1993a) by studying 
the behavior of the $b_h(z)$ almost on the boundary between convergence and divergence, 
using the methods of Wright et al. (1986). Let $x_h$ be the unique positive 
root of $b_h(z) = 2$. Note that B(1/4) = 2, and each coefficient of the $b_h(z)$ is nondecreasing 
as $h \to \infty$. Therefore $x_2 > x_3 > \cdots > 1/4$. More effort shows (Flajolet 
et al. 1993a) that $x_h$ is approximately $1/4 + \alpha h^{-2}$ for a certain α > 0. This leads 
to an upper bound for $B_{h,n}$ by Lemma 8.4. Bounds for trees of large heights are 
even easier to obtain, since they only involve upper bounds for the $b_h(z) - b_{h-1}(z)$ 
inside the disk of convergence |z| < 1/4. □ 


In addition to the methods of Flajolet and Odlyzko (1982, 1984) and Flajolet 
et al. (1993a) that were mentioned above, there are also other techniques for 
studying heights of trees, such as those of Brown and Shubert (1984) and Rényi 
and Szekeres (1967). However, there are problems about obtaining fully rigorous 
proofs that way. [See the remarks in Flajolet et al. (1993a) on this topic.] Most 
of these methods can be extended to study related problems, such as those of 
diameters of trees (Szekeres 1982). 

The results of Example 15.3 can be extended to other families of trees (cf. 
Flajolet and Odlyzko 1982, 1984, Flajolet et al. 1993a). What matters in obtaining 
results such as those of the above example are the form of the recurrences, and 
especially the positivity of the coefficients. 


Example 15.4 (Enumeration of 2,3-trees (Odlyzko 1982)). Height-balanced trees 
satisfy different functional equations than unrestricted trees, which results in different 
analytic behavior of the generating functions, and different asymptotics. Consider 
2,3-trees; i.e., rooted, oriented trees such that each nonleaf node has either 
two or three successors, and in which all root-to-leaf paths have the same length. 
If $a_n$ is the number of 2,3-trees with exactly n leaves, then $a_1 = a_2 = a_3 = a_4 = 1$, 
$a_5 = 2$, ..., and the generating function 

$$f(z) = \sum_{n=1}^{\infty} a_n z^n \tag{15.48}$$

satisfies the functional equation 

$$f(z) = z + f(z^2 + z^3) . \tag{15.49}$$

Iteration of the recurrence (15.49) leads to 

$$f(z) = \sum_{k=0}^{\infty} Q_k(z) , \tag{15.50}$$

where $Q_0(z) = z$, $Q_{k+1}(z) = Q_k(z^2 + z^3)$, provided the series (15.50) converges. 
The Taylor series (15.48) converges only in $|z| < \phi^{-1}$, where $\phi = (1 + 5^{1/2})/2$ is the 
“golden ratio”. Study of the polynomials $Q_k(z)$ shows that the expansion (15.50) 
converges in a region 

$$D = \{z : |z| < \phi^{-1} + \delta ,\ |\mathrm{Arg}(z - \phi^{-1})| > \pi/2 - \epsilon\} \tag{15.51}$$

for certain δ, ε > 0, and that inside D, 

$$f(z) = -c \log(\phi^{-1} - z) + w(\log(\phi^{-1} - z)) + O(|\phi^{-1} - z|) , \tag{15.52}$$

where $c = [\phi \log(4 - \phi)]^{-1}$, and w(t) is a nonconstant function, analytic in a strip 
|Im(t)| < η for some η > 0, such that $w(t + \log(4 - \phi)) = w(t)$. The expression 
(15.52) only has to be proved in a small vicinity of $\phi^{-1}$ (intersected with D, of 
course). Since, with $Q(z) = z^2 + z^3$, 

$$Q(\phi^{-1} + v) = \phi^{-1} + (4 - \phi) v + O(|v|^2) \tag{15.53}$$

(so that $\phi^{-1}$ is a repelling fixed point of Q), behavior like that of (15.52) is to 
be expected, and with additional work can be rigorously shown to hold. Once the 
expansion (15.52) is established, singularity analysis techniques can then be applied 
to deduce that 

$$a_n \sim \frac{\phi^n}{n}\, u(\log n) \quad \text{as } n \to \infty , \tag{15.54}$$

where u(t) is a positive nonconstant continuous function that satisfies $u(t) = u(t + \log(4 - \phi))$, 
and has mean value $(\phi \log(4 - \phi))^{-1}$. For details, see Odlyzko (1982). 
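
Extracting the coefficient of $z^n$ from f(z) = z + f(z^2 + z^3) gives a simple recurrence: a tree with n leaves arises from a tree with k leaves by giving each of those k leaves two or three children, and there are $\binom{k}{n-2k}$ ways to choose which of them get three. The sketch below uses this to tabulate the $a_n$; the function name is illustrative, and the derivation of the recurrence from the functional equation is the only ingredient beyond the text.

```python
from math import comb

def two_three_trees(N):
    """a_0..a_N, where a_n is the number of 2,3-trees with n leaves (a_0 = 0)."""
    a = [0] * (N + 1)
    if N >= 1:
        a[1] = 1
    for n in range(2, N + 1):
        # the parent level has k leaves with 2k <= n <= 3k
        a[n] = sum(a[k] * comb(k, n - 2 * k) for k in range((n + 2) // 3, n // 2 + 1))
    return a
```

Tabulating $n\, a_n \phi^{-n}$ for a range of n shows the bounded oscillation described by the function u in the asymptotic formula.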


The same methods can be applied to related families of trees, such as those of 
B-trees. □ 


The results of Example 15.3 and the generalizations mentioned above all ap- 
ply only to the standard counting models, in which all trees with a fixed value of 
some simple property, such as size or height, are equally likely. Often, especially 
in computer-science applications, it is necessary to study trees produced by some 
algorithm, and consider all outputs of this algorithm as equally likely. For exam- 
ple, in sorting it is natural to consider all permutations of n elements as equally 
probable. If random permutations are used to construct binary search trees, then 
the distribution of heights will be different from that in the standard model, and 
the two trees of maximal height will have probability of 2/n! of occurring. The 
average height turns out to be ~ c log n as n → ∞, for c = 4.311..., a certain 
constant given as a solution to a transcendental equation. This was shown by Devroye 
(1986) (see also Devroye 1987) by an application of the theory of branching 
processes. For a detailed exposition of this method and other applications to similar 
problems, see Mahmoud (1992). The basic generating-function approach that 
we have used in most of this chapter leads to functional iterations which have not 
been solved so far. 


15.3. Differential and integral equations 


Section 9.2 showed that differential equations arise naturally in analyzing linear re- 
currences of finite order with rational coefficients. There are other settings where 
they arise even more naturally. As is true of nonlinear iterations in the previous 
section and the functional equations of the next one, differential and integral 
equations are typically used to extract information about singularities of generating 
functions. We have already seen in Example 9.4 and other cases that differential 
equations can yield an explicit formula for the generating function, from which it 
is easy to deduce what the singularities are and how they affect the asymptotics 
of the coefficients. Most differential equations do not have a closed-form solution. 
However, it is often still possible to derive the necessary information about analytic 
behavior even when there is no explicit formula for the solution. We demonstrate 
this with a brief sketch of a recent analysis of this type (Flajolet and Lafforgue 
1994). Other examples can be found in Mahmoud (1992). 


Example 15.5 (Search costs in quadtrees (Flajolet and Lafforgue 1994)). Quadtrees 
are a well-known data structure for multidimensional data storage (Gonnet 
and Baeza-Yates 1991). Consider a d-dimensional data space, and let n points be 
drawn independently from the uniform distribution in the d-dimensional unit cube. 
We take d fixed and n → ∞. Suppose that the first n − 1 points have already been 
inserted into the quadtree, and let $D_n$ be the search cost (defined as the number of 
internal nodes traversed) in inserting the nth item. The result of Flajolet and Lafforgue 
(1994) is that $D_n$ converges in distribution to a Gaussian law when n → ∞. 
If $\mu_n$ and $\sigma_n$ denote the mean and standard deviation of $D_n$, respectively, then 

$$\mu_n \sim 2 d^{-1} \log n , \quad \sigma_n \sim d^{-1} (2 \log n)^{1/2} \quad \text{as } n \to \infty , \tag{15.55}$$

and for all real α < β, as n → ∞, 

$$\Pr(\alpha \sigma_n < D_n - \mu_n < \beta \sigma_n) \sim (2\pi)^{-1/2} \int_{\alpha}^{\beta} \exp(-x^2/2)\, dx . \tag{15.56}$$

The results for $\mu_n$ and $\sigma_n$ had been known before, and required much simpler 
techniques for their solution, see Mahmoud (1992). It was only necessary to study 
asymptotics of ordinary differential equations in a single variable. To obtain distribution 
results for search costs, it was necessary to study bivariate generating 
functions. The basic relation is 

$$\sum_k \Pr\{D_n = k\}\, u^k = (2^d u - 1)^{-1} \bigl(\phi_n(u) - \phi_{n-1}(u)\bigr) , \tag{15.57}$$

where the polynomials $\phi_n(u)$ have the bivariate generating function 

$$\Phi(u, z) = \sum_{n=0}^{\infty} \phi_n(u) z^n \tag{15.58}$$

which satisfies the integral equation 

$$\Phi(u, z) = 1 + 2^d u \int_0^{z} \frac{dx_1}{x_1 (1 - x_1)} \int_0^{x_1} \frac{dx_2}{x_2 (1 - x_2)} \int_0^{x_2} \frac{dx_3}{x_3 (1 - x_3)} \cdots \int_0^{x_{d-2}} \frac{dx_{d-1}}{x_{d-1} (1 - x_{d-1})} \int_0^{x_{d-1}} \Phi(u, x_d)\, \frac{dx_d}{1 - x_d} . \tag{15.59}$$




This integral equation can easily be reduced to an equivalent differential equation, 
which is what is used in the analysis. For d = 1 there is an explicit solution 

$$\Phi(u, z) = (1 - z)^{-2u} , \tag{15.60}$$

which shows that $D_n$ can be expressed in terms of Stirling numbers. This is not 
surprising, since for d = 1 the quadtree reduces to the binary search tree, for which 
these results were known before. For d = 2, Φ(u, z) can be expressed in terms of 
standard hypergeometric functions. However, for d ≥ 3 there do not seem to be 
any explicit representations of Φ(u, z). Flajolet and Lafforgue use a singularity 
perturbation method to study the behavior of Φ(u, z). They start out with the differential 
system derivable in the standard way from the differential equation associated 
to (15.59) (i.e., a system of d linear differential equations in z with coefficients that 
are rational in z). Since only values of u close to 1 are important for the distribution 
results, they regard u as a perturbation parameter of this system. For every 
fixed u, they determine the dominant singularity of the linear differential system 
in the variable z, using the indicial equations that are standard in this setting. It 
turns out that the dominant singularity is a regular one at z = 1, and 

$$\Phi(u, z) \approx c(u) (1 - z)^{-2u^{1/d}} , \tag{15.61}$$

at least for z and u close to 1. This behavior of Φ(u, z) is then used (in its more 
precise form, with explicit error terms) to deduce, through the transfer theorem 
methods explained in section 11, the behavior of $\phi_n(u)$: 

$$\phi_n(u) \approx c(u)\, \Gamma(2u^{1/d})^{-1}\, n^{2u^{1/d} - 1} . \tag{15.62}$$

This form, again in a more precise formulation, is then used to deduce that the 
behavior of $D_n$ is normal near its peak, and that the tails of the distribution are 
small. □ 

15.4. Functional equations 


One area that needs and undoubtedly will receive much more attention is that of 
complicated nonlinear relations for generating functions. Even in a single variable 
our knowledge is limited. Some of the work of Mahler (1976, 1981, 1983), de- 
voted to functions f(z) satisfying equations of the form p(f(z), f(z®)) = 0, where 
p(u,v) is a polynomial, shows that it is possible to extract information about the 
analytic behavior of f(z) near its singularities. This can then be used to study the 
coefficients. 


Sometimes seemingly complicated functional equations do have easy solutions. 


Example 15.6 (A pebbling game). In a certain pebbling game (Chung et al. 1995), 
minimal configurations of size n are counted by $T_n(0)$, where $T_n(x)$ is a polynomial 
that satisfies $T_n(x) = 0$ for 0 ≤ n ≤ 2, $T_3(x) = 4x + 2x^2$, and for n ≥ 3, 

$$T_{n+1}(x) = x^{-1} (1 + x)^2 T_n(x) - x^{-1} T_n(0) + x T_n'(0) . \tag{15.63}$$




The coefficients of $T_n(x)$ are ≥ 0, and 

$$T_{n+1}(1) \le 4T_n(1) + T_n'(0) \le 6T_n(1) , \tag{15.64}$$

so clearly each coefficient of $T_n(x)$ is ≤ $6^n$, say. Let 

$$f(x, y) = \sum_{n=0}^{\infty} T_n(x) y^n . \tag{15.65}$$


The bound on $T_n(1)$ shows that f(x, y) is analytic in x and y for |x| < 1, |y| < 1/6, 
say, with x and y complex. Then the recurrence (15.63) leads to the functional 
equation 

$$(x - y(1 + x)^2) f(x, y) = 2x^2 (2 + x) y^3 - y f(0, y) + x^2 y f_x(0, y) , \tag{15.66}$$

where $f_x(x, y)$ is the partial derivative of f(x, y) with respect to x. We now differentiate 
eq. (15.66) with respect to x and set x = 0. We find that 

$$(1 - 2y) f(0, y) = y f_x(0, y) , \tag{15.67}$$

and therefore 

$$(x - y(1 + x)^2) f(x, y) = 2x^2 (2 + x) y^3 - [y + (2y - 1) x^2] f(0, y) . \tag{15.68}$$

When 

$$x = y(1 + x)^2 , \tag{15.69}$$

the left side of eq. (15.68) vanishes, and eq. (15.68) yields the value of f(0, y). Now 
eq. (15.69) holds for 

$$x = (2y)^{-1} \bigl(1 - 2y \pm (1 - 4y)^{1/2}\bigr) .$$

To ensure that (15.69) holds for x and y both in a neighborhood of 0, we set 

$$g(y) = (2y)^{-1} \bigl(1 - 2y - (1 - 4y)^{1/2}\bigr) . \tag{15.70}$$

Then $g(y) = y(1 + g(y))^2$, g(y) is analytic for |y| small, and so substituting x = g(y) 
in eq. (15.68) yields 

$$(1 - 7y + 14y^2 - 9y^3) f(0, y) = y^2 \bigl((1 - 4y)^{1/2} (1 - 3y + y^2) - 1 + 5y - y^2 - 6y^3\bigr) . \tag{15.71}$$

Thus f(0, y) is an algebraic function of y. Equation (15.71) was proved only for |y| 
small, but it can now be used to continue f(0, y) analytically to the entire complex 
plane with the exception of a slit from 1/4 to infinity along the positive real axis. 
There is a first order pole at y = 1/r, with r = 4.1478990357..., the positive root 
of 

$$r^3 - 7r^2 + 14r - 9 = 0 , \tag{15.72}$$
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and no other singularities in |y| < 1/4. Hence we obtain 
Tn (0) = [y"|fO, y) = cr” + O((4 + £)") (15.73) 


as $n \to \infty$, for every $\varepsilon > 0$, where $c$ is an algebraic number that can be given explicitly in terms of $r$.

The value of $f(0,y)$ is determined by eq. (15.71), and together with eq. (15.68) gives $f(x,y)$ explicitly as an algebraic function of $x$ and $y$. The resulting expression can then be used to determine other coefficients of the polynomials $T_n(x)$. ■
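
The recurrence (15.63) and the growth rate in (15.73) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names (`next_poly`, `pebbling_counts`, `cubic_root`) are illustrative, polynomials are stored as integer coefficient lists, and the root of (15.72) is found by plain bisection.

```python
def next_poly(T):
    """Apply recurrence (15.63) to T_n, given as a list with T[i] = coefficient of x^i."""
    prod = [0] * (len(T) + 2)
    for i, c in enumerate(T):          # multiply T_n(x) by (1 + x)^2 = 1 + 2x + x^2
        prod[i] += c
        prod[i + 1] += 2 * c
        prod[i + 2] += c
    prod[0] -= T[0]                    # subtract T_n(0); the constant term is now 0
    out = prod[1:]                     # divide by x
    out[1] += T[1]                     # add x * T_n'(0), where T_n'(0) = T[1]
    return out


def pebbling_counts(n_max):
    """Return a dict {n: T_n(0)} for 3 <= n <= n_max, starting from T_3 = 4x + 2x^2."""
    T = [0, 4, 2]
    counts = {3: T[0]}
    for n in range(3, n_max):
        T = next_poly(T)
        counts[n + 1] = T[0]
    return counts


def cubic_root():
    """Positive root of r^3 - 7r^2 + 14r - 9 = 0, eq. (15.72), by bisection on [4, 5]."""
    p = lambda r: r**3 - 7 * r**2 + 14 * r - 9
    lo, hi = 4.0, 5.0                  # p(4) = -1 < 0 < 11 = p(5)
    for _ in range(100):
        mid = (lo + hi) / 2
        if p(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    counts = pebbling_counts(300)
    print(counts[4], counts[5], counts[6])
    print(counts[300] / counts[299], cubic_root())
```

The ratio of consecutive values $T_{n+1}(0)/T_n(0)$ should approach $r$, with a relative error that shrinks roughly like $(4/r)^n$, in line with the error term in (15.73).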


Example 15.6 was easy to present because of the special structure of the func- 
tional equation. The main trick was to work on the variety defined by eq. (15.69), 
on which the main term vanishes, so that one can analyze the remaining terms. 
The same basic approach also works in more complicated situations. The analysis 
of certain double queue systems leads to two-variable generating functions for the 
equilibrium probabilities that satisfy equations such as the following one, obtained 
by specializing the problem treated in Flatto and Hahn (1984): 


$$Q(z,w) f(z,w) = 2z(w-1) f(z,0) + 3w(z-1) f(0,w), \tag{15.74}$$

valid for complex $z$ and $w$ with $|z|, |w| < 1$, where

$$Q(z,w) = 6zw - 3w - 2z - z^2w^2. \tag{15.75}$$


The generating function f(z, w) is analytic in z and w. What makes this problem 
tractable is that on the algebraic curve in two-dimensional complex space defined 
by Q(z,w) = 0, the quantity on the right-hand side of eq. (15.74) has to vanish, 
and this imposes stringent conditions on f(z,0) and f(0,w), which leads to their 
determination. Once f(z, 0) and f(0, w) are found, f(z, w) is defined by eq. (15.74), 
and one can determine the asymptotics of its coefficients. Treatment of functional 
equations of the type (15.74) was started by Malyshev (1972). For recent work 
and references to other papers in this area, see Flatto (1989) and Flatto and Hahn 
(1984). This approach has so far been successful only for two-variable problems with $Q(z,w)$ of low degree. Moreover, the mathematics of the solution is far deeper than that used in Example 15.6.


16. Other methods 


This section mentions a variety of methods that are not covered elsewhere in this chapter but are useful in asymptotic enumeration. Most are discussed briefly, since they belong to large and well-developed fields that are beyond the scope of this survey.


16.1. Permanents


Van der Waerden's conjecture, proved by Falikman (1981) and Egorychev (1981), can be used to obtain lower bounds for certain enumeration problems. It states that if $A$ is an $n \times n$ matrix that is doubly stochastic (entries $\ge 0$, all row and column sums equal to 1) then the permanent of $A$ satisfies $\mathrm{per}(A) \ge n^{-n} n!$. [For most asymptotic problems it is sufficient to rely on an earlier result of Bang (1976) and Friedland (1979), which gives a lower bound of $\mathrm{per}(A) \ge e^{-n}$ that is worse only by a factor of $n^{1/2}$.] There is also an upper bound for permanents. Minc's conjecture, proved first by Brégman and in a simpler way by Schrijver (1978), states that an $n \times n$ matrix $A$ with 0, 1 entries and row sums $r_1, \ldots, r_n$ has

$$\mathrm{per}(A) \le \prod_{j=1}^{n} (r_j!)^{1/r_j}.$$

We now show how these results can be applied.


Example 16.1 (Latin rectangles). Suppose we are given a $k \times n$ Latin rectangle, $k < n$, so that the symbols are $1, 2, \ldots, n$, and no symbol appears twice in any row or column. In how many ways can we extend this rectangle to a $(k+1) \times n$ Latin rectangle? To get a lower bound, form an $n \times n$ matrix $B = (b_{ij})$, with $b_{ij} = 1$ if $i$ does not appear in column $j$ of the rectangle, and $b_{ij} = 0$ otherwise. Then the row and column sums of $B$ are all equal to $n-k$, so $(n-k)^{-1}B$ is doubly stochastic. Therefore $\mathrm{per}(B)$, which equals the desired number of ways of extending the rectangle, is $\ge (n-k)^n n^{-n} n!$ by van der Waerden's conjecture. By Minc's conjecture, we also have $\mathrm{per}(B) \le ((n-k)!)^{n/(n-k)}$. If we let $L(k,n)$ denote the number of $k \times n$ Latin rectangles, then $L(1,n) = n!$, and the bounds derived above for the number of ways to extend any given rectangle give

$$L(k,n) \ge \prod_{j=0}^{k-1} \left\{(n-j)^n n^{-n} n!\right\} = n^{-kn} (n!)^{n+k} ((n-k)!)^{-n}, \tag{16.1}$$

$$L(k,n) \le \prod_{j=0}^{k-1} ((n-j)!)^{n/(n-j)}. \tag{16.2}$$
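
The bounds (16.1) and (16.2) can be compared with exact values in the one case where a simple formula is available. The sketch below is illustrative only: it relies on the standard fact, not stated in the text above, that $L(2,n) = n!\,D_n$ with $D_n$ the number of derangements of $n$ symbols, and the function names are hypothetical.

```python
from math import factorial, isclose


def derangements(n):
    """Number of permutations of n symbols with no fixed point."""
    d_prev, d = 1, 0                   # D_0 = 1, D_1 = 0
    for m in range(2, n + 1):
        d_prev, d = d, (m - 1) * (d + d_prev)
    return d if n >= 1 else 1


def two_row_latin_rectangles(n):
    """Exact L(2, n): any first row, then any derangement relative to it."""
    return factorial(n) * derangements(n)


def lower_bound(k, n):
    """Right-hand side of (16.1), from van der Waerden's bound applied row by row."""
    prod = 1.0
    for j in range(k):
        prod *= (n - j) ** n * factorial(n) / n ** n
    return prod


def upper_bound(k, n):
    """Right-hand side of (16.2), from Minc's bound applied row by row."""
    prod = 1.0
    for j in range(k):
        prod *= factorial(n - j) ** (n / (n - j))
    return prod


if __name__ == "__main__":
    for n in (4, 5, 8):
        print(n, lower_bound(2, n), two_row_latin_rectangles(n), upper_bound(2, n))
```

For $n = 4$ the exact count $L(2,4) = 216$ lies between roughly 182 and 262, so for small cases both bounds are within a modest constant factor of the truth.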


Sharper estimates for $L(k,n)$ have been obtained through more powerful and complicated methods by Godsil and McKay (1990). They obtain an asymptotic relation for $L(k,n)$ that is valid for $k = o(n^{6/7})$, and improved estimates for other $k$. [It is known that for any fixed $k$, the sequence $L(k,n)$ satisfies a linear recurrence with polynomial coefficients (Gessel 1987).]


There are problems in which inequalities for permanents give the correct asymp- 
totic estimates. One such example is presented in Penrice (1991) which discusses 
a variation on the “probléme des rencontres”. 


16.2. Probability theory and branching process methods


Many combinatorial enumeration results can be phrased in probabilistic language, and a few probabilistic techniques have appeared in the preceding sections. However, the stress throughout this chapter has been on elementary and generating-function approaches to asymptotic enumeration problems. Probabilistic methods provide another way to approach many of these problems. This has been appreciated more in the former Soviet Union than in the West, as can be seen in the books Kolchin (1986), Kolchin et al. (1978), and Sachkov (1978).

The last few years have seen a great increase in the applications of probabilistic 
methods to combinatorial enumeration and analysis of algorithms. Many powerful 
tools, such as martingales, branching processes, and Brownian motion asymptotics 
have been brought to bear on this topic. General introductions and references to 
these topics can be found in chapter 33 as well as in Aldous (1989), Alon and 
Spencer (1992), Arratia and Tavaré (1992b, 1994), Barbour et al. (1992), Devroye 
(1986, 1987), Erdős and Spencer (1974), Louchard (1983, 1986), Louchard et al.
(1992), and Mahmoud (1992). 


16.3. Statistical physics 


There is an extensive literature in mathematical physics concerned with asymptotic 
enumeration, especially in Ising models of statistical mechanics and percolation 
methods. Many of the methods are related to combinatorial enumeration. For an 
introduction to them, see chapter 37 or the books Baxter (1982) and Kesten (1982). 


16.4. Classical applied mathematics 


There are many techniques, such as the ray method and the WKB method, that 
have been developed for solving differential and integral equations in what we 
might call classical applied mathematics. An introduction to them can be found in 
Bender and Orszag (1978). They are powerful, but they have the disadvantage that 
most of them are not rigorous, since they make assumptions about the form or the 
stability of the solution that are likely to be true, but have not been established. 
Therefore we have not presented such methods in this survey. For some examples 
of the nonrigorous applications of these methods to asymptotic enumeration, see 
the papers of Knessl and Keller (1990, 1991). It is likely that with additional work, 
more of these methods will be made rigorous, which will increase their utility. 


17. Algorithmic and automated asymptotics 


Deriving asymptotic expansions often involves a substantial amount of tedious 
work. However, much of it can now be done by computer symbolic algebra sys- 
tems such as Macsyma, Maple, and Mathematica. There are many widely available 
packages that can compute Taylor-series expansions. Several can also compute 
certain types of limits, and some have implemented Gosper’s indefinite hypergeo- 
metric summation algorithm (Gosper 1978). They ease the burden of carrying out 
the necessary but uninteresting parts of asymptotic analysis. They are especially 
useful in the exploratory part of research, when looking for identities, formulating 
conjectures, or searching for counterexamples. 

Much more powerful systems are being developed. Given a sequence, there are algorithms that attempt to guess the generating function of that sequence (Bergeron and Plouffe 1992; Getu et al. 1992). It is possible to go much further than that. Many of the asymptotic results in this chapter are stated in explicit forms. As an example, the asymptotics of a linear recurrence is derived easily from the characteristic polynomial and the initial conditions, as was shown in section 9.1. One needs to compute the roots of the characteristic polynomial, and that is precisely what computer systems do well. It is therefore possible to write programs that will derive the asymptotic behavior from the specification of the recurrence. More generally, one can analyze asymptotics of a much greater variety of generating functions. Flajolet, Salvy, and Zimmermann (Flajolet 1992, Flajolet et al. 1991b) have written a powerful program for just such computations. Their system uses Maple to carry out most of the basic analytic computations. It contains a remarkable amount of automated expertise in recognizing generating functions, computing their singularities, and extracting asymptotic information about their coefficients. For example, if


$$f(z) = -\log[1 + z\log(1 - z^2)] + (1 - z^3)^{?} + \exp(z e^z), \tag{17.1}$$

then the Flajolet-Salvy-Zimmermann system can determine that the singularity of $f(z)$ that is closest to the origin is at $z = \rho$, where $\rho$ is the smallest positive root of

$$1 = -\rho \log(1 - \rho^2), \tag{17.2}$$

and then can deduce that

$$[z^n] f(z) = n^{-1}\rho^{-n} + O(n^{-2}\rho^{-n}) \quad \text{as } n \to \infty. \tag{17.3}$$
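
The dominant singularity in (17.2) is easy to locate numerically, and the leading term of (17.3) can be compared with actual coefficients. The sketch below is only an illustration under stated assumptions: it is not the Flajolet-Salvy-Zimmermann system, the function names are hypothetical, and the coefficient check uses only the logarithmic term of (17.1), which is the term that produces the singularity at $\rho$.

```python
import math


def dominant_singularity():
    """Smallest positive root rho of 1 = -rho * log(1 - rho^2), found by bisection.

    h(z) = 1 + z*log(1 - z^2) is decreasing on (0, 1), with h(0) = 1 and h(0.99) < 0.
    """
    h = lambda z: 1 + z * math.log(1 - z * z)
    lo, hi = 0.0, 0.99
    for _ in range(200):
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def log_term_coefficients(n_max):
    """Taylor coefficients f_0..f_{n_max} of F(z) = -log(1 + z*log(1 - z^2)).

    Uses h(z) = 1 - sum_{k>=1} z^(2k+1)/k and the identity h * F' = -h'.
    """
    h = [0.0] * (n_max + 1)
    h[0] = 1.0
    for k in range(1, (n_max - 1) // 2 + 1):
        h[2 * k + 1] = -1.0 / k
    f = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        acc = -n * h[n]
        for j in range(1, n):
            acc -= j * f[j] * h[n - j]
        f[n] = acc / n
    return f


if __name__ == "__main__":
    rho = dominant_singularity()
    f = log_term_coefficients(150)
    print(rho, 150 * f[150] * rho ** 150)
```

The second printed number is $n\,\rho^n [z^n]F(z)$ at $n = 150$ and should be close to 1, matching the leading term $n^{-1}\rho^{-n}$ in (17.3).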


The Flajolet-Salvy-Zimmermann system is even more powerful than indicated 
above, since it does not always require an explicit presentation of the generating 
function. Instead, often it can accept a formal description of an algorithm or data 
structure, derive the generating function from that, and then obtain the desired 
asymptotic information. For example, it can show that the average path length in 
a general planar tree with n nodes is 


$$\tfrac{1}{2}\pi^{1/2} n^{3/2} + \tfrac{1}{2} n + O(n^{1/2}) \quad \text{as } n \to \infty. \tag{17.4}$$


What makes systems such as that of Flajolet et al. (1991b) possible is the phe- 
nomenon, already mentioned in section 6, that many common combinatorial op- 
erations on sets, such as unions and permutations, correspond in natural ways to 
operations on generating functions. 

Further work extending that of Flajolet et al. (1991b) is undoubtedly going to be carried out. There are some basic limitations coming from the undecidability of even simple problems of arithmetic, which are already known to impose a limitation on the theories of indefinite integration. If we approximate a sum by an integral

$$\int_a^b x^{-\alpha}\,dx, \tag{17.5}$$

then as a next step we need to decide whether $\alpha = 1$ or not, since if $\alpha = 1$, this integral is $\log(b/a)$ (assuming $0 < a < b < \infty$), whereas if $\alpha \ne 1$, it is $(b^{1-\alpha} - a^{1-\alpha})/(1-\alpha)$. Deciding whether $\alpha = 1$ or not, when $\alpha$ is given implicitly or by complicated expressions, can be arbitrarily complicated. However, such difficulties are infrequent, and so one can expect substantial increase in the applicability of automated systems for asymptotic analysis.

The question of decidability of asymptotic problems and generic properties of 
combinatorial structures that can be specified in various logical frameworks has 
been treated by Compton (1987, 1988, 1989). There is the beautiful recent the- 
ory of 0-1 laws for random graphs, which says that certain (so-called first-order) 
properties are true with probability either 0 or 1 for random graphs. Compton 
proves that certain classes of asymptotic theories also have 0-1 laws, and describes 
general properties that have to hold for almost all random structures in certain 
classes. His analysis uses Tauberian theorems and Hayman admissibility to determine asymptotic behavior. For some further developments in this area, see also Bender et al. (1992).


18. Guide to the literature 


This section presents additional sources of information on asymptotic methods 
in enumeration and analysis of algorithms. It is not meant to be exhaustive, but 
is intended to be used as a guide in searching for methods and results. Many 
references have been presented already throughout this chapter. Here we describe 
only books that cover large areas relevant to our subject. 

An excellent introduction to the basic asymptotic techniques is given in Graham 
et al. (1989). That book, intended to be an undergraduate textbook, is much more 
detailed than this chapter, and assumes no knowledge of asymptotics, but covers 
fewer methods. A less comprehensive and less elementary book that is oriented to- 
wards analysis of algorithms, but provides a good introduction to many asymptotic 
enumeration methods, is Greene and Knuth (1982). 

The best source from which to learn the basics of more advanced methods,
including many of those covered in this chapter, is de Bruijn (1958). It was not 
intended particularly for those interested in asymptotic enumeration, but almost all 
the methods in it are relevant. De Bruijn’s volume is extremely clear, and provides 
insight into why and how various methods work. 

General presentations of asymptotic methods, although usually with emphasis on applications to applied mathematics (differential equations, special functions, and so on) are available in the books Bleistein and Handelsman (1975), Erdélyi (1956), Fedoryuk (1987, 1989a), Olver (1974), Sirovich (1971), Szegő (1959), Wasow (1965), Wimp (1984), and Wong (1989). Integral transforms are treated extensively in Davies (1978), Doetsch (1955), Fedoryuk (1989b), Oberhettinger (1974), and Titchmarsh (1948). Books that deal with asymptotics arising in the analysis of algorithms or probabilistic methods include Alon and Spencer (1992), Bollobás (1985), Erdős and Spencer (1974), Hofri (1987), Kemp (1984), Kolchin (1986), Kolchin et al. (1978), Mahmoud (1992), and Sachkov (1978).

Nice general introductions to combinatorial identities, generating functions, and related topics are presented in Comtet (1974), Stanley (1986), and Wilf (1990). Further material can be found in Andrews (1976), David and Barton (1962), Egorychev (1984), Goulden and Jackson (1983), Harary and Palmer (1973), Riordan (1958, 1968).

A very useful book is the compilation, Gonnet and Baeza-Yates (1991). While it
does not discuss methods in too much detail, it lists a wide variety of enumerative 
results on algorithms and data structures, and gives references where the proofs 
can be found. 

Last, but not least in our listing, is the three-volume work (Knuth 1973a,b, 1981). While it is devoted primarily to analysis of algorithms, it contains an enormous amount of material on combinatorics, especially asymptotic enumeration.


Introduction


In extremal graph theory one is interested in the relations between various graph invariants, like order, size, connectivity, chromatic number, diameter, radius, clique number, minimal and maximal degrees, the circumference, the genus. More generally, one is interested in the values of these invariants ensuring that a graph having a certain property has another given property as well. Let us give two examples. Given a graph $F$, determine $\mathrm{ex}(n; F)$, the maximal number of edges in a graph of order $n$ that does not contain $F$, the forbidden graph, as a subgraph. Given two properties of graphs, $\mathcal{P}$ and $\mathcal{Q}$, say, a number of graph invariants $f_1, \ldots, f_k$, and a natural number $n$, determine the set $A(n) = \{(a_1, \ldots, a_k)$: if a graph $G$ of order $n$ with $f_i(G) = a_i$, $i = 1, \ldots, k$, has property $\mathcal{P}$ then it also has property $\mathcal{Q}\}$.

The first of these is the classical extremal problem which, though important, is 
rather narrow; the second problem, on the other hand, is perhaps too broad a 
problem to be rightly claimed as a genuine extremal problem, since most 
problems in graph theory could be formulated in this way. In practice, one stays 
away from both extremes by considering a problem in graph theory to be an 
extremal problem if its “natural” formulation asks for some best possible 
inequalities among various graph invariants. However, in this chapter we shall 
take a rather narrow view of extremal problems, mostly for lack of space and also 
because several problems belonging to extremal graph theory are considered in 
other chapters of this volume, in chapters on Ramsey theory, Hamilton cycles, 
colouring, connectivity, matching, etc. 

In a typical extremal problem, given a property $\mathcal{P}$ and an invariant $\phi$ for a class $\mathcal{G}$ of graphs, we wish to determine the least value $f$ for which every graph $G$ in $\mathcal{G}$ with $\phi(G) > f$ has property $\mathcal{P}$. The graphs in $\mathcal{G}$ without property $\mathcal{P}$ and satisfying $\phi(G) = f$ are the extremal graphs for the problem. More often than not, $\mathcal{G}$ consists of graphs of the same order $n$, namely $\mathcal{G} = \{G \in \mathcal{H}: |G| = n\}$, where $\mathcal{H}$ is a class of graphs, and so $f$ is considered to be a function of $n$, determined by $\phi$ and $\mathcal{H}$. This function $f(n)$ is the extremal function for the problem.

A short review like this is easily overcrowded with a host of results. In order to 
avoid this, in section 1 we shall study the classical extremal problem, the problem 
of forbidden subgraphs, at a leisurely pace, giving some of the simpler proofs. 
The other sections are considerably shorter and are intended to provide the 
reader with only glimpses of the topics. Our aim is to give the flavour of the 
subject rather than overwhelm the reader with results. This review is based mostly 
on Bollobás (1978a) and an update of that book, Bollobás (1986).


1. Forbidden subgraphs 


Let $\mathcal{F} = \{F_1, \ldots, F_k\}$ be a family of graphs of order at most $n$: the family of forbidden graphs. Write $\mathrm{ex}(n; \mathcal{F}) = \mathrm{ex}(n; F_1, \ldots, F_k)$ for the maximal size of a graph of order $n$ containing no forbidden graph $F_i$, i.e., containing no subgraph isomorphic to a forbidden graph $F_i$. In this section we shall take $\mathcal{F}$ to be a fixed family, independent of $n$, and we are mostly interested in the asymptotic value of $\mathrm{ex}(n; \mathcal{F})$ as $n \to \infty$.


1.1. Turán's theorem and its extensions


One of the earliest substantial theorems in graph theory is due to Turán (1941) and it concerns the function $\mathrm{ex}(n; K_r)$, where $K_r$ is the complete graph of order $r$. Turán's theorem was not only the starting point of extremal graph theory but it also signalled the birth of graph theory as an active subject. Although Mantel (1907) proved that $\mathrm{ex}(n; K_3) = \lfloor n^2/4 \rfloor$, Turán was the first to study $\mathrm{ex}(n; K_r)$ for all $r$.

Given $1 \le s \le n$, denote by $T_s(n)$ the complete $s$-partite graph with $\lfloor n/s \rfloor, \lfloor (n+1)/s \rfloor, \ldots, \lfloor (n+s-1)/s \rfloor$ vertices in the various classes. Thus $T_s(n)$ is the unique complete $s$-partite graph of order $n$ whose classes are as equal as possible. Equivalently, it is also the unique $s$-partite graph of order $n$ whose size is as large as possible. The graph $T_s(n)$ is the $s$-partite Turán graph of order $n$. Denote the size, i.e., the number of edges, of $T_s(n)$ by $t_s(n)$:

$$t_s(n) = \binom{n}{2} - \sum_{i=1}^{s} \binom{n_i}{2} = \sum_{1 \le i < j \le s} n_i n_j,$$

where $n_i = \lfloor (n+i-1)/s \rfloor$ is the number of vertices in the $i$th smallest class. In particular, $t_2(n) = \lfloor n^2/4 \rfloor$.
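
The definition of $T_s(n)$ translates directly into a short computation. The following sketch uses illustrative function names and simply evaluates the class sizes $\lfloor (n+i)/s \rfloor$ and both expressions for $t_s(n)$, checking that they agree.

```python
from math import comb


def turan_class_sizes(n, s):
    """Class sizes of T_s(n), smallest first: floor((n + i) / s) for i = 0..s-1."""
    return [(n + i) // s for i in range(s)]


def turan_size(n, s):
    """Number of edges t_s(n) of the Turan graph T_s(n), computed in two ways."""
    sizes = turan_class_sizes(n, s)
    by_complement = comb(n, 2) - sum(comb(c, 2) for c in sizes)
    by_pairs = sum(sizes[i] * sizes[j] for i in range(s) for j in range(i + 1, s))
    assert by_complement == by_pairs
    return by_pairs


def turan_min_degree(n, s):
    """Minimal degree of T_s(n): a vertex in a largest class misses only its own class."""
    return n - max(turan_class_sizes(n, s))


if __name__ == "__main__":
    print(turan_class_sizes(11, 3), turan_size(11, 3), turan_min_degree(11, 3))
```

For $s = 2$ this reproduces $t_2(n) = \lfloor n^2/4 \rfloor$, and deleting a vertex of minimal degree takes $t_s(n)$ to $t_s(n-1)$, which is the fact the inductive proof of Turán's theorem exploits.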

An $(r-1)$-partite graph does not contain a $K_r$; in particular, $T_{r-1}(n)$ does not contain a $K_r$. Consequently, $\mathrm{ex}(n; K_r) \ge t_{r-1}(n)$. Turán (1941) (see also Turán 1954) proved that, in fact, we have equality, and $T_{r-1}(n)$ is the only extremal graph.


Theorem 1.1.1. Let $r \ge 2$. Then $\mathrm{ex}(n; K_r) = t_{r-1}(n)$ and $T_{r-1}(n)$ is the only extremal graph: it is the only graph of order $n$ and size $t_{r-1}(n)$ that contains no complete graph of order $r$.


Proof. The graph $T_{r-1}(n)$ is a maximal $K_r$-free graph: it contains no $K_r$ and if we join two vertices belonging to the same class of $T_{r-1}(n)$ then these two vertices, together with $r-2$ vertices, one from each of the other classes, form a $K_r$. Hence it suffices to prove the second assertion: if $G$ has order $n$, size $t_{r-1}(n)$, and contains no $K_r$, then $G$ is (isomorphic to) $T_{r-1}(n)$.

The structure of $T_{r-1}(n)$ is ideal for proving this by induction on $n$. Indeed, given that we have $t_{r-1}(n)$ edges, the vertices in $T_{r-1}(n)$ have as equal degrees as possible: the minimal degree is $\delta_{r-1}(n) = \lfloor 2t_{r-1}(n)/n \rfloor = n - \lfloor (n+r-2)/(r-1) \rfloor = n - \lceil n/(r-1) \rceil$ and the maximal degree is $\Delta_{r-1}(n) = \lceil 2t_{r-1}(n)/n \rceil = n - \lfloor n/(r-1) \rfloor$. Furthermore, if we delete a vertex $x$ of minimal degree from $T_{r-1}(n)$ then we obtain $T_{r-1}(n-1)$. In particular, $t_{r-1}(n) - \delta_{r-1}(n) = t_{r-1}(n-1)$. Finally, as $\delta_{r-1}(n) = n - 1 - \lfloor (n-1)/(r-1) \rfloor$, the vertex $x$ is joined to all vertices of $T_{r-1}(n-1)$ except to the vertices in a smallest class.

Let us see then the proof by induction on $n$. For $n \le r-1$ there is nothing to prove so let us assume that $n \ge r$ and the assertion holds for smaller values of $n$. Let $G$ be a graph of order $n$ and size $t_{r-1}(n)$ that does not contain a $K_r$. Let $x \in G$ be a vertex of minimal degree: $d(x) = \delta(G) \le \lfloor 2e(G)/n \rfloor = \lfloor 2t_{r-1}(n)/n \rfloor = \delta_{r-1}(n)$. Set $H = G - x$. Then $e(H) = e(G) - d(x) \ge t_{r-1}(n) - \delta_{r-1}(n) = t_{r-1}(n-1)$. Since $H$ contains no $K_r$, by the induction hypothesis $H$ is $T_{r-1}(n-1)$ and $d(x) = \delta_{r-1}(n)$. The vertex $x$ cannot be joined to $r-1$ vertices in distinct classes of $H = T_{r-1}(n-1)$ because then these $r$ vertices would form a $K_r$. Consequently $T_{r-1}(n-1)$ has a class, no vertex of which is joined to $x$. But then this has to be a smallest class and $x$ has to be joined to all the vertices in all the other classes. Therefore $G$ is precisely $T_{r-1}(n)$. □
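
For very small orders, Theorem 1.1.1 can be confirmed by exhaustive search. The sketch below is a brute-force check, not an efficient algorithm: it enumerates all $2^{\binom{n}{2}}$ labelled graphs, so it is only practical for $n \le 5$ or so, and the function names are illustrative.

```python
from itertools import combinations


def turan_size(n, s):
    """t_s(n): number of edges of the complete s-partite graph with balanced classes."""
    sizes = [(n + i) // s for i in range(s)]
    return sum(a * b for a, b in combinations(sizes, 2))


def ex_complete(n, r):
    """Brute-force ex(n; K_r): the largest number of edges in a K_r-free graph of order n."""
    pairs = list(combinations(range(n), 2))
    best = 0
    for mask in range(1 << len(pairs)):
        edges = {pairs[i] for i in range(len(pairs)) if (mask >> i) & 1}
        if len(edges) <= best:
            continue                   # cannot improve on the current maximum
        has_kr = any(
            all(p in edges for p in combinations(c, 2))
            for c in combinations(range(n), r)
        )
        if not has_kr:
            best = len(edges)
    return best


if __name__ == "__main__":
    for n in range(3, 6):
        for r in range(3, n + 1):
            print(n, r, ex_complete(n, r), turan_size(n, r - 1))
```

In each printed row the last two numbers should coincide, for example $\mathrm{ex}(5; K_3) = t_2(5) = 6$ and $\mathrm{ex}(5; K_4) = t_3(5) = 8$.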


The proof above is not so much about graphs not containing a complete graph of order $r$ but about the unusual ease with which $T_{r-1}(n)$ can be produced from $T_{r-1}(n-1)$. Let us give a slightly different slant to the proof of the induction step above. Since the degrees of the vertices of $T_{r-1}(n)$ are as equal as possible, given the number of edges, and since $e(G) = t_{r-1}(n)$, there is a vertex $x$ in $G$ with $d(x) \le \delta_{r-1}(n) = \delta(T_{r-1}(n))$. Then, by the induction hypothesis, $H = G - x$ must be $T_{r-1}(n-1)$ and $d(x) = \delta_{r-1}(n)$. If the vertices not joined to $x$ form a (smallest) class of $T_{r-1}(n-1)$ then we are done. Otherwise pick a vertex $y$ in $T_{r-1}(n-1)$ which is not joined to $x$. Then $y$ has degree $\delta_{r-1}(n)$ in $G$ so, by the induction hypothesis, $G - y$ is also $T_{r-1}(n-1)$. But that is clearly not the case because, for example, $G - y$ contains a $K_r$.

This version of the proof of the induction step implies the following extension 
of Theorem 1.1.1. 


Theorem 1.1.2. Let $F_1, \ldots, F_k$ be graphs of order at most $t$, and let $s$ be such that no $T_s(n)$ contains any of the $F_i$. Suppose $n_0 \ge t$ is such that $\mathrm{ex}(n_0; F_1, \ldots, F_k) = t_s(n_0)$ and $T_s(n_0)$ is the only extremal graph. Then the same assertion holds for every $n \ge n_0$: $\mathrm{ex}(n; F_1, \ldots, F_k) = t_s(n)$ and $T_s(n)$ is the only extremal graph.


If we do not care about the uniqueness of the extremal graph $T_{r-1}(n)$ in Theorem 1.1.1, then all we need for the proof is that every graph of order $n \ge r+1$ and size $t_{r-1}(n)+1$ has minimal degree at most $\delta_{r-1}(n)$. This observation shows that if $G$ is a graph of order $n$ and size $t_{r-1}(n)+1$ then for every $n'$, $r+1 \le n' \le n$, the graph $G$ contains a subgraph of order $n'$ and size at least $t_{r-1}(n')+1$. In particular, as shown by Dirac (1963), every graph of order $n \ge r+1$ and size $t_{r-1}(n)+1$ contains not only a $K_r$ but also a $K_{r+1}^-$, a complete graph of order $r+1$ from which an edge has been deleted.

This observation can be carried over to greater excess size over $t_s(n)$. A graph $G$ of order $n \ge (2q-1)s + 2$ and size $t_s(n) + q$ has minimal degree at most $\delta_s(n)$, so $G$ has a subgraph of order $n-1$ and size $t_s(n-1) + q$. This implies the following result.


Theorem 1.1.3. Let $s \ge 2$, $q \ge 1$, $n_0 \ge (2q-1)s + 2$ and let $F_1, \ldots, F_k$ be graphs such that $\mathrm{ex}(n_0; F_1, \ldots, F_k) < t_s(n_0) + q$. Then $\mathrm{ex}(n; F_1, \ldots, F_k) < t_s(n) + q$ for all $n \ge n_0$.


Let us return to Turán's Theorem 1.1.1. This result claims that the size of a graph $G$ of order $n$ not containing a $K_r$ is dominated by the size of an $(r-1)$-partite graph $H$ of order $n$. Erdős (1970) proved that we can guarantee that this domination holds at every vertex: the edges of $G$ can be rearranged and, perhaps, some more edges can be added to the graph in such a way that the resulting graph $H$ is $(r-1)$-partite and every vertex is incident with at least as many edges in $H$ as in $G$. As so often in mathematics (especially in combinatorics), the achievement is the discovery of this beautiful fact: the proof is straightforward.


Theorem 1.1.4. Let $G$ be a graph not containing a $K_r$, $r \ge 2$. Then there is an $(r-1)$-partite graph $H$ with vertex set $V(H) = V(G) = V$ such that $d_G(x) \le d_H(x)$ for every $x \in V$. Furthermore, $H$ can be chosen to satisfy $e(G) < e(H)$, i.e., $d_G(x) < d_H(x)$ for at least one vertex $x$, unless $G$ is a complete $(r-1)$-partite graph with $r-1$ non-empty classes.


Proof. We apply induction on $r$. The assertion is obvious for $r = 2$, so we pass to the induction step. Suppose $r > 2$ and the assertion holds for smaller values of $r$. Let $v \in V$ be a vertex of maximal degree in $G$: $d_G(v) = \Delta(G)$, and let $W = \Gamma(v)$ be the set of neighbours of $v$. Then $G_0 = G[W]$, the graph induced by $W$, does not contain a $K_{r-1}$. Hence, by the induction hypothesis, there is an $(r-2)$-partite graph $H_0$ with vertex set $W$ such that $d_{G_0}(w) \le d_{H_0}(w)$ for every $w \in W$.

Let us construct an $(r-1)$-partite graph $H$ with vertex set $V$ from $H_0$ by joining all vertices in $V \setminus W$ to all vertices in $W$. It is easily seen that $d_G(x) \le d_H(x)$ for every $x \in V$. Furthermore, it is easily seen that if $G_0$ is a complete $(r-2)$-partite graph and $d_G(x) = \Delta(G)$ for every $x \in V \setminus W$ then $G$ is a complete $(r-1)$-partite graph.


Since $T_{r-1}(n)$ is the unique $(r-1)$-partite graph of order $n$ and maximal size, Theorem 1.1.4 implies Theorem 1.1.1.

Let us say a few words about a natural extension of the function $\mathrm{ex}(n; \mathcal{F})$. For a graph $G$ and a family $\mathcal{F}$ of graphs, let $\mathrm{ex}(G; \mathcal{F})$ be the maximal number of edges in a subgraph of $G$ that contains no element of $\mathcal{F}$ as a subgraph. Thus, $\mathrm{ex}(n; \mathcal{F}) = \mathrm{ex}(K_n; \mathcal{F})$. It would be unreasonable to expect precise results about the function $\mathrm{ex}(G; \mathcal{F})$ or even $\mathrm{ex}(G; K_r)$ but, somewhat surprisingly, sharp results can be obtained in the case when $G$ is a random graph $G_{n,p}$ (see Bollobás 1985, and chapter 6). Among other results, Babai et al. (1990) proved that, for a fixed value of $p$, with probability tending to 1, $\mathrm{ex}(G_{n,p}; K_3)$ is the maximal number of edges in a bipartite subgraph of $G_{n,p}$. They also conjectured the following result which was proved, a little later, by Frankl and Pach (1988).

Let us say that a graph has property $P(k, \ell)$ if any $k$ vertices have at most $\ell$ common neighbours.


Theorem 1.1.5. Let $t, r \ge 2$ be fixed integers, and let $0 < c \le 1 - 1/(r-1)$. Let $G$ be a $K_r$-free graph with $n$ vertices, having property $P(t, cn)$. Then

$$e(G) \le c^{1/t}\left(1 - \frac{1}{r-1}\right)^{(t-1)/t} \cdot n^2/2 + o(n^2).$$

As an easy consequence of this result, one finds that $\mathrm{ex}(G_{n,p}; K_r) = p(1 - 1/(r-1))n^2/2 + o(n^2)$ with probability tending to 1.

1.2. The number of complete subgraphs 


We know from Turán's theorem that a graph of order $n$ and size greater than $t_{r-1}(n)$ contains at least one $K_r$, and we know also that it has to contain at least two $K_r$. Let us go further: given $m > t_{r-1}(n)$, at least how many $K_r$ are in a graph of order $n$ and size $m$? Even more, if we know that a graph of order $n$ has many $K_p$ subgraphs, what can we say about the minimal number of $K_r$ subgraphs it has to contain?

To formulate this problem precisely, let us introduce some notation. Denote by $k_p(G)$ the number of $K_p$ in a graph $G$. Thus $k_2(G)$ is just the size of $G$, the number of edges of $G$, and Turán's theorem tells us that if $G$ has order $n$ and $k_2(G) > t_{r-1}(n)$ then $k_r(G) \ge 1$. For natural numbers $2 \le p < r \le n$ and a real number $x \ge 0$ define

$$k_r(k_p \ge x) = \min\{k_r(G^n): G^n \text{ is a graph of order } n \text{ and } k_p(G^n) \ge x\}.$$

What can we say about the function $k_r(k_p \ge x)$? As shown by Bollobás (1976a), this function is also closely connected with the Turán graphs $T_2(n), T_3(n), \ldots$. For simplicity, let us suppress the variable $n$ and put $T_q = T_q(n)$. The graph $T_{r-1}$ contains no $K_r$ but it has $k_p(T_{r-1})$ complete graphs of order $p$, so $k_r(k_p \ge x) = 0$ for $0 \le x \le k_p(T_{r-1})$.

Let $\psi(x)$ be the maximal convex function defined on the interval $k_p(T_{r-1}) \le x \le \binom{n}{p}$ such that

$$\psi(k_p(T_q)) \le k_r(T_q) \tag{1}$$

for $q = r-1, r, \ldots, n$. It is easily seen that, in fact, equality holds in (1) for every $q$. Also, the Turán graph $T_q$ shows that for $x = k_p(T_q)$ we have

$$k_r(k_p \ge x) \le \psi(x). \tag{2}$$

It turns out that $\psi(x)$ is actually a lower bound for $k_r(k_p \ge x)$ for all values of $x$.


Theorem 1.2.1. Let $2 \le p < r \le n$. For $k_p(T_{r-1}) \le x \le \binom{n}{p}$ we have

$$k_r(k_p \ge x) \ge \psi(x).$$

In particular, if a graph of order $n$ has at least as many $K_p$ subgraphs as $T_q(n)$ then it also has at least as many $K_r$ subgraphs as $T_q(n)$. Also, if a graph of order $n$ has more $K_p$ subgraphs than $T_{r-1}(n)$ then it contains a $K_r$.


The last assertion above was first proved by Erdős (1962) and it was rediscovered by Sauer (1971).

Let us state a weaker but more transparent version of Theorem 1.2.1. The bound on the number of triangles given below was conjectured by Nordhaus and Stewart (1963).


Theorem 1.2.2. (i) Let $n^2/4 \le m \le n^2/3$. Then every graph of order $n$ and size $m$ contains at least $n(4m - n^2)/9$ triangles.

(ii) Every graph of order $n$ and size $m$ contains at least $n^{r-2}(2(r-1)m - (r-2)n^2)/r^{r-1}$ copies of $K_r$.


The bound above on the minimal number of triangles is fairly good: it is certainly best possible for $n = 3n_1$ and $m = n^2/3 = 3n_1^2$. However, when $m$ is not much greater than $t_2(n) = \lfloor n^2/4 \rfloor$ then the estimate is rather crude. How can we construct a graph of order $n$ and size $m = \lfloor n^2/4 \rfloor + l$ which contains few triangles? For $l < n/2$ we can join a vertex in a larger class of $T_2(n)$ to $l$ vertices of the same class to obtain a graph containing precisely $l\lfloor n/2 \rfloor$ triangles. Erdős (1962) conjectured that we can never do better and proved that this is indeed the case if $l < cn$ for some $c > 0$. This conjecture was proved by Lovász and Simonovits (1976, 1983), who also proved a number of results concerning $k_r(k_2 \ge x)$, the minimal number of complete $r$-graphs in a graph of order $n$ with at least $x$ edges.


Theorem 1.2.3. For $0 \le l < n/2$, a graph with $n$ vertices and $t_2(n) + l$ edges contains at least $l\lfloor n/2 \rfloor$ triangles.
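
The extremal construction behind Theorem 1.2.3 is easy to build and count explicitly. The sketch below is only an illustration with hypothetical function names: it forms $T_2(n)$, adds $l$ edges from one vertex of the larger class to other vertices of that class, and counts triangles by brute force.

```python
from itertools import combinations


def turan2_plus_star(n, l):
    """Edge set of T_2(n) plus l extra edges at one vertex inside the larger class.

    Vertices 0..n//2-1 form the smaller class and n//2..n-1 the larger one.
    Requires 0 <= l < n/2 so that the larger class has enough other vertices.
    """
    small = range(n // 2)
    large = list(range(n // 2, n))
    edges = {(u, v) for u in small for v in large}
    hub = large[0]
    for v in large[1:l + 1]:
        edges.add((hub, v))
    return edges


def count_triangles(n, edges):
    """Number of vertex triples whose three pairs are all edges."""
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )


if __name__ == "__main__":
    n, l = 7, 2
    g = turan2_plus_star(n, l)
    m = len(g)
    print(m, count_triangles(n, g), n * (4 * m - n * n) / 9)
```

For $n = 7$, $l = 2$ the graph has $m = 14$ edges and exactly $l\lfloor n/2 \rfloor = 6$ triangles, while the weaker bound of Theorem 1.2.2(i) only guarantees $n(4m - n^2)/9 \approx 5.4$.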


There are a good many results concerning the covering of graphs by complete subgraphs. The first result in this area was proved by Erdős et al. (1966b); this was sharpened by Bollobás (1976a), Chung (1981) and Győri and Kostochka (1979). The following result was conjectured by Erdős and proved by Pyber (1986).


Theorem 1.2.4. Let $G$ be a graph with $n$ vertices. Then $G$ and its complement can be covered with at most $\lfloor n^2/4 \rfloor + 2$ complete subgraphs. The graph $T_2(n)$ shows that this bound is best possible.


A considerable extension of the original theorem of Erdős et al. was conjectured by Winkler, and proved by McGuinness (1994).


Theorem 1.2.5. If maximal cliques are removed one by one from a graph with $n$ vertices, then the graph will be empty after at most $n^2/4$ steps.


In fact, Winkler made a stronger conjecture as well, which is still open: if maximal cliques are removed one by one from a graph with $n$ vertices, then the graph will be empty after the sum of the number of vertices in the cliques has reached $n^2/2$.


1.3. Complete bipartite graphs 


Let us turn to the analogue of the Turán problem for bipartite graphs. Given natural numbers $m$, $n$, $s$ and $t$, what is the maximal size of an $m$ by $n$ bipartite graph not containing a $K(s,t)$, a complete $s$ by $t$ bipartite graph? Denote this maximum by $z(m, n; s, t)$ and put $z(n; t) = z(n, n; t, t)$. Zarankiewicz (1951) asked this question for $s = t = 3$ and $m = n = 4, 5, 6$ and the general problem has also become known as the problem of Zarankiewicz. The similarity with Turán's problem is, unfortunately, only superficial: for the general function $z(m, n; s, t)$ there is no beautiful extremal graph and we are far from being able to determine even the order of $z(n; t)$ for a fixed (but large) value of $t$.

It is worth reformulating the Zarankiewicz problem in terms of 0-1 matrices. 
At most how many 1s can a 0-1 matrix with $m$ rows and $n$ columns contain if it has
no $s$ by $t$ submatrix all of whose entries are 1s?
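The matrix formulation lends itself to a brute-force check for very small parameters. The sketch below is an illustration only: the function names `zarankiewicz` and `kst_bound` are hypothetical, the search is exponential in $mn$, and `kst_bound` evaluates the upper bound $(s-1)^{1/t}(n-t+1)m^{1-1/t} + (t-1)m$ for comparison.

```python
from itertools import combinations, product


def zarankiewicz(m, n, s, t):
    """Maximal number of 1s in an m x n 0-1 matrix with no s x t all-ones submatrix."""
    best = 0
    full = (1 << n) - 1
    for rows in product(range(1 << n), repeat=m):   # each row is an n-bit mask
        ones = sum(bin(r).count("1") for r in rows)
        if ones <= best:
            continue
        ok = True
        for subset in combinations(rows, s):
            common = full
            for r in subset:
                common &= r                          # columns that are 1 in all s rows
            if bin(common).count("1") >= t:          # an s x t all-ones submatrix exists
                ok = False
                break
        if ok:
            best = ones
    return best


def kst_bound(m, n, s, t):
    """Upper bound (s-1)^(1/t) (n-t+1) m^(1-1/t) + (t-1) m."""
    return (s - 1) ** (1 / t) * (n - t + 1) * m ** (1 - 1 / t) + (t - 1) * m


print(zarankiewicz(3, 3, 2, 2), kst_bound(3, 3, 2, 2))
print(zarankiewicz(4, 4, 2, 2), kst_bound(4, 4, 2, 2))
```

For $s = t = 2$ the search gives 6 for $3 \times 3$ matrices and 9 for $4 \times 4$ matrices, both below the corresponding values of the bound.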

The following rather trivial lemma is just about the most one can say about the
general function $z(m,n;s,t)$. As, trivially, $z(m,n;1,t) = m(t-1)$ for $1 \le t \le n$, it
is sufficient to consider the case $2 \le s \le m$, $2 \le t \le n$.


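Lemma 1.3.1. Let $m$, $n$, $s$, $t$, $r$ and $k$ be integers, $2 \le s \le m$, $2 \le t \le n$, $0 \le r < m$,
and let $G$ be an $m$ by $n$ bipartite graph of size $z = my = km + r$ without a $K(s,t)$.
Then

$$m\binom{y}{t} \le (m-r)\binom{k}{t} + r\binom{k+1}{t} \le (s-1)\binom{n}{t}. \tag{1}$$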


Proof. Let $(V_1, V_2)$ be the bipartition of $G$ and let $V_1 = \{x_1, \dots, x_m\}$, $d(x_i) = d_i$.
Let us call a set $\{xy_1, xy_2, \dots, xy_t\}$ of $t$ edges of $G$ incident with the same vertex $x$
a claw; furthermore, $x$ is the centre of the claw and the $t$-set $\{y_1, \dots, y_t\}$ is the
base.

The graph $G$ has $\sum_{i=1}^{m} \binom{d_i}{t}$ claws since there are $\binom{d_i}{t}$ claws with centre $x_i$. On
the other hand, each $t$-subset of $V_2$ is the base of at most $s-1$ claws since $G$
contains no $K(s,t)$. Therefore $G$ has at most $(s-1)\binom{n}{t}$ claws and so

$$\sum_{i=1}^{m} \binom{d_i}{t} \le (s-1)\binom{n}{t}. \tag{2}$$

Since $\sum_{i=1}^{m} d_i = z = km + r$, $0 \le r < m$, and $\binom{u}{t}$ is a convex function of $u$ for $u \ge t$,
inequality (2) implies (1). □


Theorem 1.3.2. Let $m$, $n$, $s$, $t$ be natural numbers, $2 \le s \le m$, $2 \le t \le n$. Then

$$z(m,n;s,t) \le (s-1)^{1/t}(n-t+1)m^{1-1/t} + (t-1)m.$$

Proof. Let $G$ be an extremal graph for $z(m,n;s,t)$. Set $y = z(m,n;s,t)/m$. Then,
since $y \le n$, by Lemma 1.3.1 we have

$$m(y-(t-1))^t \le (s-1)(n-(t-1))^t. \qquad \square$$

For a fixed value of $t \ge 2$, Theorem 1.3.2 implies that

$$z(n;t) \le (t-1)^{1/t} n^{2-1/t} + O(n) \tag{3}$$


and it is conjectured that (3) is essentially best possible. To be precise, it is
conjectured that

$$\lim_{n\to\infty} z(n;t)/n^{2-1/t} = c_t > 0 \tag{4}$$

for every $t \ge 2$. So far, the only value of $t$ for which (4) is known to hold is $t = 2$.
In fact, Kővári et al. (1954) and Reiman (1958) determined $z(n;2)$ for infinitely
many values of $n$, but there is no $t \ge 3$ for which $z(n;t)$ is known for infinitely
many values of $n$.


Theorem 1.3.3. (i) $z(n;2) \le (n/2)\{1 + \sqrt{4n-3}\}$ for all $n \ge 2$.
(ii) Let $q$ be a prime power and let $n = q^2 + q + 1$. Then

$$z(n;2) = \frac{n}{2}\{1 + \sqrt{4n-3}\} = (q+1)(q^2+q+1).$$

(iii) $\lim_{n\to\infty} z(n;2)/n^{3/2} = 1$.

Proof. (i) Let $G$ be an extremal graph for $z(n;2)$ and let the notation be as in the
proof of Lemma 1.3.1. By inequality (2),

$$n^2 - n \ge \sum_{i=1}^{n} d_i^2 - \sum_{i=1}^{n} d_i \ge \Bigl(\sum_{i=1}^{n} d_i\Bigr)^2 \Big/ n - \sum_{i=1}^{n} d_i = z^2/n - z.$$

This implies the required inequality.

(ii) From the proof of part (i) we see that equality holds in (i) if and only if (1)
every vertex in $G$ has the same degree $d$, (2) for every two vertices in $V_1$ there is
precisely one vertex in $V_2$ joined to both, and (3) for every two vertices in $V_2$
there is precisely one vertex in $V_1$ joined to both. This means that the graph $G$ can
be considered as a finite projective plane: $V_1$ is the set of points, $V_2$ is the set of
lines and $x \in V_1$ is joined to $y \in V_2$ iff the point $x$ is incident with the line $y$. Now if
$q$ is a prime power then there is a projective plane of order $q$, that is with
$n = q^2 + q + 1$ points and lines.

(iii) Since for every sufficiently large natural number $n$, there is a prime
between $n - n^{2/3}/10$ and $n$, the assertion follows from (i) and (ii). □
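The projective plane construction in part (ii) can be verified directly for small primes. The sketch below is a minimal illustration, assuming $q$ is prime (so that arithmetic modulo $q$ gives the field); the function names are hypothetical. Points and lines of PG(2, q) are both represented by normalised homogeneous triples, and a point is incident with a line when their dot product vanishes modulo $q$.

```python
from itertools import combinations
from math import isqrt


def projective_points(q):
    """Normalised homogeneous coordinates of the q^2 + q + 1 points of PG(2, q), q prime."""
    pts = [(1, b, c) for b in range(q) for c in range(q)]
    pts += [(0, 1, c) for c in range(q)]
    pts.append((0, 0, 1))
    return pts


def incidence_graph(q):
    """Map each point to the set of lines through it (lines use the same triples)."""
    pts = projective_points(q)
    return {p: {l for l in pts if sum(a * b for a, b in zip(p, l)) % q == 0}
            for p in pts}


for q in (2, 3, 5):
    g = incidence_graph(q)
    n = q * q + q + 1
    size = sum(len(lines) for lines in g.values())
    # Two distinct points lie on exactly one common line, so there is no K(2, 2).
    one_common_line = all(len(g[p] & g[r]) == 1 for p, r in combinations(g, 2))
    print(q, n, size, (q + 1) * n, one_common_line)
```

Since $4n - 3 = (2q+1)^2$, the value $(n/2)\{1 + \sqrt{4n-3}\}$ equals $(q+1)(q^2+q+1)$, which is the number of incidences counted by `size`.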


A somewhat weaker form of conjecture (4) is that $\liminf_{n\to\infty} z(n;t)/n^{2-1/t} > 0$. In
addition to $t = 2$, this is known for $t = 3$. Brown (1966) proved that
$\liminf_{n\to\infty} z(n;3)/n^{2-1/3} \ge 1$ by making use of the 3-dimensional affine space
AG(3, p) over the finite field of order $p$. However, for a general $t \ge 4$ all we know
is that

$$\liminf_{n\to\infty} z(n;t)/n^{2-2/(t+1)} \ge 1 - (t!)^{-2}. \tag{5}$$

This is proved by making use of random graphs (see Bollobás 1979, p. 127). The
gap between the upper bound, $n^{2-1/t}$, and the lower bound, $n^{2-2/(t+1)}$, is
alarmingly large; as stated above, it is very likely that the upper bound gives the
correct value.

The functions $\mathrm{ex}(n; K(s,t))$ and $z(n,n;s,t)$ are intimately connected; in
particular, for fixed values of $s$ and $t$ they have the same order. It is easily seen
that

$$2\,\mathrm{ex}(n; K(s,t)) \le z(n,n;s,t) \le \mathrm{ex}(2n; K(s,t)). \tag{6}$$


Indeed, given a graph $G$ of order $n$ and size $m = \mathrm{ex}(n; K(s,t))$, construct an $n$ by
$n$ bipartite graph $H$ as follows. Take two disjoint copies of $V(G)$, say $V_1$ and $V_2$,
and join $x' \in V_1$ to $y'' \in V_2$ iff $xy \in E(G)$, where $x$ and $y$ are the vertices in $V(G)$
corresponding to $x'$ and $y''$. Then $H$ has $2m$ edges and contains no $K(s,t)$ (and no
$K(t,s)$, for that matter) so the first inequality in (6) holds. The second inequality
is trivial.

Combining inequality (6) with Theorem 1.3.2, and noting the analogue of (5), 
we have the following assertion. 


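Theorem 1.3.4. If $2 \le s \le n$ then

$$\tfrac12\bigl(1 - (s!)^{-2}\bigr) n^{2-2/(s+1)} \le \mathrm{ex}(n; K(s,s)) \le \tfrac12 (s-1)^{1/s}(n-s+1)n^{1-1/s} + \tfrac12 (s-1)n < \tfrac12 (s-1)^{1/s} n^{2-1/s} + \frac{s-1}{2}\,n.$$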

As (6) holds and we do not know the order of $z(n,n;t,t)$ for $t \ge 4$, neither do
we know the order of $\mathrm{ex}(n; K(s,s))$ for $s \ge 4$. However, we do know that
$\mathrm{ex}(n; K(2,2))$ has order $n^{3/2}$ and $\mathrm{ex}(n; K(3,3))$ has order $n^{5/3}$. In the case of
$K(2,2)$ we can do considerably better. As in the problem of determining
$\mathrm{ex}(n; K(2,2))$ we do not care where the classes of $K(2,2)$ are, it is more natural
to write $C_4$ instead of $K(2,2)$, indicating that $K(2,2)$ is just a 4-cycle or
quadrilateral.

Inequality (6) and Theorem 1.3.3 (i) imply that

$$\mathrm{ex}(n; C_4) \le \frac{n}{4}\{1 + \sqrt{4n-3}\}. \tag{7}$$


Erdős et al. (1966a) noticed that certain graphs constructed by Erdős and Rényi
(1962) show that (7) is asymptotically best possible. The same assertion was
proved independently by Brown (1966).


Theorem 1.3.5. Let $q$ be a prime power. Then

$$\tfrac12 q(q+1)^2 \le \mathrm{ex}(q^2+q+1; C_4) \le \tfrac12 q(q+1)^2 + \frac{q+1}{2}. \tag{8}$$

Furthermore,

$$\lim_{n\to\infty} \mathrm{ex}(n; C_4)/n^{3/2} = \tfrac12. \tag{9}$$


Proof. The second inequality is precisely inequality (7) for $n = q^2 + q + 1$. Let us
prove the first inequality by describing the graph $G_0$ constructed by Erdős and
Rényi (1962).

The vertex set $V(G_0)$ is the set of $q^2 + q + 1$ points of the finite projective plane
PG(2, q) over the finite field of order $q$. A point is joined to all the points on its
polar with respect to the conic $x^2 + y^2 + z^2 = 0$. Thus two points $(a, b, c)$ and
$(\alpha, \beta, \gamma)$ are joined iff $a\alpha + b\beta + c\gamma = 0$. Then a point not on the conic is joined
to $q + 1$ points, i.e., to all the points on its polar, while each of the $q + 1$ points on
the conic is joined to $q$ points, namely to the points on its polar except itself.
Hence $G_0$ has $\tfrac12\{q^2(q+1) + (q+1)q\} = \tfrac12 q(q+1)^2$ edges.

The graph $G_0$ does not contain a quadrilateral since any two lines meet in
exactly one point so every vertex is determined by any two of its neighbours.

Relation (9) follows as Theorem 1.3.3 (iii). □
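The Erdős–Rényi construction can also be checked by computer for small primes. The following is a minimal sketch, assuming $q$ is prime so that arithmetic modulo $q$ gives the field; the function names are hypothetical. Two distinct points are joined when the dot product of their coordinate triples vanishes modulo $q$, which is the polarity described in the proof.

```python
from itertools import combinations


def projective_points(q):
    """Normalised homogeneous coordinates of the points of PG(2, q), q prime."""
    pts = [(1, b, c) for b in range(q) for c in range(q)]
    pts += [(0, 1, c) for c in range(q)]
    pts.append((0, 0, 1))
    return pts


def polarity_graph(q):
    """Join distinct points (a,b,c), (x,y,z) iff ax + by + cz = 0 (mod q); no loops."""
    pts = projective_points(q)
    adj = {p: set() for p in pts}
    for p, r in combinations(pts, 2):
        if sum(a * b for a, b in zip(p, r)) % q == 0:
            adj[p].add(r)
            adj[r].add(p)
    return adj


for q in (2, 3, 5):
    adj = polarity_graph(q)
    edges = sum(len(nb) for nb in adj.values()) // 2
    # No quadrilateral: two distinct vertices never share two neighbours.
    c4_free = all(len(adj[p] & adj[r]) <= 1 for p, r in combinations(adj, 2))
    absolute = sum(1 for p in adj if sum(a * a for a in p) % q == 0)
    print(q, edges, q * (q + 1) ** 2 // 2, c4_free, absolute)
```

For each prime tested, the edge count agrees with $\tfrac12 q(q+1)^2$ and there are $q + 1$ absolute points, i.e. points lying on their own polars.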


The bounds in (8) are tantalizingly close. The only reason why the graph $G_0$ is
not ideal for the problem is that it has absolute points, i.e., points lying on their
polars. These $q + 1$ points are joined to only $q$ points, instead of $q + 1$, as all the
others. If we could avoid these absolute points by choosing a more suitable
polarity then we would achieve the upper bound in (8). However, this is not to
be: Baer (1946) proved that every polarity of a finite projective plane of order $q$
has at least $q + 1$ absolute points. Thus the Erdős–Rényi graph $G_0$ cannot be
made to have more edges by choosing a different polarity.

In view of this fact it is not too surprising that the way to improve (8) is to
reduce the upper bound. This was achieved by Füredi (1983) (see also the
remarks at the end of that paper) who thereby determined $\mathrm{ex}(n; C_4)$ for infinitely
many values of $n$.


Theorem 1.3.6. For every natural number $q$ we have

$$\mathrm{ex}(q^2+q+1; C_4) \le \tfrac12 q(q+1)^2.$$

In particular, if $q$ is a prime power then

$$\mathrm{ex}(q^2+q+1; C_4) = \tfrac12 q(q+1)^2.$$

What happens if we forbid not only $C_4$ but $C_5$ as well? The projective
graph in Theorem 1.3.3 (ii) contains no $C_4$, and as it is bipartite, it contains no $C_5$
either. Hence if $n = 2(q^2 + q + 1)$ for some prime power $q$ then $\mathrm{ex}(n; C_4, C_5) \ge (q+1)(q^2+q+1)$,
so $\mathrm{ex}(n; C_4, C_5) \ge (n/2)^{3/2} + o(n^{3/2})$ for all $n$. Erdős and
Simonovits (1982) proved that this inequality is, in fact, an equality.


Theorem 1.3.7. $\mathrm{ex}(n; C_4, C_5) = (n/2)^{3/2} + o(n^{3/2})$.


It would be of interest to decide whether $\mathrm{ex}(n; C_4, C_5) = (q+1)(q^2+q+1)$ if
$q$ is a prime power and $n = 2(q^2 + q + 1)$.


1.4. The fundamental theorem of extremal graph theory 


For $r \ge 3$, the Turán graph $T_{r-1}(n)$ has $t_{r-1}(n) = ((r-2)/2(r-1))n^2 + O(n)$ edges
and contains no $K_r$. On the other hand, every graph of order $n$ and size
$t_{r-1}(n) + 1$ has a $K_r$, in fact, several $K_r$. Furthermore, Theorem 1.2.2 implies that
if $0 < \varepsilon < 1/2r(r-1)$ then every graph of order $n$ and size $((r-2)/2(r-1) + \varepsilon)n^2$
contains at least (2(r—1)e/r’')n" copies of K,. Thus there is a sudden jump 
when the size reaches $t_{r-1}(n)$.

Although this sudden jump is quite startling, Erdős and Stone (1946) proved
that a considerably more important change takes place when the size becomes
significantly greater than $t_{r-1}(n)$. This result, which deserves to be called the
fundamental theorem of extremal graph theory, states that for every $r \ge 3$ and
$\varepsilon > 0$, there is a function $s = s(n)$ such that $s(n) \to \infty$ as $n \to \infty$, and every graph of
order $n$ and size $(((r-2)/2(r-1)) + \varepsilon)n^2$ contains a $K_r(s) = K(s, s, \dots, s) = T_r(rs)$,
a complete $r$-partite graph with $s$ vertices in each of the classes. Thus we
not only get a complete $r$-partite graph with one vertex in each class, as claimed
by Turán's theorem, but we can guarantee even a complete $r$-partite graph with
$s(n)$ vertices in each class, where $s(n) \to \infty$ as $n \to \infty$.

The assertion above does make sense for $r = 2$ as well although in that case
Turán's theorem is completely trivial: every graph of order $n$ and size at least $\varepsilon n^2$,
$0 < \varepsilon < \tfrac12$, contains a complete bipartite graph with at least $s(n)$ vertices in each
class, where $s(n) \to \infty$ as $n \to \infty$. This assertion is immediate from Theorem 1.3.4:
if $0 < \varepsilon < \tfrac12$ and $0 < c < 1/\log(1/2\varepsilon)$ are fixed then the assertion is true with $s(n) = \lfloor c \log n \rfloor$,
provided $n$ is sufficiently large.

Let us state then the fundamental theorem of extremal graph theory, proved by
Erdős and Stone (1946).


Theorem 1.4.1. Let $r \ge 2$ and $\varepsilon > 0$ be fixed. Then there is a function $s = s(n)$,
with $\lim_{n\to\infty} s(n) = \infty$, such that every graph of order $n$ and size at least
$((r-2)/2(r-1) + \varepsilon)n^2$ contains a $K_r(s)$.


As we are interested in the growth of $s(n)$, let us introduce the following
notation. For $r \ge 2$ and $0 < \varepsilon < 1/2(r-1)$ define

$$s_{r,\varepsilon}(n) = \max\Bigl\{ s : \text{every graph of order } n \text{ and size at least } \Bigl(\frac{r-2}{2(r-1)} + \varepsilon\Bigr)n^2 \text{ contains a } K_r(s) \Bigr\}.$$

Erdős and Stone (1946) proved that $s_{r,\varepsilon}(n) \ge (l_{r-1}(n))^{1/2}$ if $n$ is sufficiently large,
where $l_{r-1}(n)$ is the $r-1$ times iterated logarithm of $n$. Furthermore, Erdős and
Stone conjectured that the order of $s_{r,\varepsilon}(n)$ is $l_{r-1}(n)$. Later Erdős (1967)
announced that $s_{r,\varepsilon}(n) \ge c(\log n)^{1/(r-1)}$ for some constant $c > 0$ and sufficiently
large $n$.

Rather unexpectedly, $s_{r,\varepsilon}(n)$ turns out to be much larger than these lower
bounds. The true order of $s_{r,\varepsilon}(n)$ was determined by Bollobás and Erdős (1973).


Theorem 1.4.2. Let $r \ge 2$ and $0 < \varepsilon < 1/2(r-1)$. Then there are positive constants
$c_1 = c_1(r, \varepsilon)$ and $c_2 = c_2(r, \varepsilon)$ such that

$$c_1 \log n \le s_{r,\varepsilon}(n) \le c_2 \log n. \tag{1}$$

In particular, every graph of order $n$ and size at least $((r-2)/2(r-1) + \varepsilon)n^2$
contains a complete $r$-partite graph with at least $c_1 \log n$ vertices in each class.


How do $c_1$ and $c_2$ depend on $r$ and $\varepsilon$? As pointed out by Bollobás and Erdős
(1973), the constant $c_2$ can be chosen to be $5/\log(1/\varepsilon)$, provided $n$ is sufficiently
large. This can be seen by a simple application of random graphs. What about $c_1$?
Improving inequality (1), Bollobás et al. (1976) proved that one can take
$c_1 = c/(r\log(1/\varepsilon))$ for some absolute constant $c > 0$, provided $n$ is sufficiently large.
Finally, Chvátal and Szemerédi (1981) showed that this is true without the factor
$r$.


Theorem 1.4.3. There is an absolute constant $c > 0$ such that

$$\frac{c}{\log(1/\varepsilon)} \log n \le s_{r,\varepsilon}(n) \le \frac{5}{\log(1/\varepsilon)} \log n$$

if $r \ge 2$, $0 < \varepsilon < 1/2(r-1)$ and $n$ is sufficiently large.


First we shall sketch a proof of Theorem 1.4.2 and then we shall return to 
Theorem 1.4.3. As we remarked above, the upper bound in (1) is very easy: it 
follows from a straightforward application of random graphs. To prove the lower 
bound, we shall need the following lemma. 


Lemma 1.4.4. Let $G$ be a graph of order $n$ that contains no $K_{r+1}(s)$ but contains a
$K_r(q)$, say $K$. Then $G$ has at most

$$((r-1)q + s)n + 2qn^{1-1/s}$$

edges joining $K$ to $G - K$.


Proof. As in the proof of Lemma 1.3.1, we define a claw with centre $x \in G - K$ as
a set of $rs$ edges incident with $x$ such that precisely $s$ of these edges join $x$ to each
of the $r$ classes of $K$. It is easily checked that if $x \in G - K$ is joined to $(r-1)q + d$
vertices in $K$ then there are at least $\binom{q}{s}^{r-1}\binom{d}{s}$ claws with centre $x$. Hence if there
are $(r-1)qn + D > (r-1)qn + sn$ edges joining $G - K$ to $K$ then there are at
least $n\binom{q}{s}^{r-1}\binom{D/n}{s}$ claws in $G$.

Since $G$ contains no $K_{r+1}(s)$, there are at most $s - 1$ claws with the same base,
the same set of vertices joined to the centre. As there are $\binom{q}{s}^{r}$ possible bases, $G$
contains at most $(s-1)\binom{q}{s}^{r}$ claws. Consequently,

$$n\binom{D/n}{s} \le (s-1)\binom{q}{s}.$$

Hence

$$D \le sn + n^{1-1/s}(s-1)^{1/s} q \le sn + 2n^{1-1/s} q,$$

proving the lemma. □


Armed with this lemma, we shall prove the main part of Theorem 1.4.2, the
lower bound on $s_{r,\varepsilon}(n)$. To be precise, we shall prove the following assertion.


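Theorem 1.4.2′. Let $r \ge 2$, $0 < \varepsilon < 1/2(r-1)$ and $0 < \gamma_r < (r-1)!\,\varepsilon^{r-1}/\log(8/\varepsilon)$.
Then if $n$ is sufficiently large, every graph of order $n$ and size at least

$$\Bigl(\frac{r-2}{2(r-1)} + \varepsilon\Bigr)n^2$$

contains a $K_r(s)$ where $s = \lfloor \gamma_r \log n \rfloor$.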


Proof. Let us add to Theorem 1.4.2′ a trivial assertion concerning the case $r = 1$:
for $\varepsilon > 0$, every graph of sufficiently large order contains a $K_1(s)$ for $s = \lfloor \gamma_1 \log n \rfloor$
where $\gamma_1 = 2/\varepsilon$.

Suppose then that the result is true for $r \ge 1$ but fails for $r + 1$: there is a
constant $\gamma'_{r+1}$, $0 < \gamma'_{r+1} < r!\,\varepsilon^{r}/\log(8/\varepsilon)$, such that for every $n$, there is a graph $G_0$
of order $n_0 \ge n$ and size at least $((r-1)/2r + \varepsilon)n_0^2$ without a $K_{r+1}(s_0)$, where
$s_0 = \lfloor \gamma'_{r+1} \log n_0 \rfloor$. Such a graph $G_0$ has average degree at least $((r-1)/r + 2\varepsilon)n_0$ so it
contains a subgraph $G$ with $n \ge \sqrt{\varepsilon}\, n_0$ vertices and minimal degree at least
$((r-1)/r + \tfrac32\varepsilon)n$. Let $\gamma'_{r+1} < \gamma_{r+1} < \varepsilon r \gamma_r < r!\,\varepsilon^{r}/\log(8/\varepsilon)$. Then, if $n$ is sufficiently
large (and that is the case if $n_0$ is sufficiently large), the graph $G$ contains no
$K_{r+1}(s)$, where $s = \lfloor \gamma_{r+1} \log n \rfloor$. However, it does contain a $K_r(q)$, say $K$, where
$q = \lfloor \gamma_r \log n \rfloor$. By Lemma 1.4.4 there are at most $((r-1)q + s)n + 2qn^{1-1/s}$ edges
joining $K$ to $G - K$, so some vertex of $K$ has degree at most

$$rq + \{((r-1)q + s)n + 2qn^{1-1/s}\}/rq.$$

Hence

$$\Bigl(\frac{r-1}{r} + \frac32\varepsilon\Bigr)n \le \delta(G) \le \frac{r-1}{r}\,n + rq + \frac{sn}{rq} + \frac{2}{r}\,n^{1-1/s}.$$

This inequality cannot hold if $n$ is large enough since then $rq < \tfrac14\varepsilon n$, $s/rq < \varepsilon$
and $(2/r)n^{-1/s} < \tfrac14\varepsilon$. This contradiction completes the proof. □


The proof of Theorem 1.4.3, given by Chvátal and Szemerédi (1981), is based on
a deep and important lemma due to Szemerédi (1978). This result, to be stated
below as Theorem 1.4.5 and usually called the uniform density lemma or
regularity lemma, was one of the main tools in the proof of Szemerédi's (1975)
theorem, one of the most difficult results in combinatorics, stating that every
sequence of integers with positive upper density contains arbitrarily long
arithmetic progressions.

For a graph $G$ and disjoint sets $U, W \subset V(G)$, denote by $e(U, W)$ the number
of $U$–$W$ edges. The density of the edges between $U$ and $W$ is

$$d(U, W) = \frac{e(U, W)}{|U|\,|W|}.$$

The pair $(U, W)$ is $\varepsilon$-uniform or $\varepsilon$-regular if

$$|d(U', W') - d(U, W)| < \varepsilon$$

whenever $U' \subset U$, $W' \subset W$, $|U'| \ge \varepsilon|U|$ and $|W'| \ge \varepsilon|W|$.
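For very small vertex sets the definition can be tested by exhaustive search. The sketch below is illustrative only: the function names are hypothetical, edges are stored as a set of two-element frozensets, the size conditions are read as $|U'| \ge \varepsilon|U|$ and $|W'| \ge \varepsilon|W|$ with $U'$, $W'$ non-empty, and the running time is exponential in $|U| + |W|$.

```python
from itertools import combinations


def density(edges, U, W):
    """d(U, W) = e(U, W) / (|U| |W|) for disjoint, non-empty vertex sets U and W."""
    e = sum(1 for u in U for w in W if frozenset((u, w)) in edges)
    return e / (len(U) * len(W))


def subsets_of_size_at_least(S, k):
    """All non-empty subsets of S with at least k elements."""
    S = list(S)
    for size in range(1, len(S) + 1):
        if size >= k:
            yield from combinations(S, size)


def is_eps_uniform(edges, U, W, eps):
    """Check |d(U', W') - d(U, W)| < eps over all admissible sub-pairs (brute force)."""
    d = density(edges, U, W)
    return all(abs(density(edges, U1, W1) - d) < eps
               for U1 in subsets_of_size_at_least(U, eps * len(U))
               for W1 in subsets_of_size_at_least(W, eps * len(W)))


U, W = [0, 1], [2, 3]
lopsided = {frozenset(e) for e in [(0, 2), (0, 3)]}      # all edges at vertex 0
complete = {frozenset((u, w)) for u in U for w in W}
print(is_eps_uniform(lopsided, U, W, 0.5))   # sub-pair ({0}, W) has density 1, not 1/2
print(is_eps_uniform(lopsided, U, W, 0.6))   # only the full pair is admissible
print(is_eps_uniform(complete, U, W, 0.1))   # every sub-pair has density 1
```

The lopsided pair shows why the size restriction matters: with $\varepsilon = 0.5$ the single vertex of degree 2 is an admissible $U'$ and spoils uniformity, whereas with $\varepsilon = 0.6$ only the whole pair has to be checked.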


Theorem 1.4.5. Given $\varepsilon > 0$ and an integer $m$, there is an $M = M(\varepsilon, m)$ such that
the vertices of every graph of order at least $m$ can be partitioned into classes $V_0$,
$V_1, \dots, V_k$, where $m \le k \le M$, such that $|V_0| < |V_1| = |V_2| = \cdots = |V_k|$ and all but at
most $\varepsilon k^2$ of the pairs $(V_i, V_j)$, $1 \le i < j \le k$, are $\varepsilon$-uniform.


The following two immediate consequences of Theorem 1.4.1 show why the
result is called the fundamental theorem of extremal graph theory. In the spirit of
the notation used above, for a graph $G$ and a set $U \subset V(G)$ define the density
$d(U)$ of the subgraph $G[U]$ spanned by $U$ as

$$d(U) = e(G[U]) \Big/ \binom{u}{2},$$

where $u = |U|$. Thus if $U$ spans a complete graph then $d(U) = 1$, and if $U$ consists of
independent vertices then $d(U) = 0$.

Let $G$ be an infinite graph. Define the upper density of $G$ to be

$$\bar d(G) = \sup\{\alpha : \text{for every } m > 0 \text{ there is a finite set } U \text{ satisfying } |U| > m \text{ and } d(U) > \alpha\}.$$

Putting it another way, if $\beta > \bar d(G)$ then there is an $m > 0$ such that whenever $U$
has at least $m$ vertices then $d(U) < \beta$, and $\bar d(G)$ is the smallest such number.
Clearly, if $G$ is the empty graph then $\bar d(G) = 0$; also, if $G$ contains arbitrarily
large complete graphs then $\bar d(G) = 1$. What are the possible values of the upper
densities? It is rather natural to expect the closed interval $[0, 1]$ to be the set of possible
upper densities. Surprisingly, this is not the case.


Theorem 1.4.6. The set of upper densities of infinite graphs is $\{1, 0, \tfrac12, \tfrac23, \tfrac34, \dots\}$.


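Proof. Suppose $\bar d(G) > 1 - 1/r + \varepsilon$ for some $r \in \mathbb{N}$ and $\varepsilon > 0$. Then $G$ contains a
sequence of subgraphs, say $G_1, G_2, \dots$, such that $G_i$ has order $n_i$ and size at least
$((r-1)/r + \tfrac12\varepsilon)\binom{n_i}{2} > ((r-1)/2r + \tfrac15\varepsilon)n_i^2$, and $n_i \to \infty$. By Theorem 1.4.1, each $G_i$
contains a $K_{r+1}(s_i)$, where $s_i \to \infty$. Now $d(K_{r+1}(s_i)) > r/(r+1)$ and the order of
$K_{r+1}(s_i)$ tends to $\infty$, so $\bar d(G) \ge r/(r+1)$. □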


The other immediate consequence of Theorem 1.4.1 concerns the approximate
value of $\mathrm{ex}(n; F_1, F_2, \dots, F_\ell)$. As observed by Erdős and Simonovits (1966),
Theorem 1.4.1 implies that $\lim_{n\to\infty} \mathrm{ex}(n; F_1, \dots, F_\ell)/\binom{n}{2}$ is a very simple function
of the family $\{F_1, \dots, F_\ell\}$.


Theorem 1.4.7. Let $F_1, \dots, F_\ell$ be fixed non-empty graphs. Set $r = \min_i \chi(F_i) - 1$,
i.e., let $r + 1$ be the smallest chromatic number of an $F_i$. Then

$$\lim_{n\to\infty} \mathrm{ex}(n; F_1, \dots, F_\ell) \Big/ \binom{n}{2} = 1 - \frac1r.$$


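Proof. We may assume that $\chi(F_1) = r + 1$. The graph $T_r(n)$ contains no $F_i$ so
$\mathrm{ex}(n; F_1, \dots, F_\ell) \ge t_r(n) = (1 - 1/r)\binom{n}{2} + O(n)$. Hence $\liminf_{n\to\infty} \mathrm{ex}(n; F_1, \dots, F_\ell)/\binom{n}{2} \ge 1 - 1/r$.

On the other hand, if $\varepsilon > 0$ and $n$ is sufficiently large, then by Theorem 1.4.1
every graph $G$ of order $n$ and size at least $(1 - 1/r + \varepsilon)\binom{n}{2}$ contains a $K_{r+1}(s)$,
where $s \ge |F_1|$. But then $K_{r+1}(s)$ contains $F_1$ and, therefore, so does $G$. As this
holds for every $\varepsilon > 0$, we have

$$\limsup_{n\to\infty} \mathrm{ex}(n; F_1, \dots, F_\ell) \Big/ \binom{n}{2} \le 1 - \frac1r. \qquad \square$$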

Although Theorem 1.4.7 is just an immediate corollary of Theorem 1.4.1, at
first sight it is, nevertheless, very surprising: the crude order of
$\mathrm{ex}(n; F_1, \dots, F_\ell)$ depends only on the minimal chromatic number of the $F_i$. In
particular, the asymptotic value of $\mathrm{ex}(n; F_1, \dots, F_\ell)$ is easily determined if no $F_i$
is bipartite. Of course, this leaves several questions unanswered. What is the error
term $\varphi(n)$ in $\mathrm{ex}(n; F_1, \dots, F_\ell) = ((r-1)/2r)n^2 + \varphi(n)$? What is the asymptotic
value of $\mathrm{ex}(n; F_1, \dots, F_\ell)$ when some $F_i$ is bipartite? We know from section 1.3
that we are far from being able to answer the third question for an arbitrary
family, since we do not even know the asymptotic value of ex(”; K,,), say, but 
we shall discuss the first two questions in section 1.5.

Let us note an easy application of Theorem 1.4.7, giving the rough solution of a
seemingly intractable problem.


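Theorem 1.4.8. Let $\mathcal{F}$ be the family of graphs of order $p$ and size $q$. Let $r = \min\{s : t_{s+1}(p) \ge q\}$. Then

$$\lim_{n\to\infty} \mathrm{ex}(n; \mathcal{F})/n^2 = \frac{r-1}{2r}.$$

Proof. Note that $\min\{\chi(F) : F \in \mathcal{F}\} = r + 1$. □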
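The limit in Theorem 1.4.8 is easy to evaluate for concrete $p$ and $q$. The sketch below is a hedged illustration: it reads the defining condition as $t_{s+1}(p) \ge q$, which is what the remark $\min \chi(F) = r + 1$ in the proof requires, assumes $1 \le q \le \binom{p}{2}$, and uses hypothetical function names.

```python
from fractions import Fraction
from math import comb


def turan_number(r, n):
    """t_r(n): size of the Turan graph with r classes on n vertices."""
    small, extra = divmod(n, r)
    sizes = [small + 1] * extra + [small] * (r - extra)
    return comb(n, 2) - sum(comb(k, 2) for k in sizes)


def limit_density(p, q):
    """lim ex(n; F)/n^2 for the family F of graphs with p vertices and q edges.

    Assumes 1 <= q <= C(p, 2); r is the least s with t_{s+1}(p) >= q,
    so that r + 1 is the smallest chromatic number in the family.
    """
    r = next(s for s in range(1, p) if turan_number(s + 1, p) >= q)
    return Fraction(r - 1, 2 * r)


print(limit_density(4, 4))   # some graph with 4 vertices and 4 edges is bipartite
print(limit_density(4, 5))   # K_4 minus an edge has chromatic number 3
print(limit_density(4, 6))   # K_4 has chromatic number 4
```

For $p = 4$ the three cases give $0$, $1/4$ and $1/3$, matching $(r-1)/2r$ for $r = 1, 2, 3$.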

To conclude this section, we state a weak form of Theorem 1.4.6, as it leads to
some deep questions concerning $r$-graphs, i.e., $r$-uniform hypergraphs. Given
$r \ge 2$ and $0 \le \alpha < 1$, we say that $\alpha$ is a jump value for $r$-graphs if there is a $\beta > \alpha$
such that if $\varepsilon > 0$, $m \ge r$ and $n \ge n(\alpha, \varepsilon, m)$ then every $r$-graph with $n$
vertices and at least $(\alpha + \varepsilon)\binom{n}{r}$ hyperedges contains a subgraph with $m$ vertices and at
least $\beta\binom{m}{r}$ hyperedges. Note that $\alpha$ is a jump value for graphs if for some $\delta > 0$
the interval $(\alpha, \alpha + \delta)$ contains no upper density of an infinite graph. Hence the
following result is immediate from either Theorem 1.4.1 or Theorem 1.4.6.


Theorem 1.4.9. Every $0 \le \alpha < 1$ is a jump value for graphs.


Erdős posed the problem of deciding whether the same is true for $r$-graphs. The
problem was open for several years and was eventually solved by Frankl and Rödl
(1984).


Theorem 1.4.10. Let $r \ge 3$ and $s > 2r$ be natural numbers. Then $1 - s^{1-r}$ is not a
jump value for $r$-graphs.


This beautiful and difficult result leaves open a number of important
questions. In particular, it would be interesting to determine the set of jump
values for $r$-graphs and the set of upper densities for $r$-graphs.


1.5. The structure of extremal graphs 


For a family $\mathcal{F} = \{F_1, \dots, F_\ell\}$ of forbidden graphs, denote by $\mathrm{EX}(n; \mathcal{F}) = \mathrm{EX}(n; F_1, \dots, F_\ell)$
the set of extremal graphs of order $n$. Thus a graph $G$ belongs
to $\mathrm{EX}(n; \mathcal{F})$ iff $G$ has order $n$, size $\mathrm{ex}(n; \mathcal{F})$ and contains no forbidden graph,
i.e., no member of $\mathcal{F}$. Turán's theorem, Theorem 1.1.1, tells us that $\mathrm{EX}(n; K_r) = \{T_{r-1}(n)\}$
for all $r$ and $n$, $2 \le r \le n$. For a general family $\mathcal{F}$, Theorem 1.4.7, an
immediate consequence of the Erdős–Stone theorem, the fundamental theorem
of extremal graph theory, gives us the rough order of $\mathrm{ex}(n; \mathcal{F})$. But what is the
more precise order of $\mathrm{ex}(n; \mathcal{F})$ for a general family $\mathcal{F}$ and what do extremal
graphs look like?


These questions were answered, surprisingly precisely, by Erdős and
Simonovits (1966) and by Simonovits (1968); simpler proofs of the results can be
found in Bollobás (1978a, pp. 339-345). Here we shall state only some of the results.


Theorem 1.5.1. Let $F$ be a graph with $\chi(F) = r + 1 \ge 3$, and for $n = 1, 2, \dots$ let $G^n$
be a graph of order $n$ and size $(1 - 1/r + o(1))\binom{n}{2}$ not containing $F$. Then the
following assertions hold.
(i) There is a $K(p_1, p_2, \dots, p_r)$, $\sum_{i=1}^{r} p_i = n$, $p_i = (1 + o(1))n/r$, that can be
obtained from $G^n$ by adding and subtracting $o(n^2)$ edges.
(ii) $G^n$ contains an $r$-partite graph of size $(1 - 1/r + o(1))\binom{n}{2}$.
(iii) $G^n$ contains an $r$-partite graph of minimal degree $(1 - 1/r + o(1))n$.


The result above claims that if a graph $G^n$ not containing $F$ has about as many
edges as the Turán graph $T_r(n)$, which trivially fails to contain $F$, then $G^n$ is very
close to the graph $T_r(n)$. For an extremal graph, considerably more is true.


Theorem 1.5.2. Let $\mathcal{F} = \{F_1, \dots, F_\ell\}$ be a fixed family of graphs, let $r + 1 = \min_i \chi(F_i) \ge 2$
and suppose that $F_1$ has an $(r + 1)$-colouring in which one of the
colour classes contains $t$ vertices. Let $G^n \in \mathrm{EX}(n; \mathcal{F})$. Then, as $n \to \infty$,

$$\delta(G^n) = \Bigl(1 - \frac1r + o(1)\Bigr)n,$$

the vertices of $G^n$ can be partitioned into $r$ classes such that each vertex is joined to at
most as many vertices in its own class as in any other class, and for every $\varepsilon > 0$,
there are at most $c(\varepsilon, \mathcal{F})$ vertices joined to at least $\varepsilon n$ vertices of the same class.
Furthermore, there are $O(n^{2-1/t})$ edges joining vertices belonging to the same class,
and each class has $n/r + O(n^{1-1/2t})$ vertices.


This result gives us a very good hold on extremal graphs. In fact, the function
$O(n^{2-1/t})$ can be replaced by $O(\mathrm{ex}(n; K(s,t)))$ where $s$ and $t$ are fixed. In
particular, we have the following better bound on $\mathrm{ex}(n; F)$ in terms of $\mathrm{ex}(n; F_1)$
for some bipartite graph $F_1$.


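Theorem 1.5.3. Let $F = F_1 + K_{r-1}(u)$ where $F_1$ is a bipartite graph. Then

$$\mathrm{ex}(n; F) \le \Bigl(1 - \frac1r\Bigr)\binom{n}{2} + (r + o(1))\,\mathrm{ex}\Bigl(\Bigl\lceil \frac{n}{r} \Bigr\rceil; F_1\Bigr).$$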


As an illustration of the power of Theorem 1.5.2, let us present a beautiful
theorem of Simonovits (1968) giving a complete solution to the forbidden
subgraph problem for $tK_{r+1}$, i.e., for $t$ disjoint copies of $K_{r+1}$, provided $n$ is
sufficiently large.

What is a likely candidate for an extremal graph for $tK_{r+1}$? If we add $t - 1$
vertices to the Turán graph $T = T_r(n - t + 1)$ and join these vertices to each other
and to the vertices of $T$ then the obtained graph, $K_{t-1} + T_r(n - t + 1)$, has quite a
few more (about $(t-1)n/r$ more) edges than $T_r(n)$, the extremal graph for one
copy of $K_{r+1}$, and still fails to contain $t$ disjoint copies of $K_{r+1}$. Indeed, every
$K_{r+1}$ in $K_{t-1} + T_r(n - t + 1)$ must contain at least one of the $t - 1$ vertices of $K_{t-1}$.
The following theorem of Simonovits (1968) shows that our hunch is essentially
correct.


Theorem 1.5.4. Let $r$ and $t$ be fixed natural numbers, $r \ge 2$. If $n$ is sufficiently large
then $K_{t-1} + T_r(n - t + 1)$ is the unique extremal graph for $tK_{r+1}$.


Proof. Let us apply induction on $t$. The case $t = 1$ is precisely Turán's theorem, so
let us pass to the induction step.

Let $G = G^n$ be an extremal graph of order $n$ for $tK_{r+1}$, and consider the
partition $V = V_1 \cup V_2 \cup \cdots \cup V_r$ guaranteed by Theorem 1.5.2. Set $\varepsilon = 1/4tr$. Let
us distinguish two cases.

Case (i) Some vertex $x$ is joined to at least $\varepsilon n$ vertices in its own class. By
Theorem 1.5.2, $x$ is then joined to at least $\varepsilon n$ vertices in every class. Let $W_i$ be
a set of $m = \lceil \varepsilon n \rceil$ neighbours of $x$ in $V_i$. By Theorem 1.5.2 the $r$-partite subgraph
of $G$ spanned by $W_1 \cup W_2 \cup \cdots \cup W_r$ has $(1 - 1/r + o(1))r^2m^2/2$ edges so, rather
trivially (or by Theorem 1.4.1, if we wish to conclude it instantly), it contains a
$K_r(s)$ for $s = t(r + 1)$, provided $n$ is sufficiently large. But then $G - x$ cannot
contain a $(t-1)K_{r+1}$ since any $(t-1)K_{r+1}$ could be extended to a $tK_{r+1}$, so we are
done by the induction hypothesis.

Case (ii) Every vertex is joined to at most $\varepsilon n$ vertices in its own class. In this
case, our aim is to arrive at a contradiction. As $\delta(G) = (1 - 1/r + o(1))n$, we may
assume that every vertex is joined to all but at most $2\varepsilon n$ vertices in the other
classes. As in case (i), this implies that for every pair $\{x, y\}$ of vertices in the
same class, in particular, for every edge $xy$ joining vertices in the same class, the
graph $G$ contains a $K_{r-1}(s)$ for $s = t(r + 1)$ such that both $x$ and $y$ are joined to all
vertices of this $K_{r-1}(s)$. But then this implies that the graph $H$ obtained from $G$
by deleting all edges joining different classes contains at most $t - 1$ independent
edges.

Recall that the maximal degree of $H$ is at most $\varepsilon n$. Let $\{x_1y_1, \dots, x_ky_k\}$,
$k \le t - 1$, be a maximal set of independent edges in $H$. Since every edge of $H$
meets the set $\{x_1, y_1, x_2, y_2, \dots, x_k, y_k\}$, we have $e(H) \le 2k\varepsilon n < 2t\varepsilon n$. But then

$$t_r(n) + \frac{n}{r} - 1 \le \mathrm{ex}(n; tK_{r+1}) = e(G) \le t_r(n) + 2t\varepsilon n,$$

contradicting the choice of $\varepsilon$, provided $n$ is sufficiently large. □


A good many substantial general results concerning the structure of graphs in 
EX(n; F) were proved by Simonovits (1968, 1974). 

Another result based on Theorem 1.5.2, a theorem of Bollobás et al. (1978),
shows the surprisingly great difference one edge can make.

Let $q$ be a prime power and let $n = q^2 + q + 1$. Let $G$ be the graph obtained
from $K(n, n)$ by placing an Erdős–Rényi graph $G_0$, described in the proof of
Theorem 1.3.5, in each of the classes. Thus

$$e(G) = n^2 + q(q+1)^2 = q^4 + 3q^3 + 5q^2 + 3q + 1.$$

As $G_0$ has maximal degree $q + 1$ and contains no $C_4 = K(2, 2)$, the maximal $t$ for
which $G$ contains a $K(2, 2, t)$ is precisely $q + 1 \sim \sqrt{n}$. One more edge guarantees
the existence of a $K(2, 2, \lfloor \gamma n \rfloor)$ where $\gamma > 0$ is an absolute constant.


Theorem 1.5.5. There is a constant $q_0$ such that if $q \ge q_0$ is a prime power and
$n = q^2 + q + 1$ then

$$\mathrm{ex}(2n; K(2, 2, q+2)) = n^2 + q(q+1)^2.$$

Furthermore, every graph of order $2n$ and size $n^2 + q(q+1)^2 + 1$ contains a
K(2, 2, t) with t=10°-°n. 


To conclude this section, we present a theorem of Erdős and Simonovits
(1983). This result is related to Theorem 1.2.3: it concerns the number of
$\mathcal{F}$-subgraphs of a graph with $n$ vertices and substantially more than $\mathrm{ex}(n; \mathcal{F})$
edges. Similarly to the notation $k_r(G)$ used earlier, given a family $\mathcal{F}$ of graphs,
denote by $k_{\mathcal{F}}(G)$ the number of subgraphs of a graph $G$ isomorphic to elements
of $\mathcal{F}$. Thus $\mathrm{ex}(n; \mathcal{F}) = \max\{e(G) : G \text{ has } n \text{ vertices and } k_{\mathcal{F}}(G) = 0\}$. The following
result is a special case of a theorem of Erdős and Simonovits (1983), proved for
hypergraphs.


Theorem 1.5.6. Let $\mathcal{F}$ be a finite family of graphs, with each $F \in \mathcal{F}$ having at least
$t$ vertices. Then for every constant $c > 0$ there is a constant $c' > 0$ such that if $G$ is a
graph with $n$ vertices and at least $\mathrm{ex}(n; \mathcal{F}) + cn^2$ edges then $k_{\mathcal{F}}(G) \ge c'n^t$.


1.6. The asymptotic number of graphs without forbidden subgraphs 


Given a forbidden graph $F$, denote by $f(n; F)$ the number of graphs on $[n] = \{1, 2, \dots, n\}$
not containing $F$. What can we say about $f(n; F)$ as $n \to \infty$? As
always, we are particularly interested in the case $F = K_r$. Extending earlier results
of Erdős et al. (1976), Kolaitis et al. (1987) proved the following beautiful and
sharp theorem.


Theorem 1.6.1. For $r \ge 3$, $f(n; K_r)$ is asymptotic to the number of $(r-1)$-partite
graphs on $[n]$. In particular,

$$f(n; K_r) = 2^{((r-2)/2(r-1) + o(1))n^2} = 2^{(1 + o(1))\,\mathrm{ex}(n; K_r)}.$$


As we shall see, Theorem 1.6.1 and a simple application of Szemerédi's
uniformity lemma (Theorem 1.4.5) enable one to determine the asymptotic value
of $\log f(n; F)$ for every $F$ of chromatic number at least 3.

Let us start with a trivial lower bound for $f(n; F)$. If a graph $G$ on $[n]$ does not
contain our forbidden graph $F$, then no subgraph of $G$ contains $F$ and so

$$f(n; F) \ge 2^{e(G)}.$$

Since $G$ can be chosen to have $\mathrm{ex}(n; F)$ edges, we find that

$$f(n; F) \ge 2^{\mathrm{ex}(n; F)}.$$
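For the triangle $F = K_3$ the trivial bound can be compared with exact counts on very small vertex sets. The sketch below is illustrative only: the function name is hypothetical, graphs on $[n]$ are encoded as bit masks over the $\binom{n}{2}$ possible edges, and the value $\mathrm{ex}(n; K_3) = \lfloor n^2/4 \rfloor$ is taken from Turán's theorem. The enumeration is exponential in $\binom{n}{2}$, so only tiny $n$ are feasible.

```python
from itertools import combinations


def count_triangle_free(n):
    """Number of labelled graphs on n vertices containing no triangle (brute force)."""
    pairs = list(combinations(range(n), 2))
    index = {p: i for i, p in enumerate(pairs)}
    # Bit mask of the three edges of each possible triangle.
    triangles = [(1 << index[(a, b)]) | (1 << index[(a, c)]) | (1 << index[(b, c)])
                 for a, b, c in combinations(range(n), 3)]
    return sum(1 for g in range(1 << len(pairs))
               if all(g & t != t for t in triangles))


for n in range(1, 6):
    # f(n; K_3) next to the trivial lower bound 2^{ex(n; K_3)} = 2^{floor(n^2/4)}.
    print(n, count_triangle_free(n), 2 ** (n * n // 4))
```

For $n = 3$ and $n = 4$ the counts are 7 and 41, against lower bounds of 4 and 16.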


Erdős et al. (1986) showed that this trivial bound is not far from being best
possible. The key to this result is the following property of $\varepsilon$-uniform and fairly
dense pairs (see Theorem 1.4.5 and the paragraph preceding it).


Lemma 1.6.2. Let $f \ge 1$, $r \ge 2$ and $0 < \varepsilon < (r-1)^{-1/f}$, and let $V_1, \dots, V_r$ be
disjoint subsets of the vertex set $V(G)$ of a graph $G$ with $(1 - \varepsilon)\varepsilon^{f-1}|V_i| \ge 1$,
$i = 1, \dots, r$. Suppose that each pair $(V_i, V_j)$ is $\varepsilon^f$-uniform with density at least
$\varepsilon + \varepsilon^f$. Then $G$ contains every $r$-partite graph on $f$ vertices.


Proof. We apply induction on $f$. As for $f = 1$ there is nothing to prove we turn to
the induction step: we assume that $f \ge 2$ and that the lemma holds for smaller
values of $f$.

For every $i$, $2 \le i \le r$, the set $V_1$ has at most $\varepsilon^f|V_1|$ vertices joined to fewer than
$(d(V_1, V_i) - \varepsilon^f)|V_i| \ge \varepsilon|V_i|$ vertices of $V_i$. Hence there are at least
$(1 - (r-1)\varepsilon^f)|V_1| > 0$ vertices in $V_1$, each of which is joined to at least $\varepsilon|V_i|$ vertices in $V_i$,
$i = 2, \dots, r$. Let $x_1$ be such a vertex and set $W_1 = V_1 \setminus \{x_1\}$ and $W_i = \Gamma(x_1) \cap V_i$,
$i = 2, \dots, r$. Then $|W_1| = |V_1| - 1 \ge \varepsilon|V_1|$ and $|W_i| \ge \varepsilon|V_i|$ for $i = 2, \dots, r$. Hence
the sets $W_1, \dots, W_r$ satisfy the conditions of the lemma with $f$ replaced by $f - 1$.
As $x_1$ is joined to all vertices in $\bigcup_{i=2}^{r} W_i$, we are done by the induction
hypothesis. □


We know from section 1.5 that the structure of an extremal graph for $F$ is
rather close to the structure of an extremal graph for $K_r$, where $r = \chi(F)$. The
following theorem of Erdős et al. (1986) claims that any graph not containing $F$
can be turned into a graph not containing $K_r$ by the deletion of a few edges.


Theorem 1.6.3. For every $\varepsilon > 0$ and graph $F$ there is a constant $n_0 = n_0(\varepsilon, F)$ with
the following property: Let $G$ be a graph of order $n \ge n_0$ not containing $F$ as a
subgraph. Then $G$ contains a set $E'$ of at most $\varepsilon n^2$ edges such that $G \setminus E'$ contains
no $K_r$, where $r = \chi(F)$.


Proof. We may assume that $r \ge 3$ and $\varepsilon < 2/(r-1)$. Set $f = |F|$, $m = \lceil 3/\varepsilon \rceil$ and
$\varepsilon_1 = \varepsilon/4$. Let $M = M(\varepsilon_1^f, m)$ be the constant guaranteed by Szemerédi's
uniformity lemma (Theorem 1.4.5).

We claim that n, = no(e, F) = [(M + 1)/e/] will do. Indeed, let G be a graph of 
order n =n, not containing F. By Theorem 1.4.5 there is a partition U*_, V, of 
V(G) into disjoint sets such that m<k <M, |V,|<|V,|[={V,[=---=[V,|, and all 


but at most $\varepsilon_1^f k^2$ of the pairs $(V_i, V_j)$, $1 \le i < j \le k$, are $\varepsilon_1^f$-uniform. Let $E'$ be the union of the following sets of edges:

(1) the edges meeting $V_0$,

(2) the edges joining two vertices of $V_i$, $i = 1, \ldots, k$,

(3) the edges joining $V_i$ to $V_j$ for every pair $(V_i, V_j)$ which is not $\varepsilon_1^f$-uniform,

(4) the edges joining $V_i$ to $V_j$ for every pair $(V_i, V_j)$ of density less than $\varepsilon_1 + \varepsilon_1^f$.

By Lemma 1.6.2, the graph $G \setminus E'$ contains no $K_r$ since otherwise it would contain $F$ as well. Hence all we have to check is that $E'$ is small enough. This is indeed the case:


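\[
|E'| \le \frac{n^2}{k} + k\binom{n/k}{2} + \varepsilon_1^f k^2 \left(\frac{n}{k}\right)^2 + (\varepsilon_1 + \varepsilon_1^f)\binom{n}{2}
\]
\[
< n^2 \left\{ \frac{1}{k} + \frac{1}{2k} + \varepsilon_1^f + \frac{\varepsilon_1 + \varepsilon_1^f}{2} \right\}
\le n^2 \left\{ \frac{3}{2m} + 2\varepsilon_1 \right\} \le \varepsilon n^2 . \qquad \Box
\]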


From here it is a short step to the theorem of Erdős et al. (1976) concerning $f(n; F)$.

Theorem 1.6.4. Let $F$ be a graph with $r = \chi(F) \ge 3$. Then

\[
f(n; F) = 2^{(1+o(1))\,\mathrm{ex}(n; F)} = 2^{\left(\frac{r-2}{2(r-1)} + o(1)\right) n^2} .
\]


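Proof. Theorem 1.6.3 implies that if $\varepsilon > 0$ and $n$ is sufficiently large then

\[
f(n; F) \le f(n; K_r) \sum_{j \le \varepsilon n^2} \binom{\binom{n}{2}}{j} \le f(n; K_r)\,(e/\varepsilon)^{\varepsilon n^2} .
\]

Hence, by Theorem 1.6.1,

\[
f(n; F) = 2^{(1+o(1))\,\mathrm{ex}(n; K_r) + o(n^2)} = 2^{(1+o(1))\,\mathrm{ex}(n; F)} . \qquad \Box
\]
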
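It is easily seen that Theorems 1.6.3 and 1.6.4 hold for families of forbidden graphs. Thus if $\mathcal{F} = \{F_1, \ldots, F_k\}$, with
\[ \min_i \chi(F_i) \ge 3, \]
then, with the obvious definition,
\[ f(n; \mathcal{F}) = f(n; F_1, \ldots, F_k) = 2^{(1+o(1))\,\mathrm{ex}(n; \mathcal{F})} . \]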


It is interesting to formulate the last assertion in terms of monotone properties. A property $\mathcal{P}$ of graphs is an infinite class of (finite) graphs which is closed under isomorphism. A property $\mathcal{P}$ is said to be monotone if every subgraph of every member of $\mathcal{P}$ is also in $\mathcal{P}$, and it is hereditary if every induced subgraph of every member of $\mathcal{P}$ is also in $\mathcal{P}$. Thus every monotone property is also hereditary; furthermore, the intersection of a family of monotone properties is monotone, and the intersection of hereditary properties is hereditary.

Monotone properties are characterized by forbidden subgraphs. Indeed, given a family $\mathcal{F}$ of finite graphs, let $\mathcal{P}_{\mathcal{F}}$ be the class of graphs having no subgraph isomorphic to a member of $\mathcal{F}$. If $\mathcal{P}_{\mathcal{F}}$ is infinite then it is a monotone property; conversely, every monotone property is obtained in this way.

A monotone property is principal if it is obtained by forbidding a single graph. Clearly, every monotone property is the intersection of a (possibly infinite) family of principal properties.

Let us write $\mathcal{P}^n$ for the set of graphs in $\mathcal{P}$ with vertex set $[n]$. Thus $f(n; \mathcal{F}) = |\mathcal{P}_{\mathcal{F}}^n|$. The remarks above concerning $f(n; \mathcal{F})$ have the following reformulation.


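Theorem 1.6.5. Let $\mathcal{P}_1, \mathcal{P}_2, \ldots$ be monotone properties and set $\mathcal{P} = \bigcap_i \mathcal{P}_i$. Then
\[ |\mathcal{P}^n| = 2^{o(n^2)}\, |\mathcal{P}_k^n| \]
for some $k$. In particular,
\[ |\mathcal{P}^n| = 2^{o(n^2)}\, |\mathcal{Q}^n| \]
for some principal monotone property $\mathcal{Q}$ containing $\mathcal{P}$.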

Returning to $f(n; F)$, let us note that it is not known whether Theorem 1.6.4 holds for every bipartite $F$ as well. In fact, it is not even known whether Theorem 1.6.4 holds for a 4-cycle $C_4$. Since, by Theorem 1.3.5, $\mathrm{ex}(n; C_4) \sim \tfrac{1}{2} n^{3/2}$, one would like to show that

\[ f(n; C_4) = 2^{(1/2 + o(1))\, n^{3/2}} . \]

While the right-hand side is a (trivial) lower bound for $f(n; C_4)$, the best upper bound, due to Kleitman and Winston (1980), is only $2^{c n^{3/2}}$, with $c$ about 1.08.


1.7. The asymptotic number of graphs without forbidden induced subgraphs 


Recently Prömel and Steger studied the structure and number of graphs without induced forbidden subgraphs. Given a graph $F$, let $f^*(n; F)$ be the number of graphs on $[n]$ containing no induced subgraph isomorphic to $F$ (briefly, containing no induced $F$).

At least how large is $f^*(n; F)$? Suppose that there are integers $k$ and $l$ such that no $k$-partite graph, in which $l$ of the classes have been replaced by complete graphs, contains an induced $F$. Then, clearly,

\[ f^*(n; F) \ge 2^{\left(\frac{k-1}{2k} + o(1)\right) n^2} , \]

since the classes can be chosen to be almost equal and the edges between the classes can be freely chosen.

Prömel and Steger (1992, 1993a,b) proved that this simple lower bound is essentially best possible. Let $\tau(F)$ be the maximal integer $r$ such that for $k = r - 1$ there is an $l$ as above. This somewhat convoluted definition is explained by the fact that $\tau(F)$ is something like the chromatic number $\chi(F)$, which is the maximal integer $r$ such that for $k = r - 1$ no $k$-partite graph contains $F$. So the following result, whose proof is based on a generalization of Szemerédi's uniformity lemma to hypergraphs, is the exact analogue of Theorem 1.6.4.


Theorem 1.7.1. Let $F$ be a graph with $r = \tau(F) \ge 3$. Then

\[ f^*(n; F) = 2^{\left(\frac{r-2}{2(r-1)} + o(1)\right) n^2} . \]


For the case $F = C_4$, Prömel and Steger (1991) proved much more precise results. It is easily seen that $\tau(C_4) = 3$. Indeed, if $V(G)$ is the disjoint union of the sets $V_1$ and $V_2$, with $G[V_1]$ complete and $V_2$ an independent set (such graphs are known as split graphs), then $G$ does not contain an induced $C_4$. Hence, by Theorem 1.7.1, we have $f^*(n; C_4) = 2^{(1/4 + o(1)) n^2}$. In fact, considerably more is true.


Theorem 1.7.2. (i) Almost every graph containing no induced $C_4$ is a split graph: $f^*(n; C_4)$ is asymptotic to the number of split graphs on $[n]$.
(ii) There are positive constants $c_0$ and $c_1$ such that


ft(n; C,) 25 e(2"""*")in' 2 , 


where j =n (mod 2). 


What happens if we forbid a family $\mathcal{F} = \{F_1, F_2, \ldots\}$ of finite graphs as induced subgraphs? Rather surprisingly, unlike the case of forbidden subgraphs, forbidding a family $\mathcal{F}$ of induced subgraphs is very different from forbidding just one of them. Let $\mathcal{P} = \mathcal{P}_{\mathcal{F}}^*$ be the class of graphs containing no element of $\mathcal{F}$ as an induced subgraph. If $\mathcal{P}_{\mathcal{F}}^*$ is infinite then it is a hereditary property; conversely, every hereditary property is obtained in this way.

The growth of $|\mathcal{P}^n|$ for a hereditary property $\mathcal{P}$ depends on the colouring number $r(\mathcal{P})$ of $\mathcal{P}$, defined somewhat similarly to $\tau(F)$. An $(r, s)$-colouring of a graph $H$ is a map $\psi : V(H) \to [r]$ such that $H[\psi^{-1}(i)]$ is complete for $1 \le i \le s$ and is empty for $s + 1 \le i \le r$. Thus $s$ of the colour classes induce complete graphs and $r - s$ of them induce empty graphs. The colouring number $r(\mathcal{P})$ of a property $\mathcal{P}$ of graphs is the maximal $r$ for which there is an $s$, $0 \le s \le r$, such that every $(r, s)$-colourable graph has property $\mathcal{P}$. Equivalently, $r(\mathcal{P}_{\mathcal{F}}^*) = \max\{r : \text{for some } s,\ 0 \le s \le r, \text{ no } F \in \mathcal{F} \text{ is } (r, s)\text{-colourable}\}$. Note that

\[ r(\mathcal{P}_{\mathcal{F}}^*) \le \inf_{F \in \mathcal{F}} \{\tau(F) - 1\} , \]

with equality if $\mathcal{F} = \{F\}$ but, in general, the inequality may be strict.

Alekseev (1993) and Bollobás and Thomason (1994b) determined the asymptotic size of $\mathcal{P}^n$ for a hereditary property, thereby extending Theorems 1.6.4 and 1.7.1, concerning principal properties.


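Theorem 1.7.3. Let $\mathcal{P}$ be a hereditary property of graphs and let $\mathcal{P}^n$ be the set of graphs in $\mathcal{P}$ with vertex set $[n]$. Then

\[ |\mathcal{P}^n| = 2^{(1 - 1/r + o(1))\, n^2/2} , \]

where $r = r(\mathcal{P})$ is the colouring number of $\mathcal{P}$.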
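
For small forbidden graphs the colouring number can be found by brute force, which may help in reading the definition. The following is only a sketch under stated assumptions: the function names `is_rs_colourable` and `colouring_number`, the cap `r_max` and the two sample graphs (a triangle and a 4-cycle) are illustrative choices, not part of the text. Colour classes are allowed to be empty, and the search only reports the largest $r \le$ `r_max`.

```python
from itertools import combinations, product

def is_rs_colourable(n, edges, r, s):
    """True if the graph on vertices 0..n-1 has an (r, s)-colouring:
    classes 0..s-1 induce complete graphs, classes s..r-1 induce empty graphs.
    Colour classes may be empty."""
    edge_set = {frozenset(e) for e in edges}
    for colouring in product(range(r), repeat=n):
        if all(
            (frozenset((u, v)) in edge_set) == (colouring[u] < s)
            for u, v in combinations(range(n), 2)
            if colouring[u] == colouring[v]
        ):
            return True
    return False

def colouring_number(family, r_max=4):
    """Largest r <= r_max such that, for some s with 0 <= s <= r, no member of
    the forbidden family is (r, s)-colourable; 0 if there is no such r."""
    best = 0
    for r in range(1, r_max + 1):
        for s in range(r + 1):
            if not any(is_rs_colourable(n, e, r, s) for n, e in family):
                best = r
                break
    return best

# Illustrative forbidden graphs, given as (number of vertices, edge list).
TRIANGLE = (3, [(0, 1), (1, 2), (0, 2)])
FOUR_CYCLE = (4, [(0, 1), (1, 2), (2, 3), (3, 0)])

for name, family in [("K3", [TRIANGLE]), ("C4", [FOUR_CYCLE]),
                     ("K3 and C4", [TRIANGLE, FOUR_CYCLE])]:
    r = colouring_number(family)
    # (1 - 1/r)/2 is the coefficient of n^2 in the exponent of Theorem 1.7.3.
    print(name, "r =", r, "coefficient of n^2:", (1 - 1 / r) / 2)
```

In this sketch each of the two graphs alone gives colouring number 2, while forbidding both together gives 1, so the coefficient of $n^2$ drops from 1/4 to 0.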


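This result implies that the analogue of Theorem 1.6.5 does not hold for hereditary properties: the intersection of two hereditary properties may be substantially smaller than either of the properties. For example, if $\mathcal{F}_1 = \{K_3\}$, $\mathcal{F}_2 = \{C_4\}$, $\mathcal{P}_i = \mathcal{P}_{\mathcal{F}_i}^*$, $i = 1, 2$, and $\mathcal{P} = \mathcal{P}_1 \cap \mathcal{P}_2$ then

\[ |\mathcal{P}_i^n| = 2^{(1/2 + o(1))\, n^2/2} \]

for $i = 1, 2$, but

\[ |\mathcal{P}^n| = 2^{o(n^2)} . \]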


In conclusion, let us note that the analogous problem for uniform hypergraphs is unsolved. If $\mathcal{P}$ is a property of $k$-graphs ($k$-uniform hypergraphs) then, as implied by some results of Alekseev (1982) and Bollobás and Thomason (1994a),

\[ |\mathcal{P}^n| = 2^{(c + o(1)) \binom{n}{k}} \]

for some constant $c$. However, for $k \ge 3$ the possible values for $c$ are not known.


2. Cycles 


In section 1 we discussed the forbidden subgraph problem for a fixed family of forbidden graphs $\mathcal{F}$ and found this problem to be fairly well understood, provided $\mathcal{F}$ contains no bipartite graph. What can we say about graphs of order $n$ not containing any member of a family $\mathcal{F}_n$ of forbidden graphs, where $\mathcal{F}_n$ depends on $n$? The most frequently studied and best understood case of this problem is when $\mathcal{F}_n$ consists of cycles. In this section we shall discuss some of the results concerning this problem.


2.1. Hamilton cycles 


What values of various graph parameters ensure that a graph has a Hamilton cycle? Let us start with the number of edges ensuring a Hamilton cycle: what is $\mathrm{ex}(n; C_n)$? Since a Hamiltonian graph has minimal degree at least 2, every graph of order $n$ and size $\mathrm{ex}(n; C_n) + 1$ must have minimal degree at least 2. It is immediate that the minimal number of edges ensuring that a graph of order $n$ has minimal degree at least 2 is $\binom{n-1}{2} + 2$: adding a vertex $x$ to $K_{n-1}$ and joining $x$ to one vertex of $K_{n-1}$ we obtain the unique graph of order $n$, size $\binom{n-1}{2} + 1$, and minimal degree at most 1 (and so precisely 1). A moment's thought shows that the Hamilton cycle problem has the same solution: $\mathrm{ex}(n; C_n) = \binom{n-1}{2} + 1$, with the same extremal graph.

Although this seems somewhat disappointing, all it shows is that the size in itself is not very effective in forcing a Hamilton cycle. The minimal degree is considerably better. (Contrast this with the remarks following Theorem 1.1.1 in the previous section.) Dirac (1952) proved that a graph of order $n$ and minimal degree at least $n/2$ is Hamiltonian; the graph $K(\lfloor (n-1)/2 \rfloor, \lceil (n+1)/2 \rceil)$ shows that the result is best possible. This theorem of Dirac started the search for various degree conditions that, coupled with some other conditions, like a bound on the connectedness, imply that the graph is Hamiltonian.

As shown by Ore (1960), Dirac's theorem is implied by the following simple lemma, essentially due to Dirac.


Lemma 2.1.1. Let $x_1$ and $x_n$ be non-adjacent vertices in a graph $G$ of order $n$ such that $d(x_1) + d(x_n) \ge n$. Then $G$ is Hamiltonian iff $G + x_1x_n$ is Hamiltonian.


Proof. Suppose there is a Hamilton cycle in $G + x_1x_n$. If this cycle does not contain $x_1x_n$ then $G$ is Hamiltonian so we are done. Otherwise $G$ contains a Hamilton path $x_1x_2 \cdots x_n$. Since $d(x_1) + d(x_n) \ge n$, there is an index $i$, $2 \le i \le n$, such that $x_1$ is joined to $x_i$ and $x_n$ is joined to $x_{i-1}$. But then $x_1x_2 \cdots x_{i-1}x_nx_{n-1} \cdots x_i$ is a Hamilton cycle. □
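
The exchange step in this proof is easy to carry out mechanically. The sketch below is illustrative only: the function name `hamilton_cycle_from_path`, the adjacency-dictionary representation and the five-vertex example graph are assumptions made for the demonstration. It scans for an index $i$ as in the proof and returns the rearranged cycle; the degree condition of the lemma guarantees that such an index exists, while without it the function may return `None`.

```python
def hamilton_cycle_from_path(path, adj):
    """path: list of all vertices, forming a Hamilton path x_1, ..., x_n.
    adj: dict mapping each vertex to the set of its neighbours.
    Returns the vertices of a Hamilton cycle in cyclic order, or None if no
    suitable index i exists."""
    first, last = path[0], path[-1]
    for p in range(1, len(path)):
        # path[p] plays the role of x_i and path[p - 1] that of x_{i-1}.
        if path[p] in adj[first] and path[p - 1] in adj[last]:
            # x_1 ... x_{i-1}, then x_n x_{n-1} ... x_i, closing back to x_1.
            return path[:p] + path[p:][::-1]
    return None

# Hypothetical example: the path 0-1-2-3-4 plus the chords 0-2, 0-3 and 2-4,
# so that d(0) + d(4) = 3 + 2 = 5 = n and the end vertices are non-adjacent.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {0, 2, 4}, 4: {2, 3}}
cycle = hamilton_cycle_from_path([0, 1, 2, 3, 4], adj)
print(cycle)  # [0, 1, 2, 4, 3]
```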


Thus if a graph $G$ is not Hamiltonian and $x$, $y$ are non-adjacent vertices such that $d(x) + d(y) \ge n$ then $G' = G + xy$ is not Hamiltonian either. Of course, if in $G'$ we can find non-adjacent vertices $x'$, $y'$ such that $d'(x') + d'(y') \ge n$, where $d'$ denotes the degree in $G'$, then $G'' = G' + x'y'$ is not Hamiltonian either, and so on. This led Bondy and Chvátal (1976) to introduce the $k$-closure of a graph. The $k$-closure $C_k(G)$ of a graph $G$ is the minimal graph $H$ containing $G$ such that for any two non-adjacent vertices $x$, $y$ of $H$ we have $d_H(x) + d_H(y) \le k - 1$. In other words, $C_k(G)$ is the unique graph obtained from $G$ by successively joining all pairs of non-adjacent vertices the sum of whose degrees is at least $k$. Call a property $P$ of graphs $k$-stable if whenever $x$, $y$ are non-adjacent vertices of $G$ such that $d(x) + d(y) \ge k$, and $G + xy$ has property $P$ then so does $G$. By definition, if $P$ is $k$-stable and $C_k(G)$ has $P$ then $G$ has $P$.
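
The closure operation can be computed by repeatedly joining qualifying pairs until none is left. The following is a minimal sketch, assuming vertices are labelled $0, \ldots, n-1$; the name `k_closure` and the two test graphs are illustrative and not taken from the text.

```python
from itertools import combinations

def k_closure(n, edges, k):
    """Return the k-closure of the graph on vertices 0..n-1 as an adjacency
    dict: non-adjacent pairs with degree sum at least k are joined repeatedly
    until no such pair remains."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        for u, v in combinations(range(n), 2):
            if v not in adj[u] and len(adj[u]) + len(adj[v]) >= k:
                adj[u].add(v)
                adj[v].add(u)
                changed = True
    return adj

# K(3,3) has order 6 and minimal degree 3 = n/2, so its 6-closure is complete.
k33 = [(u, v) for u in range(3) for v in range(3, 6)]
closure = k_closure(6, k33, 6)
print(all(len(nbrs) == 5 for nbrs in closure.values()))  # True

# A path on 4 vertices gains no edges in its 4-closure.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(k_closure(4, path_edges, 4) == {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}})  # True
```

The first example is the situation of Dirac's theorem, where the $n$-closure is the complete graph.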

Lemma 2.1.1 states precisely that the property of being Hamiltonian (for graphs of order $n$) is $n$-stable. (In fact, the proof of Lemma 2.1.1 shows that the property of containing a cycle of length at least $k$ is also $n$-stable; and it is easily seen that the property of containing a path of length at least $l$ is $(n-1)$-stable.) Thus if $C_n(G)$ is Hamiltonian so is $G$. In particular, Lemma 2.1.1 implies Dirac's theorem, from whose proof the lemma was distilled.


Theorem 2.1.2. Let $G$ be a graph of order $n \ge 3$ and minimal degree at least $n/2$. Then $G$ is Hamiltonian.


Proof. Note that $C_n(G)$ is the complete graph $K_n$. Since $K_n$ is Hamiltonian, so is $G$. □


The closure operation enables one to prove the theorem of Las Vergnas (1971) 
for the existence of a Hamilton cycle. 


Theorem 2.1.3. Let $G$ be a graph with vertex set $\{x_1, x_2, \ldots, x_n\}$. Suppose there are no indices $i$ and $j$ such that $x_ix_j$ is not an edge, $d(x_i) + d(x_j) \le n - 1$, $d(x_i) \le i$, $d(x_j) \le j - 1$ and $j \ge \max\{i + 1, n - i\}$. Then $G$ is Hamiltonian.


As an immediate consequence of this result, one obtains Chvátal's (1972) theorem answering a very natural extremal question concerning Hamilton cycles: what sequences $d_1, d_2, \ldots, d_n$ guarantee that if the $i$th vertex of a graph $G$ of order $n$ has degree at least $d_i$ then $G$ is Hamiltonian? By Dirac's theorem, $\lceil n/2 \rceil, \lceil n/2 \rceil, \ldots, \lceil n/2 \rceil$ is such a sequence.


Theorem 2.1.4. (i) Let $d_1 \le d_2 \le \cdots \le d_n$ be the degree sequence of a graph of order $n \ge 3$. Suppose

\[ d_k \le k < \frac{n}{2} \quad \text{implies} \quad d_{n-k} \ge n - k . \tag{1} \]

Then if $G$ has vertex set $\{x_1, x_2, \ldots, x_n\}$ and $d(x_i) \ge d_i$ for every $i$, then $G$ is Hamiltonian.
(ii) If $(d_i)_1^n$ is the degree sequence of a graph and (1) fails then there is a non-Hamiltonian graph with vertex set $\{x_1, x_2, \ldots, x_n\}$ such that $d(x_i) \ge d_i$ for every $i$.
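
Condition (1) is straightforward to test for a given degree sequence. The sketch below is illustrative (the function name `satisfies_chvatal_condition` is an assumption, not notation from the text); it sorts the sequence and checks the implication for every integer $k < n/2$.

```python
def satisfies_chvatal_condition(degrees):
    """Check condition (1): for every k < n/2, d_k <= k implies d_{n-k} >= n-k,
    where d_1 <= ... <= d_n is the sorted degree sequence (1-based indices)."""
    d = sorted(degrees)
    n = len(d)
    for k in range(1, (n + 1) // 2):  # exactly the integers k with k < n/2
        if d[k - 1] <= k and d[n - k - 1] < n - k:
            return False
    return True

print(satisfies_chvatal_condition([3] * 6))          # True: Dirac's condition, n = 6
print(satisfies_chvatal_condition([1, 3, 3, 3, 4]))  # False: K_4 plus a pendant vertex
print(satisfies_chvatal_condition([2, 2, 2, 3, 3]))  # False: degrees of K(2,3)
print(satisfies_chvatal_condition([2] * 5))          # False, although C_5 is Hamiltonian
```

The last line illustrates part (ii): the sequence of $C_5$ fails (1), and indeed the non-Hamiltonian graph $K(2,3)$ has degrees at least $2, 2, 2, 2, 2$.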


Analogous results hold for Hamilton paths: if $C_{n-1}(G)$ has a Hamilton path then so does $G$, and condition (1) gets replaced by the condition that $d_k \le k - 1 < \tfrac{1}{2}(n - 1)$ implies that $d_{n+1-k} \ge n - k$.

There are numerous other sufficient conditions for a graph to be Hamiltonian that do not demand that the vertices have very large degrees. The first notable result of this kind was proved by Nash-Williams (1971). Let us write $\alpha(G)$ for the independence (or stability) number of a graph $G$, i.e., for the maximal cardinality of an independent set of vertices.


Theorem 2.1.5. Let $G$ be a 2-connected graph of order $n$ and minimal degree $\delta(G) \ge (n + 2)/3$. If $\delta(G) \ge \alpha(G)$ then $G$ is Hamiltonian.


In proving Theorem 2.1.5, Nash-Williams made use of the following important lemma.

Lemma 2.1.6. Let $C$ be a longest cycle in a non-Hamiltonian graph $G$ with $n$ vertices. If $G - C$ has a component with at least 2 vertices then $\delta(G) \le (n + 1)/3$.


This lemma has several extensions, including those by Jackson (1980) and Jung (1984).

Häggkvist (1980, 1989) proved the following deep and useful characterization of Hamiltonian graphs of fairly large minimal degree.


Theorem 2.1.7. Every 2-connected non-Hamiltonian graph with n vertices and 
minimal degree 5 2 4(n — 1) contains a set S of m2 36 —n +2 > 4n vertices such 
that in the graph G ~ S the vertex set cannot be covered by m paths. 


Note that in Theorem 2.1.4 one allows $d(x_i)$ to be strictly greater than $d_i$. As the following beautiful theorem of Jackson (1980) shows, if we demand that the graph is 2-connected and every vertex has degree precisely $d$, then a rather small value of $d$ guarantees that the graph is Hamiltonian. The proof of this theorem is based on Jackson's extension of Lemma 2.1.6.


Theorem 2.1.8. Let $G$ be a 2-connected $d$-regular graph of order $n$. If $d \ge n/3$ then $G$ is Hamiltonian.


The Petersen graph shows that, as stated, Theorem 2.1.8 is best possible, at least for $d = 3$. It is easily seen that it is close to being best possible for every $d \ge 3$.

What happens if our graph is not only 2-connected but also $k$-connected for some $k \ge 3$? At first sight it seems likely that a considerably smaller degree of regularity will suffice to imply that the graph is Hamiltonian. In particular, as conjectured by Bollobás (1978a, p. 167, Conjecture 36), it seems likely that if $G$ is a $d$-regular $k$-connected graph with $n$ vertices and $d \ge n/(k + 1)$ then $G$ is Hamiltonian. Jackson and Jung showed that this is false for $k \ge 4$.

The examples indicate that for a fixed value of $k$, $k$-connectedness is hardly any more use in finding Hamilton cycles in regular graphs than 3-connectedness. However, the conjecture may well be true for $k = 3$: if $G$ is a 3-connected $d$-regular graph with $n$ vertices and $d \ge n/4$ then $G$ is Hamiltonian. This was conjectured by Häggkvist as well.

Recently Li Hao (1989a) took the first step towards proving this conjecture by 
showing that if we demand 3-connectedness then the degree of regularity can be 
allowed to drop substantially below the n/3 bound in Theorem 2.1.8. 


Theorem 2.1.9. Let G be a 3-connected d-regular graph of order n. If d= in then 
G is Hamiltonian. 


Note that Theorem 2.1.5 is another extension of Theorem 2.1.2. The following rather simple result in the vein of Theorem 2.1.5 is due to Chvátal and Erdős (1972).

Theorem 2.1.10. Suppose $G$ has at least three vertices and it is $\alpha(G)$-connected. Then it is Hamiltonian.


Proof. Let $k = \alpha(G)$. We may assume that $k \ge 2$ (if $k = 1$ then $G$ is complete), so $G$ has a longest cycle $C$. Then $|C| \ge \delta(G) + 1 \ge k + 1$. Assume that $C$ is not a Hamilton cycle, i.e., there is a vertex $x \in G - C$. Since $G$ is $k$-connected, there are $k$ independent paths from $x$ to $C$, i.e., there are $x$–$x_i$ paths ($i = 1, \ldots, k$) such that any two of them have only the vertex $x$ in common, and any one of them has only the vertex $x_i$ on $C$.

Giving $C$ some orientation, let $x_i^+$ be the successor of $x_i$ on $C$ for $i = 1, \ldots, k$. Then, since $C$ is a longest cycle, the set $S = \{x, x_1^+, x_2^+, \ldots, x_k^+\}$ is an independent set, contradicting our assumption that $\alpha(G) \le k$. Hence $C$ is a Hamilton cycle. □


Given a set $S$ of vertices of a graph $G$, denote by $N(S)$ the set of neighbours of $S$: $N(S) = \{x \in G : xy \in E(G) \text{ for some } y \in S\}$. Fraisse (1986) proved the following essentially best possible condition for a $k$-connected graph to be Hamiltonian.


Theorem 2.1.11. Let $G$ be a $k$-connected graph of order $n$. Suppose that $|N(S)| > k(n - 1)/(k + 1)$ whenever $S$ is an independent set of $k$ vertices. Then $G$ is Hamiltonian.


The following graph constructed by Skupień (1979) shows that Theorem 2.1.11 is close to being best possible: let $n = (k + 1)q + k$ and let $G$ be obtained from the vertex-disjoint union of $K_k$ and $k + 1$ copies of $K_q$, by joining each vertex of $K_k$ to every other vertex. Then $G$ is a $k$-connected non-Hamiltonian graph of order $n$, in which any $k$ independent vertices have $n - k - q = kq = k(n - k)/(k + 1)$ neighbours.
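
The neighbourhood count in this construction can be confirmed by brute force for small parameters. The sketch below is only a check of the arithmetic; the function names `skupien_graph` and `neighbourhood_sizes_of_independent_sets`, and the vertex labelling (the first $k$ labels form $K_k$), are illustrative assumptions.

```python
from itertools import combinations

def skupien_graph(k, q):
    """Adjacency dict of K_k plus k+1 disjoint copies of K_q, with every vertex
    of K_k joined to every other vertex. Vertices 0..k-1 form K_k."""
    n = (k + 1) * q + k
    adj = {v: set() for v in range(n)}

    def join(u, v):
        adj[u].add(v)
        adj[v].add(u)

    hub = range(k)
    for u, v in combinations(hub, 2):
        join(u, v)
    for c in range(k + 1):
        block = range(k + c * q, k + (c + 1) * q)
        for u, v in combinations(block, 2):
            join(u, v)
        for u in block:
            for h in hub:
                join(u, h)
    return adj

def neighbourhood_sizes_of_independent_sets(adj, k):
    """Set of values |N(S)| over all independent k-sets S."""
    sizes = set()
    for S in combinations(adj, k):
        if all(v not in adj[u] for u, v in combinations(S, 2)):
            sizes.add(len(set().union(*(adj[v] for v in S))))
    return sizes

for k, q in [(2, 3), (3, 2)]:
    G = skupien_graph(k, q)
    n = len(G)
    print(k, q, n, neighbourhood_sizes_of_independent_sets(G, k), k * (n - k) // (k + 1))
```

For both parameter choices every independent $k$-set has exactly $kq$ neighbours, in agreement with the formula above.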

Recently Häggkvist (1989) proved the following substantial extension of Theorem 2.1.5.


Theorem 2.1.12. Let G be a non-Hamiltonian 2-connected graph of order n, 
independence number a <(n + 1)/2 and minimal degree 6 =(n + 2)/3. Then, for 
every k, 1<k<6 +1, there exists an independent set S of k vertices such that 


~ — |N(S)| <max{a — 1,n-26+k-2}. 

A consequence of Theorem 2.1.12 is that if $G$ is a 2-connected non-Hamiltonian graph of order $n$ with minimal degree $\delta \ge (n + 2)/3$ then it contains an independent set of at least $(n + 14)/6$ vertices with at most $(n - 1)/2$ neighbours in total.

2.2. Edge-disjoint Hamilton cycles 


Suppose the conditions on some set of graph parameters imply that our graph must contain a Hamilton cycle. Does our graph have to have many Hamilton cycles? Does it have to have many edge-disjoint Hamilton cycles? The following striking theorem of Nash-Williams (1971), whose proof is based on Lemma 2.1.6, shows that this is the case if the parameter is the minimal degree. To be precise, Nash-Williams proved the following substantial extension of Dirac's theorem, Theorem 2.1.2.


Theorem 2.2.1. Let $G$ be a graph of order $n$ and minimal degree at least $n/2$. Then $G$ contains a set of $\lfloor 5(n + 10)/224 \rfloor$ edge-disjoint Hamilton cycles.


Once again, if we demand that our graph be regular then we can guarantee considerably more edge-disjoint Hamilton cycles. Jackson (1979) made use of his Theorem 2.1.8 to deduce the following result.


Theorem 2.2.2. Let $G$ be a $d$-regular graph of order $n \ge 14$. If $d \ge (n - 1)/2$ then $G$ contains a set of $\lfloor (3d - n + 1)/6 \rfloor$ edge-disjoint Hamilton cycles.


Theorem 2.2.1 is rather far from being best possible. In the case when the minimal degree is a little larger than $n/2$, Häggkvist (1990) proved the following deep results that are essentially best possible.


Theorem 2.2.3. Let $\lambda > \tfrac{1}{2}$. If $n$ is sufficiently large and $G$ is a graph of order $n$ and minimal degree at least $\lambda n$, then $G$ has a set of $\lfloor n/8 \rfloor$ edge-disjoint Hamilton cycles.


Theorem 2.2.4. Let $\lambda > \tfrac{1}{2}$. If $n$ is sufficiently large and $G$ is a $d$-regular graph of order $n$, where $d$ is an even integer not less than $\lambda n$, then $G$ has a Hamilton decomposition, i.e., the edge set of $G$ can be partitioned into $d/2$ Hamilton cycles.


To see that, in some sense, Häggkvist's Theorem 2.2.3 is essentially best possible, consider the following graph $G$ given by Nash-Williams (1970). Take the complete bipartite graph with vertex sets $U = \{u_1, \ldots, u_{4k+1}\}$ and $W = \{w_1, \ldots, w_{4k-1}\}$, and add to it the edges $u_1u_2, u_3u_4, \ldots, u_{4k-1}u_{4k}$ and $u_{4k}u_{4k+1}$. The obtained graph $G$ has $n = 8k$ vertices and minimal degree $4k$. Note that every Hamilton cycle in $G$ has to contain two of the $2k + 1$ edges in $U$, so $G$ has at most $\lfloor (2k + 1)/2 \rfloor = k = n/8$ edge-disjoint Hamilton cycles.

Li Hao (1989b) proved a conjecture of Faudree and Schelp that if Ore's condition in Lemma 2.1.1 is satisfied and the graph has small minimal degree then there are many edge-disjoint cycles.


Theorem 2.2.5. Let $G$ be a graph with $n$ vertices and minimal degree $\delta$ such that $n \ge 2\delta^2$ and the degree sum of any two non-adjacent vertices is at least $n$. Then the graph contains $k = \lfloor (\delta - 1)/2 \rfloor$ edge-disjoint cycles of lengths $l_1, l_2, \ldots, l_k$, for all $3 \le l_1 \le l_2 \le \cdots \le l_k \le n$.


2.3. Long cycles


For a graph $G$, let $C(G)$ be the set of lengths of cycles in $G$. The circumference of $G$ is the length of a longest cycle: $c(G) = \max C(G)$; the girth of $G$ is the length of a shortest cycle: $g(G) = \min C(G)$. What do various natural graph parameters (size, minimal degree, connectivity, etc.) tell us about $c(G)$, $g(G)$ and $C(G)$?

Let $x_1x_2 \cdots x_l$ be a longest path in a graph $G$, and let $k = \max\{i : x_1 \text{ is joined to } x_i\}$. Then $k \ge d(x_1) + 1 \ge \delta(G) + 1$ so, in particular, if $\delta(G) \ge 2$ then $c(G) \ge \delta(G) + 1$. This trivial observation was strengthened considerably by Alon (1986) to a result including Dirac's theorem (Theorem 2.1.2): if $\delta(G) \ge n/k$ then $c(G) \ge \lfloor n/(k - 1) \rfloor$. The theorem was extended slightly by Egawa and Miyamoto (1989) and Bollobás and Häggkvist (1990) to the following best possible result.
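
The trivial observation above is constructive. The sketch below is a variant under an explicit assumption: instead of a longest path it uses a path that merely cannot be extended at its first vertex, which is enough for the argument since all neighbours of that vertex then lie on the path. The function name `long_cycle` and the choice of the Petersen graph as an example are illustrative.

```python
def long_cycle(adj):
    """adj: dict vertex -> set of neighbours, minimal degree at least 2.
    Returns the vertices, in order, of a cycle with at least d(x_1) + 1
    vertices, where x_1 is the first vertex of a path that cannot be
    extended at that end."""
    start = next(iter(adj))
    path = [start]
    on_path = {start}
    while True:
        fresh = [w for w in adj[path[0]] if w not in on_path]
        if not fresh:
            break  # every neighbour of path[0] already lies on the path
        path.insert(0, fresh[0])
        on_path.add(fresh[0])
    position = {v: i for i, v in enumerate(path)}
    k = max(position[w] for w in adj[path[0]])  # furthest neighbour of x_1
    return path[: k + 1]  # x_1 ... x_k closes up through the edge x_k x_1

# Example: the Petersen graph (outer 5-cycle, spokes, inner pentagram).
petersen = {v: set() for v in range(10)}
for i in range(5):
    for u, v in [(i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)]:
        petersen[u].add(v)
        petersen[v].add(u)

cycle = long_cycle(petersen)
print(len(cycle) >= 3 + 1)  # True: at least delta + 1 vertices
```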


Theorem 2.3.1. Suppose 2<k <n are integers and G is a graph of order n and 
minimal degree at least n/k. Then c((G)2=n/(k — 1). Furthermore, for 2<k<n 
there is a graph G of order n such that 6(G) = [n/(k — 1)] — 1 and C(G) = [n/(k — 
1)). 


In fact, recently Bollobás and Brightwell (1993) extended Theorem 2.3.1 to the following result, whose proof turned out to be considerably easier than the proofs of Theorem 2.3.1.


Theorem 2.3.1'. Let $G$ be a graph of order $n$ with a set $W$ of $w \ge 3$ distinguished vertices. Suppose that every vertex of $W$ has degree at least $d \ge 2$ and let $s = \lceil w/(\lceil n/d \rceil - 1) \rceil \ge 3$. Then there is a cycle in $G$ containing at least $s$ vertices of $W$.


If we demand that our graph is 2-connected then we can guarantee a considerably longer cycle: as proved by Dirac (1952), if $G$ is 2-connected then $c(G) \ge \min\{|G|, 2\delta(G)\}$. The following extension of a theorem of Pósa (1963) was proved by Bondy (1971a).


Theorem 2.3.2. Let 3<c<n and let G be a 2-connected graph of order n with 
vertex set {x,,X2,...,X,} such that 2<d(x,)<d(x,)<---<d(x,). Suppose also 
that if d, <k<c/2,k<l, d,<land x,x,€E(G) thenk+12c+1. Then (G)= 
c. 


Bondy proved also that if in a graph of order $n$ the degree sum of any three independent vertices is at least $m \ge n + 2$ then $c(G) \ge \min\{n, 2m/3\}$, and conjectured the following much stronger result, proved by Fournier and Fraisse (1985) (cf. Theorem 2.1.8).


Theorem 2.3.3. Let $G$ be a $k$-connected graph of order $n$, where $k \ge 2$, such that the degree sum of any $k + 1$ independent vertices is at least $m$. Then $c(G) \ge \min\{n, 2m/(k + 1)\}$.

Erdős and Gallai (1959) determined the minimal size of a graph of order $n$ guaranteeing that the circumference is at least $c$.


Theorem 2.3.4. Let $3 \le c \le n$. Then the circumference of a graph of order $n$ and size $\lfloor (c - 1)(n - 1)/2 \rfloor + 1$ is at least $c$.


A graph $G$ of order $n$ is pancyclic if $C(G) = [3, n] = \{3, 4, \ldots, n\}$, i.e., if $G$ contains a cycle of every possible length. We do know that $\lfloor n^2/4 \rfloor$ edges do not guarantee a triangle $C_3$, and many more edges are needed to guarantee a Hamilton cycle. However, as the following theorem of Bondy (1971b) shows, if a graph has more than $\lfloor n^2/4 \rfloor$ edges then a cycle of length $l > 3$ guarantees a cycle of length $l - 1$.


Theorem 2.3.5. Let $G$ be a graph of order $n$ with more than $\lfloor n^2/4 \rfloor$ edges. Then $c(G) \ge \lfloor \tfrac{1}{2}(n + 3) \rfloor$ and $C(G) = [3, c(G)]$. In particular, if $G$ is also Hamiltonian then it is pancyclic.


How large a minimal degree ensures that a graph $G$ of order $n$ is pancyclic? In view of Theorem 2.3.5 the answer is $\lfloor n/2 \rfloor + 1$, the degree ensuring the existence of a triangle. If $G$ is not bipartite then, as proved by Häggkvist (1982), already $\delta(G) \ge (2n + 1)/5$ ensures the existence of a triangle. Amar et al. (1983) proved that if $G$ is also Hamiltonian, then the same condition guarantees that the graph is pancyclic, and Shi (1986) showed the following slight extension of this result.


Theorem 2.3.6. Let $G$ be a non-bipartite Hamiltonian graph of order $n$ such that for any two non-adjacent vertices $x$ and $y$ we have $d(x) + d(y) \ge (4n + 1)/5$. Then $G$ is pancyclic.


It is easily seen that Theorem 2.3.6 is best possible. Indeed, let $G$ be the $2k$-regular graph of order $n = 5k$ with vertex set $V = \bigcup_{i=1}^{5} V_i$, where $|V_1| = \cdots = |V_5| = k$ and with edges joining $V_i$ to $V_{i+1}$ for $i = 1, \ldots, 5$, where $V_6 = V_1$. Then $G$ is not pancyclic because it contains no triangles.

Woodall (1972) determined the minimal number of edges ensuring that a graph $G$ of order $n$ and minimal degree $\delta$ satisfies $C(G) \supseteq [3, l]$. Here we state only a consequence of this result.


Theorem 2.3.7. Let 3=(n + 3)/2/ 2 and let G be a graph of order n and size 
oa Coe) Fi 
Pe es ar oa a 
Then $C(G) \supseteq [3, l]$. The bound is best possible.


Although a graph with fewer than $\lfloor n^2/4 \rfloor$ edges cannot be guaranteed to have any odd cycles, it can be guaranteed to have even cycles, both short and long. The following deep and almost best possible result was conjectured by Erdős (1965) and proved by Bondy and Simonovits (1974).


Theorem 2.3.8. Let $k$ be a natural number. Every graph of order $n$ and size at least $90kn^{1+1/k}$ contains a cycle of length $2l$ for every integer $l$ in the interval $k \le l \le kn^{1/k}$.


2.4. Girth and diameter 


What forces a graph to have small girth, i.e., short cycles? Many edges, or almost equivalently, large minimal degree. To study the connection between the minimal degree and the girth, for natural numbers $\delta \ge 2$ and $g \ge 3$ define

\[ n(g, \delta) = \min\{|G| : g(G) \ge g \text{ and } \delta(G) \ge \delta\} . \]

A graph of minimal degree $\delta$, girth at least $g$ and order $n(g, \delta)$ is said to be a $(\delta, g)$-cage.

It is not entirely immediate that $n(g, \delta) < \infty$, i.e., there are finite graphs of arbitrarily large girth and arbitrarily large minimal degree. However, this does follow from a simple argument using random graphs.

A cycle of length $g$ shows that $n(g, 2) = g$ so we shall assume that $\delta \ge 3$. By estimating the number of vertices at distance $d$ from a vertex or from an edge, one gets the following trivial lower bound on $n(g, \delta)$.


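Theorem 2.4.1. If $\delta \ge 3$ then

\[
n(g, \delta) \ge
\begin{cases}
1 + \dfrac{\delta}{\delta - 2}\left\{(\delta - 1)^{(g-1)/2} - 1\right\} & \text{if $g$ is odd,} \\[6pt]
\dfrac{2}{\delta - 2}\left\{(\delta - 1)^{g/2} - 1\right\} & \text{if $g$ is even.}
\end{cases}
\]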


It is easily seen that in Theorem 2.4.1 equality holds for $\delta = 3$, $g = 3, 4, 5, 6$ and 8, and for $g = 4$ and all $\delta \ge 3$. For example, $n(5, 3) = 10$ is shown by the Petersen graph; the extremal graph for $z(7; 4) = 21$ (see Theorem 1.3.3) shows that $n(6, 3) = 14$ (thus the vertices are the 7 points and 7 lines of the projective plane PG(2, 2), with a point joined to a line if they are incident); the graph $K(\delta, \delta)$ shows that $n(4, \delta) = 2\delta$.
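
The trivial lower bound is just a count of the vertices in a breadth-first tree grown from a vertex (odd girth) or from an edge (even girth), and the values quoted above are easy to reproduce. The sketch below is illustrative; the function name `moore_bound` is an assumption, and the bound is written as a geometric sum rather than in closed form.

```python
def moore_bound(g, delta):
    """Trivial lower bound on n(g, delta) for delta >= 3 and g >= 3."""
    if g % 2 == 1:
        D = (g - 1) // 2
        # One root, then delta * (delta - 1)^i vertices at distance i + 1.
        return 1 + delta * sum((delta - 1) ** i for i in range(D))
    D = g // 2
    # Two ends of an edge, each with (delta - 1)^i vertices at distance i.
    return 2 * sum((delta - 1) ** i for i in range(D))

print(moore_bound(5, 3))   # 10, attained by the Petersen graph
print(moore_bound(6, 3))   # 14, attained by the incidence graph of PG(2, 2)
print(moore_bound(4, 7))   # 14 = 2 * delta, attained by K(7, 7)
```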

Suppose that $g \ge 3$, $\delta \ge 3$ and $G_0$ is a graph showing that equality holds in Theorem 2.4.1. If $g$ is odd, say $g = 2D + 1$, then $G_0$ is $\delta$-regular and has diameter $D$; also $n(g, \delta)$ is the maximal order of a graph with maximal degree at most $\delta$ and diameter at most $D$. If $g = 2D + 2$ then $G_0$ is $\delta$-regular and every vertex is within distance $D$ of every edge (in fact, of every pair of vertices); also $n(g, \delta)$ is the maximal order of a graph with maximal degree at most $\delta$ in which every vertex is within distance $D$ of every edge. Such a graph $G_0$ is called a Moore graph of girth $g$ and degree $\delta$. (If $g = 2D + 1$ then $G_0$ is also called a Moore graph of diameter $D$ and degree $\delta$.)

There are very few Moore graphs. Results of Hoffman and Singleton (1960), Kárteszi (1960), Feit and Higman (1964), Singleton (1966), Bannai and Ito (1973) and Damerell (1973) show that if there is a Moore graph of girth $g \ge 5$ and degree $\delta \ge 3$ then either $g = 5$ and $\delta = 3$, 7 or 57, or else $g = 6$, 8 or 12. For $g = 6$ and 8 there is a Moore graph for each finite projective geometry of order $\delta$ and dimension 2 and 3.

As there are so few graphs attaining the trivial lower bound in Theorem 2.4.1, what about graphs showing that $n(g, \delta)$ is not much larger than the trivial lower bound? Such graphs are not easy to come by either. The following theorem was proved by Erdős and Sachs (1963) without explicitly constructing a graph showing the inequality.


Theorem 2.4.2. If $g \ge 3$ and $\delta \ge 3$ then


F) 
a Fag (6-1) T- 1} if g is odd, 
n(g,0)< 


5-7 16 1)* 7-1} if g is even. 


Note that for large values of g the upper bound given in Theorem 2.4.2 is about 
the square of the trivial lower bound in Theorem 2.4.1. This huge gap was 
narrowed by Margulis (1982) by an explicit construction: a most welcome success 
of constructive algebraic methods. 

Let $p \ge 5$ be a prime and consider $SL_2(\mathbb{Z}_p)$, the multiplicative group of unimodular 2 by 2 matrices with entries from the field $\mathbb{Z}_p$. Let $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$ be elements of $SL_2(\mathbb{Z}_p)$. The Margulis graph $M(4, p)$ is the Cayley graph over $SL_2(\mathbb{Z}_p)$ with respect to the set $\{A, B, A^{-1}, B^{-1}\}$, i.e., $M(4, p)$ has vertex set $SL_2(\mathbb{Z}_p)$ with a matrix $C$ joined to a matrix $D$ iff $C^{-1}D \in \{A, B, A^{-1}, B^{-1}\}$. Margulis proved that the graph $M(4, p)$ has rather large girth.


Theorem 2.4.3. Let $\alpha = 1 + \sqrt{2}$, $k \in \mathbb{N}$ and let $p > 2\alpha^k$ be a prime. Then the graph $M(4, p)$ is a 4-regular graph of order $p(p^2 - 1)$ and girth at least $2k + 1$.
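
For small primes the Margulis graph can be generated and its girth measured directly. The sketch below makes two assumptions that should be read as such: the generators are taken to be the upper and lower unitriangular matrices with off-diagonal entry 2 (the entries are illegible in some printings), and the vertex set is produced by a search from the identity, which reaches the whole group when these matrices generate it. The names `margulis_graph` and `girth_from` are illustrative. Since a Cayley graph is vertex-transitive, a breadth-first search from one vertex suffices to find the girth.

```python
from collections import deque

def margulis_graph(p):
    """Adjacency dict of the Cayley graph of SL_2(Z_p) with respect to
    {A, B, A^-1, B^-1}; matrices are stored as tuples (a, b, c, d)."""
    def mul(X, Y):
        a, b, c, d = X
        e, f, g, h = Y
        return ((a * e + b * g) % p, (a * f + b * h) % p,
                (c * e + d * g) % p, (c * f + d * h) % p)

    gens = [(1, 2, 0, 1), (1, 0, 2, 1), (1, p - 2, 0, 1), (1, 0, p - 2, 1)]
    identity = (1, 0, 0, 1)
    adj = {}
    seen = {identity}
    stack = [identity]
    while stack:
        C = stack.pop()
        adj[C] = [mul(C, S) for S in gens]  # D is joined to C iff C^-1 D is a generator
        for D in adj[C]:
            if D not in seen:
                seen.add(D)
                stack.append(D)
    return adj

def girth_from(adj, root):
    """Length of a shortest cycle found by breadth-first search from root;
    this equals the girth when the graph is vertex-transitive."""
    dist = {root: 0}
    parent = {root: None}
    best = float("inf")
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
            elif parent[u] != v:
                best = min(best, dist[u] + dist[v] + 1)
    return best

p = 13  # 13 > 2 * (1 + sqrt(2))**2, so the theorem applies with k = 2
graph = margulis_graph(p)
print(len(graph) == p * (p * p - 1), girth_from(graph, (1, 0, 0, 1)))
```

With $p = 13$ the theorem guarantees girth at least 5; the exact value printed is whatever the search finds and is not claimed here.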


Note that for large $n = p(p^2 - 1)$ the Margulis graph $M(4, p)$ has girth about $(2/(3 \log \alpha)) \log n = \log_b n$, where $b = \alpha^{3/2} = 3.751\ldots$, while Theorem 2.4.2
guarantees only a graph of girth about log,n. 

Margulis (1982) used the same method to construct regular graphs of large girth 
and arbitrary even degrees. Following Margulis, Imrich (1984) constructed Cayley 
graphs of factor groups of some subgroups of the modular group to improve the 
bound in Theorem 2.4.3. 


Theorem 2.4.4. For every $r > 2$ one can effectively construct infinitely many $r$-regular Cayley graphs with $n$ vertices and girth at least

\[ 0.4801\ldots (\log n)/\log(r - 1) - 2 . \]

Furthermore, for $r = 3$ one can have girth at least
\[ 0.9601\ldots (\log n)/\log 2 - 5 . \]


It would be of interest to find other explicit constructions for graphs of large 
girth and large minimal degree. 


2.5. The set of cycles in graphs of given minimal degree 


A graph $G$ of minimal degree $\delta \ge 2$ contains at least $\delta - 1$ cycles of different lengths, i.e., $|C(G)| \ge \delta - 1$. Indeed, let $x_1x_2 \cdots x_l$ be a longest path in $G$ and let $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$, where $2 = i_1 < i_2 < \cdots < i_k$, be the neighbours of $x_1$. Then $k \ge \delta$ and for every $j$, $1 < j \le k$, the graph has a cycle of length $i_j$, namely $x_1x_2 \cdots x_{i_j}$. The graphs $K_{\delta+1}$ and $K(\delta, \delta)$ show that this trivial bound on $|C(G)|$ in terms of $\delta$ cannot be improved in general.

However, if $\delta(G) = \delta \ge 3$ and $G$ has large girth then it is easily seen that $|C(G)|$ has to be (much) larger than $\delta - 1$. This suggests that if short cycle lengths are taken with large weights and long cycle lengths are taken with small weights, then the total weight of cycle lengths has to be large if the minimal degree is large. Erdős and Hajnal (see Erdős 1975) proposed taking a cycle of length $r$ with weight $1/r$. For a graph $G$, let

\[ S(G) = S(C(G)) = \sum \{1/r : r \in C(G)\} . \]
How large is then
\[ f(k) = \inf\{S(G) : \delta(G) \ge k\}? \]

The graph $K_{k,k}$, $k \ge 2$, has minimal degree $k$ and its set of cycle lengths is $\{4, 6, \ldots, 2k\}$ so, as $k \to \infty$,

\[ f(k) \le S(K_{k,k}) = \frac{1}{2} \sum_{l=2}^{k} \frac{1}{l} = \left(\tfrac{1}{2} + o(1)\right) \log k . \]

Erdős and Hajnal conjectured that $f(k)$ is of order $\log k$. To appreciate the difficulty in proving this conjecture, note that it seems to be difficult to prove that $f(k) \to \infty$ as $k \to \infty$.

This conjecture was proved by Gyárfás et al. (1984).


Theorem 2.5.1. There are positive constants c and $\varepsilon$ such that if $\delta(G) \ge c$ then
$S(G) \ge \varepsilon \log \delta(G)$.


The ingenious and beautiful proof makes good use of the so-called (k, a)-trees.
Let T be a rooted tree of height h and levels $L_0, L_1, \ldots, L_h$, where the ith level
$L_i$ of T is the set of vertices at distance i from the root. This tree T is said to be a
(k, a)-tree if for i < h every vertex x at level i has at most k neighbours at level [...]


An impo[...] assertion.

Theorem 2.5.2. [...] bipartite grap[...] 4m/7 even [...]

As an imme[...] another conjec[...]

Theorem 2.5.3. [...] has positive upp[...]

Proof. By a sim[...] finite subgraph [...] maximal size has [...]

Theorem 2.5.1 ca[...] degree of G. For [...]

h(a) = in[...]
Since every graph G satisfying $e(G) \ge a|G|$, i.e., having average degree at least
2a, has a subgraph of minimal degree at least a, Theorem 2.5.1 implies the
following result.


Theorem 2.5.1'. There are positive constants c and $\varepsilon$ such that if $a \ge c$ then
$h(a) \ge \varepsilon \log a$.


This result gives no information about h(a) for small values of a. Trivially,
h(a) = 0 for a < 1 but a priori it is not clear that there is no $a_0 > 1$ such that
h(a) = 0 for $a < a_0$. Gyárfás et al. (1985) proved that, in fact, h(a) > 0 for every
a > 1.


Theorem 2.5.4. If k is sufficiently large then $h(1 + 1/k) \ge (300 k \log k)^{-1}$.


3. Saturated graphs 


A property P of graphs is monotone increasing if whenever a graph G has P, so
does every graph obtained from G by the addition of some edges. Clearly, if $\mathcal{F}_n$ is
the set of minimal graphs of order n having property P then P is determined by
the sequence $(\mathcal{F}_n)_{n=1}^{\infty}$, and conversely, if $\mathcal{F}_n$ is a family of graphs of order n then
$(\mathcal{F}_n)_{n=1}^{\infty}$ determines a monotone increasing property P: a graph G of order n has
P if and only if it contains at least one element of $\mathcal{F}_n$. Using the terminology of
the previous section, a graph of order n fails to have property P if it contains no
forbidden subgraph, i.e., no element of $\mathcal{F}_n$.

A graph G is P-saturated or saturated for P if G does not have P but any graph
obtained from G by the addition of an edge has P. In the first two sections we
studied P-saturated graphs with maximal number of edges. Here we shall turn to
the lower bound: at least how many edges does a P-saturated graph of order n
have? Usually one writes sat(n; P) for this minimum, i.e., sat(n; P) = min{e(G):
|G| = n and G is P-saturated}. Also, the set of extremal graphs is SAT(n; P) =
{G: |G| = n, e(G) = sat(n; P) and G is P-saturated}. If P is given by the sequence
$(\mathcal{F}_n)_{n=1}^{\infty}$ then we may write $\mathrm{sat}(n; \mathcal{F}_n)$ and $\mathrm{SAT}(n; \mathcal{F}_n)$ for sat(n; P) and
SAT(n; P). Also, if $\mathcal{F}_n = \{F_1, \ldots, F_k\}$ then we may write $\mathrm{sat}(n; F_1, \ldots, F_k)$ and
$\mathrm{SAT}(n; F_1, \ldots, F_k)$.


3.1. Complete graphs 


Erdős et al. (1964) proved the following analogue of Turán's theorem for
saturated graphs.


Theorem 3.1.1. If $2 \le r \le n$ then $\mathrm{sat}(n; K_r) = (r-2)(n-1) - \binom{r-2}{2} = (r-2)n - \binom{r-1}{2}$
and $\mathrm{SAT}(n; K_r) = \{K_{r-2} + \overline{K}_{n-r+2}\}$, i.e., the edge set of the unique extremal
graph for $\mathrm{sat}(n; K_r)$ is the set of all edges incident with a fixed set of r − 2 vertices.


Proof. Call a graph $K_r$-saturated if it is saturated for the property of containing a
$K_r$ subgraph. Furthermore, writing $k_r(G)$ for the number of $K_r$ subgraphs of G,
we call G strongly $K_r$-saturated if $k_r(G) < k_r(G^+)$ whenever $G^+$ is obtained from
G by the addition of an edge. Clearly every $K_r$-saturated graph is strongly
$K_r$-saturated but a strongly $K_r$-saturated graph need not be $K_r$-saturated because
it may contain a $K_r$-subgraph. Note that if G is strongly $K_r$-saturated then so is
every graph obtained from G by the addition of some edges.

The graph $G_n = K_{r-2} + \overline{K}_{n-r+2}$ has $(r-2)n - \binom{r-1}{2}$ edges and it is $K_r$-saturated.
Instead of the claim of the theorem, we shall prove the stronger assertion
that every strongly $K_r$-saturated graph of order n has at least $(r-2)n - \binom{r-1}{2}$
edges, and $G_n$ is the only strongly $K_r$-saturated graph with n vertices and
$(r-2)n - \binom{r-1}{2}$ edges. In fact, as the property of being strongly $K_r$-saturated is a
monotone increasing property, it suffices to prove the latter assertion. We shall do
this by induction on n + r.

The assertion is trivial if r = 2 or n = r. Assume then that $3 \le r < n$ and the
result is true for smaller values of n + r. Let G be a strongly $K_r$-saturated graph
with n vertices and $(r-2)n - \binom{r-1}{2}$ edges. Let $x_1$ and $x_2$ be non-adjacent vertices
of G. As G is strongly $K_r$-saturated, there are vertices $x_3, \ldots, x_r$ such that in the
set $\{x_1, x_2, \ldots, x_r\}$ any two vertices are joined to each other, with the exception
of $x_1$ and $x_2$. Let $H = G/\{x_1, x_2\}$ be the graph obtained from G by identifying $x_1$
and $x_2$. Thus $V(H) = \{x_2, x_3, \ldots, x_n\}$, for $3 \le i < j \le n$ two vertices $x_i$, $x_j$ are
joined in H if and only if they are joined in G, and $x_2$ is joined to $x_i$ in H if and
only if at least one of $x_1$ and $x_2$ is joined to $x_i$ in G. Clearly,

$e(G) \ge e(H) + r - 2.$

Also, as G is strongly $K_r$-saturated, so is H. Hence, by the induction hypothesis,

$e(H) \ge (r-2)(n-1) - \binom{r-1}{2},$

with equality if and only if $H \cong G_{n-1}$. Therefore

$e(G) \ge (r-2)n - \binom{r-1}{2}$

and if equality holds then $H \cong G_{n-1}$ and for i = 1 and 2 the vertices $x_3, \ldots, x_r$ are
the only neighbours of $x_i$ in G. It is easily checked that this implies that $G \cong G_n$,
as claimed. □
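For small parameters the extremal graph of Theorem 3.1.1 can be checked mechanically. The sketch below is only an illustration under our own conventions: graphs are sets of two-element frozensets on the vertices 0, ..., n − 1, and the helper names (`has_clique`, `is_clique_saturated`, `extremal_graph`) are hypothetical. It verifies the edge count and the saturation property, not the uniqueness or minimality claims.

```python
from itertools import combinations
from math import comb


def has_clique(edges, n, r):
    """True if the graph on vertices 0..n-1 with the given edge set contains K_r."""
    return any(all(frozenset(p) in edges for p in combinations(c, 2))
               for c in combinations(range(n), r))


def is_clique_saturated(edges, n, r):
    """K_r-saturated: no K_r, but adding any missing edge creates one."""
    if has_clique(edges, n, r):
        return False
    for p in combinations(range(n), 2):
        e = frozenset(p)
        if e not in edges and not has_clique(edges | {e}, n, r):
            return False
    return True


def extremal_graph(n, r):
    """K_{r-2} joined to n-r+2 independent vertices: all edges meeting {0, ..., r-3}."""
    return {frozenset(p) for p in combinations(range(n), 2) if min(p) < r - 2}


n, r = 7, 4
g = extremal_graph(n, r)
print(len(g), (r - 2) * n - comb(r - 1, 2), is_clique_saturated(g, n, r))
```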

Let us give another proof of the fact that every strongly $K_r$-saturated graph of
order n has at least $(r-2)n - \binom{r-1}{2}$ edges and so, in particular,

$\mathrm{sat}(n; K_r) \ge (r-2)n - \binom{r-1}{2} = \binom{n}{2} - \binom{n-r+2}{2}.$


Let G be a strongly $K_r$-saturated graph with n vertices. Let $A_1, A_2, \ldots, A_l$ be
the (unordered) pairs of vertices not joined to each other. We have to prove that
$l \le \binom{n-r+2}{2}$. For each set $A_i$ there is an r-set $C_i \subset V(G)$ such that $A_i \subset C_i$ and the
only two vertices of $C_i$ not joined to each other are the vertices of $A_i$. Set
$B_i = V(G) - C_i$.

Note that $|A_i| = 2$, $|B_i| = n - r$ and $A_i \cap B_i = \emptyset$. Furthermore, if $i \ne j$ then
$A_i \cap B_j \ne \emptyset$. Indeed, if we had $A_i \cap B_j = \emptyset$ then the set $C_j = V(G) - B_j$ would
contain at least two pairs of non-adjacent vertices, namely $A_i$ and $A_j$. Hence
$A_i \cap B_j = \emptyset$ if and only if i = j. Thus the required inequality is an immediate
consequence of the following theorem of Bollobás (1965).


Theorem 3.1.2. For two non-negative integers a and b write $w(a, b) = \binom{a+b}{a}^{-1}$. Let
$\{(A_i, B_i) : i \in I\}$ be a finite collection of finite sets such that $A_i \cap B_j = \emptyset$ if and only
if i = j. For $i \in I$ set $a_i = |A_i|$ and $b_i = |B_i|$. Then

$\sum_{i \in I} w(a_i, b_i) \le 1$

with equality if and only if there is a set Y and non-negative integers a and b, such
that |Y| = a + b and $\{(A_i, B_i) : i \in I\}$ is the collection of all ordered pairs of disjoint
subsets of Y with $|A_i| = a$ and $|B_i| = b$ (and so $B_i = Y - A_i$).
In particular, if $a_i = a$ and $b_i = b$ for all $i \in I$ then $|I| \le \binom{a+b}{a}$. If $a_i = 2$ and
$b_i = n - r$ for all $i \in I$ then $|I| \le \binom{n-r+2}{2}$.

Proof. We shall prove the inequality; the case of equality requires a little more
work.

We may assume that the sets $A_i$, $B_i$ are subsets of [n]. Call a permutation
$\pi = x_1, x_2, \ldots, x_n$ compatible with a set-pair $(A_i, B_i)$ if in $\pi$ every element of $A_i$
precedes every element of $B_i$. Let N be the number of compatible pairs
$(\pi, (A_i, B_i))$. Clearly each set-pair $(A_i, B_i)$ is compatible with

$\binom{n}{a_i + b_i} a_i!\, b_i!\, (n - a_i - b_i)! = n!\, w(a_i, b_i)$

permutations $\pi$, so

$N = n! \sum_{i \in I} w(a_i, b_i).$

On the other hand, no permutation $\pi$ is compatible with two set-pairs, say
$(A_i, B_i)$ and $(A_j, B_j)$. Indeed, otherwise we may assume that $\max\{k : x_k \in A_i\} \le
\max\{k : x_k \in A_j\}$. Then $\max\{k : x_k \in A_i\} \le \max\{k : x_k \in A_j\} < \min\{k : x_k \in B_j\}$ so
$A_i \cap B_j = \emptyset$, contradicting our assumption. Hence $N \le n!$, so

$N = \sum_{i \in I} w(a_i, b_i)\, n! \le n!,$

implying the required inequality. □
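The double counting in this proof can be replayed on a small ground set. The following sketch is an illustration only; the names `compatible` and `check_set_pairs` are hypothetical and all sets are assumed non-empty. It checks the hypothesis of Theorem 3.1.2, confirms that no permutation is compatible with two set-pairs, and returns the number N of compatible pairs.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial


def compatible(perm, A, B):
    """True if every element of A precedes every element of B in perm."""
    pos = {x: i for i, x in enumerate(perm)}
    return max(pos[x] for x in A) < min(pos[y] for y in B)


def check_set_pairs(pairs, n):
    """Return N, the number of compatible (permutation, set-pair) incidences on {0..n-1}."""
    # Hypothesis of the theorem: A_i and B_j are disjoint exactly when i == j.
    for i, (A, _) in enumerate(pairs):
        for j, (_, B) in enumerate(pairs):
            assert (not (A & B)) == (i == j)
    N = 0
    for perm in permutations(range(n)):
        hits = sum(compatible(perm, A, B) for A, B in pairs)
        assert hits <= 1          # no permutation is compatible with two set-pairs
        N += hits
    return N


# Equality case: all pairs (A, Y - A) with |A| = 2 and Y = {0, ..., 4}.
Y = set(range(5))
pairs = [(set(A), Y - set(A)) for A in combinations(range(5), 2)]
weight = sum(Fraction(1, comb(len(A) + len(B), len(A))) for A, B in pairs)
print(check_set_pairs(pairs, 5), factorial(5) * weight)
```

Both printed numbers agree with $N = n! \sum_i w(a_i, b_i)$, and here the sum of the weights is exactly 1.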


In fact, Theorem 3.1.2 is an extension of the LYM inequality of Lubell (1966),
Yamamoto (1954) and Meshalkin (1963), which, in turn, is an extension of
Sperner's (1928) lemma, and the proof given above is just a variant of Lubell's
proof of the LYM inequality. To be precise, the LYM inequality is simply the case
$B_i = X - A_i$ of Theorem 3.1.2 where X is the ground set.

The original reason for proving Theorem 3.1.2 was to extend Theorem 3.1.1 to
hypergraphs: with the appropriate definitions, every k-uniform hypergraph of
order n which is saturated for a complete graph with r vertices has at least
$\binom{n}{k} - \binom{n-r+k}{k}$ hyperedges.

The proof of Theorem 3.1.2 can be adapted to give us the bipartite version of
Theorem 3.1.1, first proved by Bollobás (1967a,b) and Wessel (1966, 1967). An m
by n bipartite graph with classes $V_1$ and $V_2$ is strongly saturated for K(s, t) if the
addition of any edge joining $V_1$ to $V_2$ creates at least one new complete bipartite
subgraph with s vertices in $V_1$ and t vertices in $V_2$.


Theorem 3.1.3. Let $2 \le s \le m$ and $2 \le t \le n$. An m by n bipartite graph which is
strongly saturated for K(s, t) has at least mn − (m − s + 1)(n − t + 1) edges. There
is only one extremal graph, the m by n bipartite graph containing all edges joining
the two classes except those that join a fixed set of m − s + 1 vertices in the first class
to a fixed set of n − t + 1 vertices in the second class.


Duffus and Hanson (1986) studied refinements of the problem of determining
$\mathrm{sat}(n; K_r)$. Let $\mathrm{sat}(n; K_r, \delta)$ be the minimal number of edges in a $K_r$-saturated
graph with n vertices and minimal degree at least $\delta$.


Theorem 3.1.4. If $n \ge 5$ then $\mathrm{sat}(n; K_3, 2) = 2n - 5$ and if $n \ge 10$ then
$\mathrm{sat}(n; K_3, 3) = 3n - 15$.


It is easily seen that for $\delta = 2, 3$ the value of $\mathrm{sat}(n; K_3, \delta)$ is at most as large as
claimed. Given a graph H and a vertex x of H, construct a graph G from H by
adding to H a vertex and joining it to the neighbours of x. This graph G is said to
have been obtained from H by duplicating the vertex x. Note that if H is
$K_3$-saturated then so is G. As the 5-cycle $C_5$ and the Petersen graph P are
$K_3$-saturated, so are the graphs with n vertices obtained from $C_5$ and P by
repeated duplications of their vertices; if each duplicated vertex has degree 2 and
3, respectively, these graphs have minimal degrees 2 and 3, and 2n − 5 and 3n − 15 edges.
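The duplication construction is simple to simulate. The sketch below is an illustration with hypothetical helper names (`duplicate`, `is_triangle_saturated`); it assumes vertices are labelled 0, ..., n − 1 and always duplicates vertex 0 of the 5-cycle, whose degree stays 2, which is what keeps the edge count at exactly 2n − 5. It uses the fact that a graph is $K_3$-saturated precisely when it is triangle-free and every two non-adjacent vertices have a common neighbour.

```python
from itertools import combinations


def duplicate(adj, x):
    """Return a copy of adj with one new vertex joined to the neighbours of x."""
    new = {v: set(nb) for v, nb in adj.items()}
    y = len(new)                      # vertices are assumed to be 0..n-1
    new[y] = set(adj[x])
    for v in adj[x]:
        new[v].add(y)
    return new


def is_triangle_saturated(adj):
    """Triangle-free, and every non-adjacent pair has a common neighbour."""
    for u, v in combinations(adj, 2):
        common = adj[u] & adj[v]
        if v in adj[u] and common:          # the edge uv lies in a triangle
            return False
        if v not in adj[u] and not common:  # adding uv would create no triangle
            return False
    return True


cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

g = cycle5
for n in range(6, 11):
    g = duplicate(g, 0)               # vertex 0 keeps degree 2 throughout
    edges = sum(len(nb) for nb in g.values()) // 2
    print(n, edges, 2 * n - 5, is_triangle_saturated(g))
```

Duplicating a vertex whose degree has already grown would add more than two edges, so the choice of which vertex to duplicate matters for the count, though not for saturation.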

Perhaps for every fixed $\delta \ge 1$ one has $\mathrm{sat}(n; K_3, \delta) = \delta n - O(1)$.


3.2. General families 


Let us turn to the problem of determining or estimating $\mathrm{sat}(n; \mathcal{F})$ for a general
family $\mathcal{F}$ of graphs. We know that if no member of $\mathcal{F}$ is bipartite then $\mathrm{ex}(n; \mathcal{F}) \ge
\lfloor n^2/4 \rfloor$, i.e., there are (maximal) graphs of order n not containing any forbidden
graphs which have at least $\lfloor n^2/4 \rfloor$ edges. On the other hand, as the following easy
estimate shows, $\mathrm{sat}(n; \mathcal{F}) = O(n)$ for every fixed finite family $\mathcal{F}$.


Theorem 3.2.1. Let $\mathcal{F}$ be a (non-empty) finite family of non-empty graphs and let
$r = \max\{|F| : F \in \mathcal{F}\}$. Then for $n \ge r$ we have

$\mathrm{sat}(n; \mathcal{F}) \le (r-2)n - \binom{r-1}{2}.$


Proof. Let us apply induction on r. For r = 2 the assertion is trivial because
$K_2 \in \mathcal{F}$ so the empty graph $\overline{K}_n$ is $\mathcal{F}$-saturated. Suppose that $r \ge 3$ and the result
holds for smaller values of r. If $\mathcal{F}$ contains a star $K_{1,s}$, $s \le r - 1$, then a graph
containing no member of $\mathcal{F}$ must have maximal degree at most s − 1 so

$\mathrm{sat}(n; \mathcal{F}) \le \frac{s-1}{2}\, n \le (r-2)n - \binom{r-1}{2}.$

Suppose then that no member of $\mathcal{F}$ is a star. Set $\mathcal{F}' = \{F - \{x\} : F \in \mathcal{F},
x \in V(F)\}$. Then $\mathcal{F}'$ is a finite family of non-empty graphs, each with at most r − 1
vertices, so by the induction hypothesis,

$\mathrm{sat}(n-1; \mathcal{F}') \le (r-3)(n-1) - \binom{r-2}{2}.$

Let H be an extremal graph for $\mathrm{sat}(n-1; \mathcal{F}')$ and let G be obtained from H by
adding to it a vertex x and joining x to all n − 1 vertices in H. It is trivial that G is
$\mathcal{F}$-saturated, so

$\mathrm{sat}(n; \mathcal{F}) \le e(G) = \mathrm{sat}(n-1; \mathcal{F}') + n - 1 \le (r-2)n - \binom{r-1}{2}.$ □

Note that for $\mathcal{F} = \{K_r\}$ the simple inequality above is, in fact, an equality.
Kászonyi and Tuza (1986) proved the following sharper upper bound for
$\mathrm{sat}(n; \mathcal{F})$.


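Theorem 3.2.2. Let $\mathcal{F}$ be a family of non-empty graphs. Set

$u = \min\{|U| : F \in \mathcal{F},\ U \subset V(F),\ F - U \text{ is a star}\}$
and

$s = \min\{e(F - U) : F \in \mathcal{F},\ U \subset V(F),\ F - U \text{ is a star and } |U| = u\}.$
Furthermore, let p be the minimal number of vertices in a graph $F \in \mathcal{F}$ for which
the minimum s is attained. If $n \ge p$ then

$\mathrm{sat}(n; \mathcal{F}) \le \left(u + \frac{s-1}{2}\right) n - \frac{u(s+u)}{2}.$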


Proof. We proceed as in the proof of Theorem 3.2.1 but this time we apply induction
on u. It is again trivial to start the induction: if u = 0 then $K_{1,s} \cup \overline{K}_{p-s-1} \in \mathcal{F}$, i.e.,
$\mathcal{F}$ contains the union of a star with s edges and p − s − 1 isolated vertices. Hence,
if an $\mathcal{F}$-saturated graph G has $n \ge p$ vertices then its maximal degree is at most
s − 1 and so $e(G) \le (s-1)n/2$. The induction step is as before: the family $\mathcal{F}'$ has
parameters u − 1, s and p − 1 instead of u, s and p, so

$\mathrm{sat}(n; \mathcal{F}) \le n - 1 + \left(u - 1 + \frac{s-1}{2}\right)(n-1) - \frac{(u-1)(s+u-1)}{2}$

$= \left(u + \frac{s-1}{2}\right) n - \frac{u(s+u)}{2}.$ □
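As a sanity check on the bound of Theorem 3.2.2, one can evaluate it in the two cases discussed in the text: for $\mathcal{F} = \{K_r\}$ one has u = r − 2 and s = 1, and for $\mathcal{F} = \{C_4\}$ one has u = 1 and s = 2. The sketch below (the function name `kaszonyi_tuza_bound` is ours) uses exact rational arithmetic; it only evaluates the formula and proves nothing about saturation numbers.

```python
from fractions import Fraction
from math import comb


def kaszonyi_tuza_bound(n, u, s):
    """The upper bound (u + (s-1)/2) n - u (s+u)/2 of Theorem 3.2.2."""
    return (u + Fraction(s - 1, 2)) * n - Fraction(u * (s + u), 2)


n = 10
for r in range(3, 7):
    # For K_r the bound coincides with (r-2)n - C(r-1, 2) from Theorem 3.1.1.
    print(r, kaszonyi_tuza_bound(n, r - 2, 1), (r - 2) * n - comb(r - 1, 2))

# For C_4 the bound is 3(n-1)/2.
print(kaszonyi_tuza_bound(n, 1, 2), Fraction(3 * (n - 1), 2))
```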


In the proof above, the star $K_{1,s}$ played a major role. In fact, as pointed out by
Kászonyi and Tuza (1986), it is very easy to find the exact value of $\mathrm{sat}(n; K_{1,s})$.
Indeed, if G is $K_{1,s}$-saturated, i.e., if G is a maximal graph with maximal degree
at most s − 1, then any two vertices of degree less than s − 1 are joined. This remark
and simple calculations imply the exact value of $\mathrm{sat}(n; K_{1,s})$.


Theorem 3.2.3. If $s + 1 \le n \le s + s/2$ then

$\mathrm{sat}(n; K_{1,s}) = \binom{s}{2} + \binom{n-s}{2}$

and if $n \ge s + s/2$ then

$\mathrm{sat}(n; K_{1,s}) = \lceil (s-1)n/2 - s^2/8 \rceil.$

Note that in Theorem 3.2.2 one also has equality for $\mathcal{F} = \{K_r\}$ since then
u = r − 2 and s = 1. Furthermore, if $\mathcal{F} = \{C_4\}$, i.e., if the only forbidden graph is
the 4-cycle, then u = 1 and s = 2 so, by Theorem 3.2.2, $\mathrm{sat}(n; C_4) \le 3(n-1)/2$.
This bound is also close to being best possible. To obtain a slightly better upper
bound, given $n \ge 5$, take $t = \lfloor (n-3)/2 \rfloor$ triangles sharing a vertex, and join
p = n − 2t − 1 new vertices to p vertices of one of these triangles, using independent
edges. The obtained graph is $C_4$-saturated and has n vertices and
$3t + p = \lfloor (3n-5)/2 \rfloor$ edges. As proved by Ollmann (1972), this is the best one can
do.
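The construction just described can be built and tested directly for small n. In the sketch below, which is only an illustration, vertex 0 is the vertex shared by the triangles, and we make one specific choice that the text leaves open: the p pendant edges are attached to vertices 1 and 2 of the first triangle, and to vertex 0 as well when p = 3. All function names are hypothetical. A graph contains a 4-cycle exactly when some two vertices have at least two common neighbours, which is what `has_c4` tests.

```python
from itertools import combinations


def ollmann_graph(n):
    """t triangles through vertex 0, plus p pendant edges at one of the triangles."""
    t = (n - 3) // 2
    adj = {v: set() for v in range(n)}

    def join(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(t):                       # triangle {0, 2i+1, 2i+2}
        a, b = 2 * i + 1, 2 * i + 2
        join(0, a)
        join(0, b)
        join(a, b)
    # p = n - 2t - 1 is 2 or 3; attach the new vertices to 1, 2 (and 0 if p == 3).
    for new, old in zip(range(2 * t + 1, n), (1, 2, 0)):
        join(new, old)
    return adj


def has_c4(adj):
    return any(len(adj[u] & adj[v]) >= 2 for u, v in combinations(adj, 2))


def is_c4_saturated(adj):
    """No 4-cycle, but adding any missing edge creates one."""
    if has_c4(adj):
        return False
    for u, v in combinations(adj, 2):
        if v not in adj[u]:
            bigger = {w: set(nb) for w, nb in adj.items()}
            bigger[u].add(v)
            bigger[v].add(u)
            if not has_c4(bigger):
                return False
    return True


for n in range(5, 11):
    g = ollmann_graph(n)
    edges = sum(len(nb) for nb in g.values()) // 2
    print(n, edges, (3 * n - 5) // 2, is_c4_saturated(g))
```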


Theorem 3.2.4. If $n \ge 5$ then $\mathrm{sat}(n; C_4) = \lfloor (3n-5)/2 \rfloor$.


In conclusion, it is worth remarking that, as noted by Kászonyi and Tuza
(1986), the function $\mathrm{sat}(n; \mathcal{F})$ lacks the expected regularity properties. Namely, if
$\mathcal{F} \subset \mathcal{F}'$ and $F' \subset F$ then we need not have any of the relations $\mathrm{sat}(n; \mathcal{F}') \le
\mathrm{sat}(n; \mathcal{F})$, $\mathrm{sat}(n; F') \le \mathrm{sat}(n; F)$ and $\mathrm{sat}(n; \mathcal{F}) \le \mathrm{sat}(n+1; \mathcal{F})$. Indeed, let F be a
$K_4$ and a $K_3$ sharing a vertex and let $F' = K_4$. The graph consisting of a $K_5$ and n − 5
edges incident with one of the vertices of the $K_5$ is F-saturated so $\mathrm{sat}(n; F) \le n +
5$. On the other hand, $\mathrm{sat}(n; F') = 2n - 3$ so if n > 8 then $\mathrm{sat}(n; F) < \mathrm{sat}(n; F')$.
Also, with $\mathcal{F}' = \{F', F\}$ and $\mathcal{F} = \{F\}$ we have $\mathrm{sat}(n; \mathcal{F}) \le n + 5 < \mathrm{sat}(n; \mathcal{F}') =
2n - 3$ for n > 8.


3.3. Weakly saturated graphs 


Given a family $\mathcal{F}$ of graphs and a graph G, write $k_{\mathcal{F}}(G)$ for the number of
subgraphs of G that are isomorphic to members of $\mathcal{F}$. If $\mathcal{F} = \{K_r\}$ then, as
before, we write $k_r(G)$ instead of $k_{\mathcal{F}}(G)$. Call a graph G weakly $\mathcal{F}$-saturated if
there is a sequence of graphs $G_0 = G \subset G_1 \subset \cdots \subset G_m$ such that $V(G_i) = V(G)$,
$e(G_i) = e(G_{i-1}) + 1$ and $k_{\mathcal{F}}(G_i) > k_{\mathcal{F}}(G_{i-1})$ for every i, $1 \le i \le m$, and $G_m$ is the
complete graph on V(G). Thus G is weakly $\mathcal{F}$-saturated if we can add to it edges
one by one in such a way that with each edge we strictly increase the number of
$\mathcal{F}$-subgraphs and we stop the process only when our graph is complete. Denote
by $\text{w-sat}(n; \mathcal{F})$ the minimal number of edges in a weakly $\mathcal{F}$-saturated graph with
n vertices.

Since an $\mathcal{F}$-saturated graph is also weakly $\mathcal{F}$-saturated, we have $\text{w-sat}(n; \mathcal{F})
\le \mathrm{sat}(n; \mathcal{F})$. As one would expect, $\text{w-sat}(n; \mathcal{F})$ can be much smaller than
$\mathrm{sat}(n; \mathcal{F})$. For example, let $F = kK_2$, i.e., let F be a set of k independent edges.
Tutte's (1947) 1-factor theorem implies easily that every maximal graph with n
vertices without k independent edges is of the form $K_s + \bigcup_{i=1}^{q} K_{2n_i+1}$, where
$s \ge 0$, $n_i \ge 0$, $q = n + s - 2k + 2$ and $\sum_{i=1}^{q} (2n_i + 1) = n - s$ (see Bollobás 1978a,
Corollary 1.9, p. 58). This implies that if $n \ge (5k-2)/2$ then the maximal number
of edges in an F-saturated graph with n vertices is

$\mathrm{ex}(n; kK_2) = \binom{k-1}{2} + (k-1)(n-k+1),$

as proved by Erdős and Gallai (1961), and if $n \ge 3k - 3$ then the minimal number
of edges in an F-saturated graph with n vertices is

$\mathrm{sat}(n; kK_2) = 3(k-1),$

the minimum being given by k − 1 independent triangles. On the other hand, a
graph with $n \ge 2k + 1$ vertices and k − 1 independent edges is weakly $kK_2$-saturated
so

$\text{w-sat}(n; kK_2) = k - 1.$

Also, it is easily seen that for $n \ge 4$ we have $\text{w-sat}(n; C_4) = n$ while Theorem
3.2.4 tells us that $\mathrm{sat}(n; C_4) = \lfloor (3n-5)/2 \rfloor$ for $n \ge 5$.

It is fascinating that for $F = K_r$ a weakly F-saturated graph must have at least as
many edges as an F-saturated graph: $\text{w-sat}(n; K_r) = \mathrm{sat}(n; K_r) = (r-2)n - \binom{r-1}{2}$.
For very small values of r this is easily seen. For example, a weakly $K_3$-saturated
graph must be connected so $\text{w-sat}(n; K_3) \ge n - 1$ and hence $\text{w-sat}(n; K_3) =
\mathrm{sat}(n; K_3) = n - 1$. However, while for $\mathrm{sat}(n; K_3)$ there is just one extremal graph,
the extremal graphs for $\text{w-sat}(n; K_3)$ are precisely the trees. The large size of the
family of extremal graphs even in this trivial case indicates that it is considerably
harder to determine $\text{w-sat}(n; K_r)$ than $\mathrm{sat}(n; K_r)$. This task was accomplished
almost twenty years after the original results of Erdős et al. (1964) and Bollobás
(1965), by Frankl (1982), Kalai (1984) and Alon (1985).
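For triangles the definition of weak saturation is easy to turn into a procedure: keep adding any missing edge whose endpoints have a common neighbour until nothing changes. Since adding edges never removes common neighbours, the order of the additions does not matter, so a greedy closure decides the question. The sketch below is an illustration with a hypothetical function name; it confirms that a path and a star on six vertices are weakly $K_3$-saturated, while a graph with an isolated vertex is not.

```python
from itertools import combinations


def is_weakly_triangle_saturated(n, edges):
    """Greedily add every missing edge that closes a new triangle; succeed iff K_n is reached."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        for u, v in combinations(range(n), 2):
            if v not in adj[u] and adj[u] & adj[v]:   # uv would create a triangle
                adj[u].add(v)
                adj[v].add(u)
                changed = True
    return all(len(adj[v]) == n - 1 for v in adj)


path = [(i, i + 1) for i in range(5)]        # a tree on 6 vertices
star = [(0, i) for i in range(1, 6)]         # another tree on 6 vertices
print(is_weakly_triangle_saturated(6, path))
print(is_weakly_triangle_saturated(6, star))
print(is_weakly_triangle_saturated(6, path[:-1]))   # vertex 5 is isolated
```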


Theorem 3.3.1. If $2 \le r \le n$ then $\text{w-sat}(n; K_r) = (r-2)n - \binom{r-1}{2}$.


To see what is needed to obtain this result, let us return to the proof of
Theorem 3.1.1 that led us to Theorem 3.1.2. Let G be a weakly $K_r$-saturated
graph with n vertices and let $G_0 = G \subset G_1 \subset \cdots \subset G_l$ be the sequence showing
this. Let $A_i$ be the pair of vertices joined in $G_i$ but not in $G_{i-1}$. Let $C_i$ be the
vertex set of a $K_r$ contained in $G_i$ but not in $G_{i-1}$, and let $B_i = V(G) - C_i$. Then
$|A_i| = 2$ and $|B_i| = n - r$. As $A_i \subset C_i$, we have $A_i \cap B_i = \emptyset$. Furthermore, none of the
pairs $A_{i+1}, A_{i+2}, \ldots, A_l$ can be contained in $C_i$ since the vertices in $A_i$ were the
last two vertices to be joined in $C_i$. Hence for j > i we have $A_j \cap B_i \ne \emptyset$. It turns
out that these two conditions imply that $l \le \binom{n-r+2}{2}$, which is the content of
Theorem 3.3.1. In fact, Frankl (1982), Kalai (1984) and Alon (1985) proved the
appropriate result for all values of $|A_i| = a$ and $|B_i| = b$, which implies the
extension of Theorem 3.3.1 for uniform hypergraphs.


Theorem 3.3.2. Let $(A_1, B_1), (A_2, B_2), \ldots, (A_l, B_l)$ be pairs of finite sets such
that $|A_i| = a$, $|B_i| = b$ and $A_i \cap B_i = \emptyset$ for all i. Suppose furthermore that $A_i \cap B_j \ne
\emptyset$ if i > j. Then $l \le \binom{a+b}{a}$.


The proofs of Theorem 3.3.2 given by Frankl, Kalai and Alon are all rather
similar, very beautiful and very unexpected: they make use of exterior powers of
algebras. With hindsight: [...]
are clearly dimensions of [...]
Lovász (1977) had used exte[...]
Theorem 3.3.2 is tailor-made for a[...]
chapter 24 by Frankl.
The following extension of The[...]
Steckin (1982) and proved by Füredi (198[...]


Theorem 3.3.3. Let $(A_1, B_1), \ldots, (A_l, B_l)$ [...]
$|B_i| \le b$ and $|A_i \cap B_i| \le c$ for all i. Suppose [...]
Then $l \le \binom{a+b-2c}{a-c}$.


3.4. Hamilton cycles 


So far we have considered only the function $\mathrm{sat}(n; \mathcal{F})$, i.e., we have ca[...]
only the case when our forbidden family $\mathcal{F}$ does not depend on n. This s[...]
family $\mathcal{F}_n$ depending on n, namely $\mathcal{F}_n = \{C_n\}$. A graph with n vertices is
$C_n$-saturated if it is a maximal non-Hamiltonian graph, i.e., if it is non-Hamiltonian
but the addition of any edge creates a Hamilton cycle. The following results
were proved by Bondy (1972).


Theorem 3.4.1. Let G be a maximal non-Hamiltonian graph of order $n \ge 7$ with m
vertices of degree 2. Then G has at least (3n + m)/2 edges.


Corollary 3.4.2. If $n \ge 7$ then $\mathrm{sat}(n; C_n) \ge \lceil 3n/2 \rceil$.


When studying $\mathrm{sat}(n; \mathcal{F})$ for a fixed family $\mathcal{F}$, it is usually easy to give an upper
bound for $\mathrm{sat}(n; \mathcal{F})$ and the difficulty lies in proving that the function is at least as
large as claimed. Rather curiously, the situation is quite different for $\mathrm{sat}(n; C_n)$:
the results above are fairly simple, and, as it happens, the lower bound is the
actual value of the function, but it is difficult to construct examples showing that
$\mathrm{sat}(n; C_n)$ is indeed $\lceil 3n/2 \rceil$ if n is not too small.

If n is even then $\mathrm{sat}(n; C_n) = \lceil 3n/2 \rceil = 3n/2$ if there is a cubic graph saturated
for Hamilton cycles. Since a Hamiltonian cubic graph is 3-edge-colourable, we
need a $C_n$-saturated 4-edge-chromatic cubic graph of order n. In fact, 4-edge-chromatic
cubic graphs are not easy to come by: Isaacs (1975) was the first to
construct an infinite family of such graphs. By making use of this family, Clark
and Entringer (1983) and Clark et al. (1988) proved that $\mathrm{sat}(n; C_n) = \lceil 3n/2 \rceil$ for
most values of n.

In view of the difficulties with $\mathrm{sat}(n; C_n)$, it is unlikely that one could determine
even $\mathrm{sat}(n; C_k)$ for every pair (k, n). However, getting good bounds on this
function may not be hopeless.

Corollary 4.1.4. Let $G_1$ and $G_2$ be graphs with n vertices such that if one has
maximal degree n − 1 then the other has an isolated vertex. If $e(G_1) + e(G_2) \le 2n -
3$ then there is a packing of $G_1$ and $G_2$.


Proof. If the maximal degrees are at most n − 2 then the result follows from
Theorem 4.1.3. Otherwise we may assume that $G_1$ has a vertex x of degree n − 1
and $G_2$ has an isolated vertex y. Placing x on y, there remains to pack $G_1' = G_1 -
x$, with $e(G_1) - n + 1$ edges, and $G_2' = G_2 - y$, with $e(G_2)$ edges. Since $e(G_1') +
e(G_2') \le n - 2$, it is trivial that there is such a packing, for example, by Theorem
4.44. □


This corollary implies immediately the result of Sauer and Spencer mentioned
above: if $e(G_1) + e(G_2) \le 3(n-1)/2$ then there is a packing of $G_1$ and $G_2$. Teo
(see Yap 1988) extended Theorem 4.1.3 to graphs having a total of 2n − 2 edges.
As expected, the number of exceptional pairs increases substantially. For
simplicity, we state the result only for $n \ge 13$.


Theorem 4.1.5. Let $G_1$ and $G_2$ be graphs with $n \ge 13$ vertices each such that
$\Delta(G_i) \le n - 2$ and $e(G_1) + e(G_2) \le 2n - 2$. For i = 1, 2, 3, let $H_i$ be the disjoint
union of a star with n − i − 1 edges and a $K_i$: $H_i = K_{1,n-i-1} \cup K_i$, let $H_4$ be a
disjoint union of cycles, i.e., a 2-regular graph of order n, and for n = 3k let $H_5$ be
the disjoint union of k triangles: $H_5 = kK_3 = \overline{T}_k(n)$. If $\{G_1, G_2\}$ is not one of the
pairs $\{H_1, H_4\}$, $\{H_2, H_5\}$ and $\{H_3, H_3\}$ then there is a packing of $G_1$ and $G_2$.


If one of the graphs to be packed is a tree then one can do considerably better.
Extending various earlier results, Slater et al. (1985) proved that if T is a tree of
order n, G is a graph of order n and size n − 1, and neither T nor G is a star then
there is a packing of T and G. Furthermore, by making use of Theorem 4.1.3 and
this result, Teo and Yap (1987) characterized the graphs of order n and size n
which can be packed into the complement of any tree of order n.

It is very likely that, in turn, Theorem 4.1.5 can be extended to graphs with a 
total of 2n — 1 edges at the expense of a further increase in the set of exceptional 
pairs but the proof is likely to be forbiddingly cumbersome. However, for the case 
when the maximal degree is restricted even more, Eldridge (1976) proved the 
following result. The bound cannot be improved in general. 


Theorem 4.1.6. Let $r \ge 4$ and let $G_1$ and $G_2$ be graphs with $n \ge 9r^{3/2}$ vertices and
maximal degrees at most n − r. If $e(G_1) + e(G_2) \le 2n + r(\sqrt{r} - 2) - \sqrt{r}$ then there
is a packing of $G_1$ and $G_2$.


Rather little is known about packing many graphs with few edges. In particular,
if true, the following conjecture of Bollobás and Eldridge (1978) is unlikely to be
easy to prove.


Conjecture 4.1.7. For every $k \ge 1$ there is an n(k) such that if $n \ge n(k)$ and $G_1$,
$G_2, \ldots, G_k$ are graphs with n vertices such that $e(G_i) \le n - i$ and $\Delta(G_i) \le n - k$
for every i, i = 1, 2, ..., k, then there is a packing of $G_1, G_2, \ldots, G_k$.


4.2. Graphs of small maximal degree 


The results above show that a trivial obstruction to packing is the existence of 
vertices of very large degrees. If the maximal degrees are known to be small then 
the existence of a packing follows from much weaker bounds on the total number 
of edges. Now we shall look for restrictions on the maximal degree only implying
the existence of a packing.

The following simple result was announced by Catlin (1974) and proved 
independently by Sauer and Spencer (1978). 


Theorem 4.2.1. Let $G_1$ and $G_2$ be graphs with n vertices such that $\Delta(G_1)\Delta(G_2) <
n/2$. Then there is a packing of $G_1$ and $G_2$.


Proof. As for $\Delta(G_1)\Delta(G_2) \le 1$ there is nothing to prove, we may assume that
$\Delta(G_1)\Delta(G_2) \ge 2$ and so $n \ge 5$. Choose an identification of the vertex sets $V(G_1)$
and $V(G_2)$ in which $G_1$ and $G_2$ have a minimal number of edges in common.
Suppose $V(G_1) = \{x_1, \ldots, x_n\}$, $V(G_2) = \{y_1, \ldots, y_n\}$ and $x_i$ is identified with $y_i$.

Assume that, contrary to the assertion, $G_1$ and $G_2$ share an edge in this
identification, say $x_1x_2 \in E(G_1)$ and $y_1y_2 \in E(G_2)$. Let L be the set of indices l
such that, for some j, either $x_1x_j \in E(G_1)$ and $y_jy_l \in E(G_2)$ or else $y_1y_j \in E(G_2)$ and $x_jx_l \in
E(G_1)$. Since $x_1x_2 \in E(G_1)$ and $y_1y_2 \in E(G_2)$, we have

$|L| \le (\Delta(G_1) - 1)\Delta(G_2) + (\Delta(G_2) - 1)\Delta(G_1) < n - 2.$

Hence there is a natural number k, $3 \le k \le n$, such that $k \notin L$. If we flip $x_1$ and
$x_k$, i.e., if we identify $x_1$ with $y_k$ and $x_k$ with $y_1$, then the number of edges
common to $G_1$ and $G_2$ decreases, contradicting our assumption.


How far is this result from being best possible? Let $d_1 \le d_2 < n$ be natural
numbers such that $n \le (d_1 + 1)(d_2 + 1) - 2$. Let $G_1$ be a graph such that $d_2$ of its
components are complete graphs of order $d_1 + 1$; similarly, let $G_2$ have $d_1$
components that are complete graphs of order $d_2 + 1$. For example, let $G_1 =
d_2 K_{d_1+1} \cup \overline{K}_{n - d_2(d_1+1)}$ and $G_2 = d_1 K_{d_2+1} \cup \overline{K}_{n - d_1(d_2+1)}$. Note that $\Delta(G_1) = d_1$ and $\Delta(G_2) =
d_2$. Suppose that there is a packing of $G_1$ and $G_2$. Then every $K_{d_1+1}$ component of
$G_1$ has at least one vertex outside the $K_{d_2+1}$ components of $G_2$, since otherwise two
of its vertices would be placed in the same $K_{d_2+1}$. As there are $d_2$
components of the form $K_{d_1+1}$ in $G_1$ but at most $d_2 - 1$ vertices of $G_2$ outside the $K_{d_2+1}$
components, this is impossible. Hence there is no packing of $G_1$ and $G_2$.
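For very small n the existence of a packing can be decided by trying every bijection between the vertex sets. The sketch below is a brute-force illustration only (it is not how one would pack graphs in practice); the helper names are hypothetical and graphs are sets of two-element frozensets on 0, ..., n − 1. It confirms the example above for $d_1 = d_2 = 2$ and n = 7, and exhibits a pair covered by Theorem 4.2.1.

```python
from itertools import combinations, permutations


def disjoint_cliques(n, order, count):
    """Edge set of `count` vertex-disjoint copies of K_order on the vertices 0..n-1."""
    assert order * count <= n
    edges = set()
    for c in range(count):
        block = range(c * order, (c + 1) * order)
        edges |= {frozenset(p) for p in combinations(block, 2)}
    return edges


def has_packing(n, e1, e2):
    """Brute force: is there a bijection placing G1 on G2 with no common edge?"""
    for perm in permutations(range(n)):
        if all(frozenset((perm[u], perm[v])) not in e2 for u, v in map(tuple, e1)):
            return True
    return False


# d1 = d2 = 2 and n = 7 = (d1 + 1)(d2 + 1) - 2: two triangles plus an isolated vertex.
g = disjoint_cliques(7, 3, 2)
print(has_packing(7, g, g))

# Maximal degrees 1 and 3 with 1 * 3 < 7/2, so Theorem 4.2.1 guarantees a packing.
print(has_packing(7, disjoint_cliques(7, 2, 3), disjoint_cliques(7, 4, 1)))
```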

Bollobás, Eldridge and Catlin conjectured (see Bollobás 1978b) that the
example above is worst possible, i.e., n/2 in Theorem 4.2.1 can almost be
replaced by n.


Conjecture 4.2.2. Let $G_1$ and $G_2$ be graphs with n vertices such that $(\Delta(G_1) +
1)(\Delta(G_2) + 1) \le n + 1$. Then there is a packing of $G_1$ and $G_2$.

At the moment we are very far from a proof of the above conjecture. The
following difficult theorem of Hajnal and Szemerédi (1970) provides some
evidence for the truth of the conjecture.


Theorem 4.2.3. Every graph with maximal degree $\Delta$ has a $(\Delta + 1)$-colouring in
which the cardinalities of any two colour classes differ by at most 1.


Note that the Hajnal-Szemerédi theorem implies Conjecture 4.2.2 in the case
when $G_2$ is of the form $\overline{T}_r(n)$; in fact, the theorem is more or less equivalent to
the conjecture in this case. Indeed, if $G_2 = \overline{T}_r(n)$ then $\Delta(G_2) = \lceil n/r \rceil - 1$ so if
$(\Delta(G_1) + 1)(\Delta(G_2) + 1) \le n + 1$ then $\Delta(G_2) + 1 = \lceil n/r \rceil \le (n + 1)/(\Delta(G_1) + 1)$.
Therefore $r \ge \Delta(G_1) + 1$ so Theorem 4.2.3 implies that there is a packing of $G_1$
and $G_2$.

One should emphasize that Theorem 4.2.3 itself is a substantial result; various
special cases of the theorem had been proved earlier by Dirac (1952), Corrádi
and Hajnal (1963), Zelinka (1966), Grünbaum (1968) and Sumner (1969).

Catlin (1977, 1980) proved some special cases of Conjecture 4.2.2, including 
the following result. 


Theorem 4.2.4. There is a function f(n) = O(n?) such that if G, and G, are 
graphs with n vertices such that $\Delta(G_1) \le 2$ and $\Delta(G_2) \le n/3 - f(n)$, then there is a
packing of $G_1$ and $G_2$.


4.3. Packing trees 


Very little is known about the possibility of packing more than two graphs. The 
only exception is the case when all the graphs to be packed are trees. In fact A.
Gyárfás made the following beautiful conjecture (see Gyárfás and Lehel 1978).


Conjecture 4.3.1. Any sequence of trees $T_1, T_2, \ldots, T_n$, with $T_i$ having i vertices,
can be packed into $K_n$.


Note that the total number of edges of $T_1, T_2, \ldots, T_n$ is $\sum_{i=1}^{n} (i-1) = \binom{n}{2}$ so in a
packing claimed by the conjecture every edge of $K_n$ must belong to precisely one
of the trees.

This conjecture, which has come to be known as the tree packing conjecture, is
unlikely to be solved in the affirmative in the near future. At the moment the
truth of the conjecture is known only in some very special cases. Here we shall
give three examples: the first two are due to Gyárfás and Lehel (1978) and the
third to Hobbs (1981). Recall that a star is a tree of the form $K_{1,m}$, i.e., a tree of
diameter 2.


Theorem 4.3.2. Let $T_1, T_2, \ldots, T_n$ be trees with $T_i$ having i vertices, such that
each $T_i$ is a path or a star. Then there is a packing of $T_1, T_2, \ldots, T_n$ into $K_n$.

Theorem 4.3.3. Let $T_1, T_2, \ldots, T_n$ be trees with $T_i$ having i vertices, such that all
but at most two of them are stars. Then there is a packing of $T_1, T_2, \ldots, T_n$ into
$K_n$.

Theorem 4.3.4. Let $T_1, T_2, \ldots, T_n$ be trees of diameter at most 3 such that $T_i$ has i
vertices. Then there is a packing of $T_1, T_2, \ldots, T_n$ into $K_n$.


The first two results were extended by Straight (1979). In particular, extending
Theorem 4.3.3, he proved the existence of a packing if $\Delta(T_i) \ge i - 2$ with at most
two exceptions. (Note that $T_i$ is a star if $\Delta(T_i) = i - 1$.) Furthermore, Straight
(1979) verified the tree packing conjecture for $n \le 7$, and Fishburn (1983) proved
it for $n \le 9$. Theorem 4.3.4 was also considerably extended by Fishburn (1983).
These results indicate that even a disproof of Conjecture 4.3.1 is likely to be
difficult.

Packing a family $(T_i)_2^k$ of trees of arbitrary shapes is fairly easy if k is not too
large. The following easy result of Bollobás (1983) shows that here we can take
$k = \lfloor cn \rfloor$ for some c > 0.


Theorem 4.3.5. Let $(T_i)_2^k$ be a sequence of trees where $k = \lfloor \sqrt{2}\, n/2 \rfloor$ and $T_i$ has i
vertices. Then $T_2, T_3, \ldots, T_k$ can be packed into $K_n$.


In fact this result has very little to do with packing, because under the
conditions the trees can be packed into $K_n$ one after the other: first we pack $T_k$,
then $T_{k-1}$, then $T_{k-2}$, etc.; when we choose a packing of $T_i$ we do not take into
account the trees $T_{i-1}, T_{i-2}, \ldots, T_2$. A packing of $T_i$ exists because the graph
into which $T_i$ is packed has fairly many edges. In fact, the bound $\lfloor \sqrt{2}\, n/2 \rfloor$ could
be replaced by $\lfloor \sqrt{3}\, n/2 \rfloor$ if one could prove the following fascinating conjecture
proposed by Erdős and Sós in 1963. As it happens, this conjecture was one of the
motivations for the conjecture of Gyárfás.


Conjecture 4.3.6. Every graph with n vertices and more than (k − 1)n/2 edges
contains every tree with k edges.


Note that the number of edges is just sufficient to guarantee that the graph
contains a path with k edges and a star with k edges.

Rather than strengthen Theorem 4.3.5, perhaps one could prove the following 
conjecture which is considerably weaker than the tree packing conjecture. 


Conjecture 4.3.7. For every $k \ge 1$ there is an n(k) such that if $n \ge n(k)$ and $T_{n-k}$,
$T_{n-k+1}, \ldots, T_n$ are trees, with $T_i$ having i vertices, then they can be packed into
$K_n$.

4.4. Packing bipartite graphs 


In this section, we shall prove an attractive result of Hajnal and Szegedy about a 
special type of packing of bipartite graphs. 
Let $G_1$ and $G_2$ be n by m bipartite graphs, with bipartitions $(U_1, W_1)$ and $(U_2,
W_2)$. We say that there is a bipartite packing or simply packing of $G_1$ and $G_2$ if the
n by m complete bipartite graph K(n, m), with bipartition (U, W), contains
edge-disjoint subgraphs $H_1$ and $H_2$ such that, for i = 1, 2, the graph $H_i$ is
isomorphic to $G_i$, with $U_i$ corresponding to U. (Note that, unless n = m and $G_1$
and $G_2$ are rather sparse, a bipartite packing is just a packing of $G_1$ and $G_2$ as
bipartite graphs, i.e., into K(n, m). This justifies the abbreviated terminology.)
Equivalently, $G_1$ and $G_2$ have a packing if there are one-to-one maps $f: U_1 \to U_2$
and $g: W_1 \to W_2$ such that if xy is an edge of $G_1$, with $y \in W_1$, then f(x)g(y) is not
an edge of $G_2$. We shall call the pair (f, g) a packing of $G_1$ and $G_2$.

In the proof of the theorem below, we shall need the following simple
consequence of Hall's theorem (see chapter 3) about matchings in $n$ by $n$
bipartite graphs.


Lemma 4.4.1. If the minimal degree of $G$ is at least $n/2$ then $G$ has a matching.


To keep the notation we need self-explanatory and manageable, for $i = 1, 2$, we
denote by $d(U_i)$ the average of the degrees of the vertices of $G_i$ belonging
to $U_i$, and by $\Delta(U_i)$ the maximum of these degrees. Define $d(W_i)$ and
$\Delta(W_i)$ analogously. We are ready to state and prove the promised result of
Hajnal and Szegedy (1992).


Theorem 4.4.2. Suppose that the $n$ by $m$ bipartite graphs $G_1$, $G_2$, with
bipartitions $(U_1, W_1)$, $(U_2, W_2)$, are such that

$$60 \le \Delta(W_1) \le \frac{m}{20\, d(U_2)}, \qquad
  60 \le \Delta(W_2) \le \frac{m}{20\, d(U_1)},$$

and, for $i = 1, 2$,

$$\Delta(U_i) \le \frac{m}{2 \log(4m)}.$$

Then there is a bipartite packing of $G_1$ and $G_2$.


Proof. Let $f: U_1 \to U_2$ be a one-to-one map. As we shall see in a moment,
there is a one-to-one map $g: W_1 \to W_2$ such that $(f, g)$ is a packing of
$G_1$ and $G_2$ if and only if a certain $m$ by $m$ bipartite graph $B_f$ has a
matching.

Indeed, define a bipartite graph $B_f$ with bipartition $(W_1, W_2)$ by making
$y_1 y_2$ ($y_1 \in W_1$, $y_2 \in W_2$) an edge of $B_f$ if $g(y_1) = y_2$ does
not violate the condition that if $xy \in E(G_1)$ then $f(x)g(y) \notin E(G_2)$.
In other words, let $y_1 y_2 \in E(B_f)$ if and only if
$f(\Gamma(y_1)) \cap \Gamma(y_2) = \emptyset$, i.e., if
$y_2 \notin \Gamma(f(\Gamma(y_1)))$, where $\Gamma(x)$ denotes the set of
neighbours of a vertex $x$ in the appropriate graph.
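
The reduction just described can be tried out on small instances. The following
Python sketch builds $B_f$ for a given $f$ and looks for a complete matching by
augmenting paths. The function names (`build_Bf`, `complete_matching`,
`is_packing`) and the encoding of a bipartite graph as a set of `(u, w)` pairs
are assumptions made for this illustration only; they are not notation from the
text.

```python
def w_neighbours(edges):
    """Map each W-vertex to the set of its neighbours in U (edges are (u, w) pairs)."""
    nb = {}
    for u, w in edges:
        nb.setdefault(w, set()).add(u)
    return nb


def build_Bf(edges1, edges2, f, W1, W2):
    """y1 y2 is an edge of B_f iff f(Gamma(y1)) and Gamma(y2) are disjoint."""
    nb1, nb2 = w_neighbours(edges1), w_neighbours(edges2)
    adj = {}
    for y1 in W1:
        image = {f[x] for x in nb1.get(y1, set())}
        adj[y1] = {y2 for y2 in W2 if not (image & nb2.get(y2, set()))}
    return adj


def complete_matching(adj, W1):
    """Return an injective g: W1 -> W2 along edges of adj, or None if there is none."""
    owner = {}  # y2 -> the y1 currently matched to it

    def augment(y1, seen):
        for y2 in adj[y1]:
            if y2 in seen:
                continue
            seen.add(y2)
            if y2 not in owner or augment(owner[y2], seen):
                owner[y2] = y1
                return True
        return False

    for y1 in W1:
        if not augment(y1, set()):
            return None
    return {y1: y2 for y2, y1 in owner.items()}


def is_packing(edges1, edges2, f, g):
    """Defining condition: xy in E(G1) implies f(x)g(y) is not in E(G2)."""
    e2 = set(edges2)
    return all((f[x], g[y]) not in e2 for x, y in edges1)


# Toy instance (hypothetical): G1 = G2 = a perfect matching on 2 + 2 vertices.
E1 = {(0, 0), (1, 1)}
E2 = {(0, 0), (1, 1)}
f = {0: 0, 1: 1}
g = complete_matching(build_Bf(E1, E2, f, [0, 1], [0, 1]), [0, 1])
```

In this toy instance the only admissible $g$ swaps the two vertices of $W_1$, so
that the image of $G_1$ avoids the edges of $G_2$.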

In view of Lemma 4.4.1, the theorem follows if we show that for some map $f$
the minimal degree $\delta(B_f)$ of $B_f$ is at least $m/2$. Hence it suffices to
show that the probability that $\delta(B_f) \ge m/2$ for a random map $f$ is
strictly positive. In turn, it suffices to show that the probability that the
degree of a particular vertex of $B_f$ is less than $m/2$ is less than $1/(2m)$.
Our aim is then to prove this.

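By symmetry it suffices to consider a fixed vertex $y_1 \in W_1$. For simplicity,
let $U_2 = [n] = \{1, 2, \ldots, n\}$ and let $d_i$ be the degree of vertex $i$
in $G_2$. Then

$$d_{B_f}(y_1) = m - |\Gamma(f(\Gamma(y_1)))| \ge m - \sum_{i \in f(\Gamma(y_1))} d_i.$$

Hence, if $d(y_1) = |\Gamma(y_1)| = r$, i.e., $y_1$ has $r$ neighbours in $G_1$,
then

$$P\left(d_{B_f}(y_1) < \frac{m}{2}\right) \le P_r\left(\sum_{i \in \tau} d_i > \frac{m}{2}\right), \qquad (1)$$

where $P_r$ denotes the probability taken in $[n]^{(r)}$, the space of all
$r$-subsets of $\{1, 2, \ldots, n\}$, and $\tau$ is a random element of
$[n]^{(r)}$.

With the monotone increasing set system
$\mathcal{A} = \{A \in \mathcal{P}(n): \sum_{i \in A} d_i > m/2\}$,
inequality (1) becomes

$$P\left(d_{B_f}(y_1) < \frac{m}{2}\right) \le P_r(\mathcal{A}). \qquad (2)$$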
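Setting $p = 5\Delta(W_1)/(4n)$, we see that with $q = 1 - p$ we have
$pqn \ge 3$ and $r \le pn - (3pqn)^{1/2}$. A martingale-type inequality implies
that, under these conditions,

$$P_r(\mathcal{A}) \le \left(1 - \tfrac{1}{3}\right)^{-1} P_p(\mathcal{A}) = \tfrac{3}{2}\, P_p(\mathcal{A}), \qquad (3)$$

where $P_p(\mathcal{A})$ is the binomial probability with probability $p$:

$$P_p(\mathcal{A}) = \sum_{A \in \mathcal{A}} p^{|A|} q^{n - |A|}.$$

Furthermore, by a standard estimate of the probability in the tail of the
binomial distribution,

$$P_p(\mathcal{A}) < \frac{1}{3m}.$$

Combining this with (1), (2) and (3), we find that

$$P\left(d_{B_f}(y_1) < \frac{m}{2}\right) < \frac{1}{2m},$$

as desired. □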
The conditions in Theorem 4.4.2 are fairly tight: there are many ways of
showing this with the aid of random graphs, but we do not go into the details.


Note also that in the theorem we proved more than we claimed: for every
$f: U_1 \to U_2$ there is a $g: W_1 \to W_2$ such that $(f, g)$ is a bipartite
packing.

4.5. The complexity of graph properties


The complexity $c(\mathcal{P})$ of a graph property $\mathcal{P}$ is the minimal
number of entries in the adjacency matrix of a graph that must be examined in the
worst case in order to decide whether the graph has the property or not. It is
convenient to spell out this definition in terms of a game between two players,
called the Constructor and Algy (or Hider and Seeker). Denote by $\mathcal{G}^n$
the set of all graphs with a fixed set $V$ of $n$ vertices, say
$V = \{1, 2, \ldots, n\}$. Then a property $\mathcal{P}$ of graphs on $V$ is a
subset of $\mathcal{G}^n$ such that $G \in \mathcal{P}$ whenever a graph
isomorphic to $G$ belongs to $\mathcal{P}$. In the game, Algy asks the
Constructor questions about a graph $G$ on $V$. Each question is of the form:
"Is $ab$ an edge of $G$?", and each question is answered by the Constructor.
When posing a question, Algy takes into account all the information he has
received up to that point. The Constructor need not have any particular graph in
mind: he may change his choice of the graph he is constructing edge by edge
according to the questions asked by Algy. The game is over when Algy can decide
whether or not the graph the Constructor has been defining has property
$\mathcal{P}$. The aim of the Constructor is to keep Algy guessing for as long as
possible. On the other hand, Algy tries to pose questions that are as pertinent
as possible: he would like to decide as soon as possible whether the graph has
$\mathcal{P}$ or not. The number of moves of Algy (i.e., the number of questions)
in this game, assuming that both players play optimally, is the complexity
$c(\mathcal{P})$.
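
For very small $n$ the value of this game can be computed by exhaustive minimax
search. The sketch below is only an illustration under stated assumptions: the
function `complexity` and the convention that a property is a Python callable
`prop(n, edges)` are hypothetical choices made for the example, and the search
is exponential, so it is usable only for $n \le 4$ or so.

```python
from functools import lru_cache
from itertools import combinations, product


def complexity(n, prop):
    """Value of the Algy/Constructor game for the property prop on n vertices."""
    pairs = list(combinations(range(n), 2))
    N = len(pairs)

    def outcomes(state):
        # All truth values prop can still take, given the answers so far.
        free = [i for i, s in enumerate(state) if s is None]
        seen = set()
        for bits in product((0, 1), repeat=len(free)):
            full = list(state)
            for i, b in zip(free, bits):
                full[i] = b
            seen.add(prop(n, [pairs[i] for i in range(N) if full[i]]))
        return seen

    @lru_cache(maxsize=None)
    def c(state):
        if len(outcomes(state)) == 1:
            return 0  # Algy can already decide
        best = N
        for i in range(N):
            if state[i] is None:
                # Algy picks the question, the Constructor the worse answer.
                worst = max(c(state[:i] + (b,) + state[i + 1:]) for b in (0, 1))
                best = min(best, 1 + worst)
        return best

    return c((None,) * N)


def connected(n, edges):
    """Connectivity test by a simple union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(n)}) == 1
```

For example, `complexity(4, connected)` examines all $\binom{4}{2} = 6$ pairs,
in line with the fact that connectedness cannot be decided without asking about
every pair.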

Needless to say, the complexity of a digraph property is defined analogously.
Moreover, the definition easily carries over to properties of subsets. Given a
finite set $X$, a set system $\mathcal{F}$ on $X$, i.e., a subset $\mathcal{F}$
of the power set $\mathcal{P}(X)$, is said to be a property of the subsets of
$X$. Thus a subset of $X$ has property $\mathcal{F}$ if it belongs to
$\mathcal{F}$. Algy's questions are of the form: "Is $x$ an element of our
subset?".

Note that a property of graphs on $V$ is precisely a property of the subsets of
$V^{(2)}$, the set of all unordered pairs of elements of $V$, which is invariant
under the permutations (of $V^{(2)}$ induced by the permutations) of $V$.

A property $\mathcal{F} \subset \mathcal{P}(X)$ is trivial if either
$\mathcal{F} = \emptyset$ or $\mathcal{F} = \mathcal{P}(X)$; needless to say,
one is not interested in trivial properties. As shown by Bollobás and Eldridge
(1978), Theorem 4.1.3 concerning the packings of graphs implies a lower bound
on the complexity of a non-trivial property of graphs.


Theorem 4.5.1. The complexity of a non-trivial property of graphs of order $n$
is at least $2n - 4$.


The bound given in this theorem is unlikely to be best possible although, as the
following example due to Best et al. (1974) shows, it does give the correct order
of magnitude. A scorpion graph with $n$ vertices is a graph containing a path
$bmt$ such that $b$ (the body vertex) has degree $n - 2$, $m$ has degree 2 and
$t$ (the tail vertex) has degree 1. Note that the graph spanned by the $n - 3$
neighbours of $b$ different from $m$ is entirely arbitrary.


Theorem 4.5.2. The graph property of containing a scorpion graph has complexity
at most $6n$.

For lack of space, in the rest of the section we shall concentrate on elusive
properties. A property $\mathcal{F}$ of the subsets of $X$ is elusive if
$c(\mathcal{F}) = |X|$, i.e., if every element of $X$ must be examined in order
to decide whether a subset of $X$ belongs to $\mathcal{F}$ or not. Thus a
property $\mathcal{P}$ of graphs of order $n$ is elusive if
$c(\mathcal{P}) = \binom{n}{2}$ and a property $\mathcal{Q}$ of digraphs of order
$n$ (containing at most one loop at each vertex) is elusive if
$c(\mathcal{Q}) = n^2$. Best et al. (1974), Kirkpatrick (1974), Milner and Welsh
(1976), Bollobás (1976b) and Yap (1986) have shown that a good many properties
of graphs with $n$ vertices are elusive. These properties include the property of
being planar (for $n \ge 5$), the property of containing a complete graph with
$r$ vertices (for $2 \le r \le n$), the property of having chromatic number $k$
(for $2 \le k \le n$), the property of being 2-connected, the property of being
connected and Eulerian, and the property of being connected and containing a
vertex of degree
bs 

A property $\mathcal{F}$ of the subsets of a set $X$ is monotone increasing if
$A \in \mathcal{F}$ and $A \subset B \subset X$ imply that $B \in \mathcal{F}$;
a monotone decreasing property is defined similarly. A property is monotone if it
is either monotone increasing or monotone decreasing. After some initial
difficulties, Aanderaa, Rosenberg, Lipton and Snyder (see Rosenberg 1973 and
Lipton and Snyder 1974) advanced the conjecture that every non-trivial monotone
property of graphs is close to being elusive in the sense that
$c(\mathcal{P}) \ge \varepsilon n^2$ for some constant $\varepsilon > 0$. A
little later, Best et al. (1974) advanced a sharper form of this conjecture:
every non-trivial monotone graph property is elusive. The weaker form of the
conjecture was proved by Rivest and Vuillemin (1976).


Theorem 4.5.3. If $\mathcal{P}$ is a non-trivial monotone property of graphs of
order $n$ then $c(\mathcal{P}) \ge n^2/16$.


In fact, Rivest and Vuillemin deduced this result from a theorem claiming that
certain set properties are elusive. Given a property $\mathcal{F}$ of subsets of
$X$ (i.e., a set system $\mathcal{F} \subset \mathcal{P}(X)$), let
$\mathrm{Aut}(\mathcal{F})$ be the group of automorphisms of $\mathcal{F}$, i.e.,
the group of permutations of $X$ leaving $\mathcal{F}$ invariant:
$\mathrm{Aut}(\mathcal{F}) = \{\pi: \pi$ is a permutation of $X$ such that if
$A \in \mathcal{F}$ then $\pi(A) \in \mathcal{F}\}$.


Theorem 4.5.4. Let $X$ be a set with $p^r$ elements, where $p$ is a prime, and
let $\mathcal{F}$ be a property of subsets of $X$. If $\mathrm{Aut}(\mathcal{F})$
is transitive on $X$, $\emptyset \in \mathcal{F}$ and $X \notin \mathcal{F}$ then
$\mathcal{F}$ is elusive.


Encouraged by this beautiful result, Rivest and Vuillemin conjectured that
Theorem 4.5.4 was true without any restriction on the number of elements of $X$.
This conjecture has turned out to be false: a counterexample was given by Illies
(1978). However, Kahn et al. (1984) proved the exact form of the Best et al.
conjecture for prime power values of $n$.


Theorem 4.5.5. Let $n = p^r$ where $p$ is a prime. Then every non-trivial
monotone property of graphs with $n$ vertices is elusive.

Kahn et al. used techniques from algebraic topology to prove their beautiful
theorem. The crucial step in the proof is that if $\mathcal{F}$ is a non-elusive
monotone decreasing property of subsets of $X$ then the abstract simplicial
complex of $X$ formed by the elements of $\mathcal{F}$ is collapsible.

The bound $n^2/16$ in Theorem 4.5.3 was improved by Kleitman and Kwiatkowski
(1980) to $n^2/9$; Kahn et al. used their theorem to give the even better lower
bound $n^2/4 + o(n^2)$.


Let us close with a fascinating conjecture of Kahn et al. (1984) claiming that 
the analogue of the Best et al. conjecture holds for properties of subsets. 


Conjecture 4.5.6. Let $\mathcal{F}$ be a non-trivial monotone property of subsets
of $X$. If $\mathrm{Aut}(\mathcal{F})$ is transitive on $X$ then $\mathcal{F}$ is
elusive.

Extremal set systems
1. Introduction


Let $X$ be an $n$-element set and $\mathcal{F} \subset 2^X$ a family of distinct
subsets of $X$. Suppose that the members of $\mathcal{F}$ satisfy some given
conditions. What is the maximum (minimum) value of $|\mathcal{F}|$? This is the
generic problem in extremal set theory and we shall try to give an overview of
the existing results and methods. Here is the simplest result:


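Theorem 1.1. If $F \cap F' \ne \emptyset$ holds for all
$F, F' \in \mathcal{F} \subset 2^X$, then $|\mathcal{F}| \le 2^{n-1}$.


Proof. For each $A \subset X$ either $A$ or $X \setminus A$ (or both) are absent
from $\mathcal{F}$. Thus $|\mathcal{F}| \le \frac{1}{2} 2^n = 2^{n-1}$. □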
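
The bound is attained, for instance, by all sets containing a fixed element. For
tiny ground sets this can be confirmed by brute force. The sketch below is an
illustration only; the helper names `subsets`, `is_intersecting` and
`max_intersecting` are hypothetical, and the search over all $2^{2^n}$ families
is feasible only for $n \le 4$.

```python
from itertools import combinations


def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]


def is_intersecting(family):
    """F and F' must meet for all F, F' in the family (including F = F')."""
    return all(a & b for a in family for b in family)


def max_intersecting(n):
    """Largest size of an intersecting family of subsets of an n-set (brute force)."""
    sets = subsets(n)
    best = 0
    for mask in range(1 << len(sets)):
        fam = [s for i, s in enumerate(sets) if (mask >> i) & 1]
        if len(fam) > best and is_intersecting(fam):
            best = len(fam)
    return best
```

With these definitions `max_intersecting(3)` equals $2^{3-1} = 4$.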


2. Basic definitions and conventions 


For $s, t$ positive integers, $s \ge 2$, a family $\mathcal{F}$ is called
$s$-wise $t$-intersecting if $|F_1 \cap \cdots \cap F_s| \ge t$ holds for all
$F_1, \ldots, F_s \in \mathcal{F}$. If $t = 1$, then $t$ is omitted. Also if
$s = 2$, then $s$-wise is omitted. Thus, "intersecting" means 2-wise
1-intersecting.

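A family $\mathcal{F}$ is called $k$-uniform or a $k$-graph if $|F| = k$ for all
$F \in \mathcal{F}$.

The size of a family $\mathcal{F}$ is $|\mathcal{F}|$ and it is often denoted
simply by $m$. The members of $\mathcal{F}$ are also called edges. Let
$\binom{X}{k}$ denote the family of all $k$-element subsets of $X$.

For $\mathcal{F} \subset 2^X$, set
$\mathcal{F}^{(i)} = \{F \in \mathcal{F}: |F| = i\}$, and
$f_i = |\mathcal{F}^{(i)}|$. In this case $f = (f_0, \ldots, f_n)$ is called the
$f$-vector of $\mathcal{F}$.

Let $[n]$ denote $\{1, \ldots, n\}$, $[i, j] = \{l: i \le l \le j\}$. Usually we
suppose $X = [n]$.

For $i \in X$, define
$\mathcal{F}(i) = \{F \setminus \{i\}: i \in F \in \mathcal{F}\}$, the link of
$i$, and $\mathcal{F}(\bar{i}) = \{F \in \mathcal{F}: i \notin F\}$.

The degree $d_{\mathcal{F}}(i)$ is simply $|\mathcal{F}(i)|$;
$\delta(\mathcal{F})$ and $\Delta(\mathcal{F})$ denote the minimum and maximum
degree, respectively.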

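$\mathcal{F}^c = \{X \setminus F: F \in \mathcal{F}\}$ is the complementary
family of $\mathcal{F}$.

$\mathcal{F} \subset 2^X$ is called hereditary if $E \subset F \in \mathcal{F}$
implies $E \in \mathcal{F}$. (Note that $\emptyset \in \mathcal{F}$.)

$\mathcal{F} \subset 2^X$ is called a filter if $\mathcal{F}^c$ is hereditary.

The $l$th shadow $\sigma_l(\mathcal{F})$ of a family $\mathcal{F}$ is defined by:

$$\sigma_l(\mathcal{F}) = \left\{G \in \binom{X}{l}: \exists F \in \mathcal{F},\ G \subset F\right\}.$$

$\partial(\mathcal{F}) = \{G \subset X: G \notin \mathcal{F},\ \exists F \in \mathcal{F},\ |G \,\triangle\, F| = 1\}$
is called the boundary of $\mathcal{F}$.

$\nu(\mathcal{F})$, the matching number of $\mathcal{F}$, is the maximum number
of pairwise disjoint edges in $\mathcal{F}$; $\nu(\mathcal{F}) = \infty$ if
$\emptyset \in \mathcal{F}$.

$\tau(\mathcal{F})$, the covering number of $\mathcal{F}$, is the minimum
cardinality of a set $T$ with $T \cap F \ne \emptyset$ for all
$F \in \mathcal{F}$; $\tau(\mathcal{F}) = \infty$ if $\emptyset \in \mathcal{F}$.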

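$\mathcal{F}$ is called $\nu$-critical if $\nu(\mathcal{G}) > \nu(\mathcal{F})$
holds for every family $\mathcal{G}$ obtained from $\mathcal{F}$ by replacing one
of its edges by a proper subset of it.

$\mathcal{F}$ is called $\tau$-critical if
$\tau(\mathcal{G}) < \tau(\mathcal{F})$ for all $\mathcal{G} \subsetneq \mathcal{F}$.

$\mathcal{F}$ is called an antichain if $F \not\subset F'$ holds for all distinct
$F, F' \in \mathcal{F}$.

Define the reverse lexicographic order $<_L$ on $2^X$ by $A <_L B$ if
$A \subset B$ or $\max\{x \in A \setminus B\} < \max\{x \in B \setminus A\}$.

Let $\mathcal{L}(m, k)$ ($\mathcal{R}(m, k)$) be the largest (smallest) $m$
members of $\binom{[n]}{k}$ in the reverse lexicographic order.

Note that $\mathcal{R}(\binom{j}{k}, k) = \binom{[j]}{k}$.

We call $\mathcal{F}$, $\mathcal{G}$ cross-intersecting if $F \in \mathcal{F}$
and $G \in \mathcal{G}$ implies $F \cap G \ne \emptyset$.

$\mathcal{F}$ is called a sunflower of size $m$ and with center $C$ if
$F \cap F' = C$ for all distinct $F, F' \in \mathcal{F}$ and $|\mathcal{F}| = m$.

$\mathcal{F}$ is said to be intersection-closed if $F, F' \in \mathcal{F}$
implies $F \cap F' \in \mathcal{F}$.

We close this section with a conjecture of Frankl (1979).


Conjecture 2.1. If $\mathcal{F}$ is intersection-closed, $|\mathcal{F}| \ge 2$,
then $\delta(\mathcal{F}) \le |\mathcal{F}|/2$ holds.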


3. Basic theorems 
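The oldest result in extremal set theory is Sperner's Theorem.


Theorem 3.1 (Sperner 1928). If $\mathcal{F} \subset 2^X$ is an antichain, then
$|\mathcal{F}| \le \binom{n}{\lfloor n/2 \rfloor}$ with equality if and only if
$\mathcal{F} = \binom{X}{\lfloor n/2 \rfloor}$ or
$\mathcal{F} = \binom{X}{\lceil n/2 \rceil}$ holds.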


Recent research on antichains belongs to the theory of partially ordered sets. 
We refer the reader to chapter 8 or the book by Engel and Gronau (1985). 

The maximum size of intersecting $k$-graphs was determined in 1938 by Erdős,
Ko and Rado although they did not publish their result until much later.


Theorem 3.2 (Erdős et al. 1961). If $\mathcal{F} \subset \binom{X}{k}$ is
$t$-intersecting, $k \ge t \ge 1$, $n \ge n_0(k, t)$, then
$|\mathcal{F}| \le \binom{n-t}{k-t}$.


From the work of Frankl (1978) and Wilson (1984) we know that the
conclusion holds if and only if $n \ge (k - t + 1)(t + 1)$.
Another classical result is due to Erdős and Rado (1960).


Theorem 3.3. If $\mathcal{F} \subset \binom{X}{k}$, $|\mathcal{F}| > k!(r-1)^k$,
then $\mathcal{F}$ contains a sunflower of size $r$.


Erdős (1981) offers \$1000 for a proof that the same holds for
$|\mathcal{F}| > c(r)^k$, where $c(r)$ is an appropriate constant.

Probably the single most important result in finite set theory is the
Kruskal-Katona Theorem, which was proved by Kruskal (1963) and Katona (1966) [see
also Lindström (1967), where a somewhat weaker statement is proved].


Theorem 3.4. If $\mathcal{F} \subset \binom{X}{k}$ is a family of size $m$, then
for all $l < k$, $|\sigma_l(\mathcal{F})| \ge |\sigma_l(\mathcal{R}(m, k))|$.
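
As a sanity check of the statement on a small ground set, one can compare the
shadow of the first $m$ $k$-sets in the reverse lexicographic order with the
minimum shadow over all families of $m$ $k$-sets. The following sketch is an
illustration under stated assumptions: the function names are hypothetical, and
the exhaustive minimum is computed only for very small parameters.

```python
from itertools import combinations


def shadow(family, l):
    """The l-th shadow: all l-sets contained in some member of the family."""
    return {frozenset(g) for f in family for g in combinations(sorted(f), l)}


def initial_segment(n, k, m):
    """The smallest m k-subsets of [n] in the reverse lexicographic order.

    For sets of equal size the order compares the largest element in which
    the two sets differ, i.e. it sorts by the decreasing tuple of elements.
    """
    ksets = sorted(combinations(range(1, n + 1), k),
                   key=lambda a: tuple(sorted(a, reverse=True)))
    return [frozenset(a) for a in ksets[:m]]


def min_shadow(n, k, m, l):
    """Minimum size of the l-th shadow over all families of m k-subsets of [n]."""
    ksets = [frozenset(a) for a in combinations(range(1, n + 1), k)]
    return min(len(shadow(fam, l)) for fam in combinations(ksets, m))
```

For $n = 5$, $k = 3$, $m = 5$, $l = 2$ the initial segment is
$\{123, 124, 134, 234, 125\}$, whose shadow consists of the six pairs of $[4]$
together with $15$ and $25$.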




Evaluating $|\sigma_l(\mathcal{R}(m, k))|$ one can get explicit bounds, which,
however, are unsuitable for computations. The irregular behaviour of the
Kruskal-Katona function is explained in Frankl et al. (1995c). Lovász (1979)
gives the following weaker but more convenient version.


Theorem 3.5. Let $\mathcal{F} \subset \binom{X}{k}$, $|\mathcal{F}| = m$, and
define the real number $x \ge k$ by $m = \binom{x}{k}$. Then
$|\sigma_l(\mathcal{F})| \ge \binom{x}{l}$ holds for all $l < k$.


A simple common proof of Theorems 3.4 and 3.5 was given by Frankl (1984).
The values of $m$ and $k$ for which $\mathcal{R}(m, k)$ is the unique optimal
family in Theorem 3.4 were determined independently by Füredi and Griggs (1986)
and Mörs (1985).

Hilton (1976) noticed that the Kruskal-Katona Theorem can be restated in the
following form.


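Theorem 3.6. If $\mathcal{F} \subset \binom{X}{k}$ and
$\mathcal{G} \subset \binom{X}{l}$ are cross-intersecting, then so are
$\mathcal{L}(|\mathcal{F}|, k)$ and $\mathcal{L}(|\mathcal{G}|, l)$.


Theorem 3.7 (Matsumoto and Tokushige 1989). If $\mathcal{F} \subset \binom{X}{k}$
and $\mathcal{G} \subset \binom{X}{l}$ are cross-intersecting and
$n \ge 2k \ge 2l$, then
$|\mathcal{F}||\mathcal{G}| \le \binom{n-1}{k-1}\binom{n-1}{l-1}$.

Another important theorem on shadows is due to Katona (1964).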


Theorem 3.8. If $\mathcal{F} \subset \binom{X}{k}$ is $t$-intersecting, then for
all $k - t \le l \le k$ one has

$$\frac{|\sigma_l(\mathcal{F})|}{|\mathcal{F}|} \ge \binom{2k-t}{l} \Big/ \binom{2k-t}{k}.$$


Katona used this theorem to determine the maximum size of $t$-intersecting
families $\mathcal{F} \subset 2^X$, which we will discuss in section 5. Katona
showed also that the case $t = 1$ of the Erdős-Ko-Rado Theorem 3.2 is an easy
consequence of Theorem 3.8.

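The discrete isoperimetric problem can be stated as follows: given $m$, determine
$\min\{|\partial \mathcal{F}|: \mathcal{F} \subset 2^X,\ |\mathcal{F}| = m\}$.

A ball with center $A$ and radius $r$ is the family
$\mathcal{B}(A, r) = \{B \subset X: |A \,\triangle\, B| \le r\}$.

If $\mathcal{B}(A, r) \subset \mathcal{F} \subset \mathcal{B}(A, r+1)$, then
$\mathcal{F}$ is called a generalized ball. Harper (1966) shows that generalized
balls have minimum boundary.


Theorem 3.9. For every $\mathcal{F} \subset 2^X$ there exists a generalized ball
$\mathcal{G} \subset 2^X$ of the same size with
$|\partial(\mathcal{F})| \ge |\partial(\mathcal{G})|$.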


A short proof of this result was given by Frankl and Füredi (1981).
For $\mathcal{F} \subset \binom{X}{k}$ one defines its $k$-boundary
$\kappa(\mathcal{F})$ by:

$$\kappa(\mathcal{F}) = \left\{G \in \binom{X}{k}: G \notin \mathcal{F},\ \exists F \in \mathcal{F},\ |G \,\triangle\, F| = 2\right\}.$$

One of the outstanding open problems is the isoperimetric problem for
$\binom{X}{k}$.

Open Problem 3.10. Given $m$, determine
$\min\{|\kappa(\mathcal{F})|: \mathcal{F} \subset \binom{X}{k},\ |\mathcal{F}| = m\}$.

The next result is due to Kleitman (1966a).


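Theorem 3.11. Let $\mathcal{C}, \mathcal{D} \subset 2^X$ be hereditary. Then

$$|\mathcal{C} \cap \mathcal{D}| \ge |\mathcal{C}||\mathcal{D}|/2^n.$$

Proof. Apply induction on $n$, the case $n = 0$ being trivial. Set
$c_0 = |\mathcal{C}(\bar{n})|$, $c_1 = |\mathcal{C}(n)|$,
$d_0 = |\mathcal{D}(\bar{n})|$, and $d_1 = |\mathcal{D}(n)|$. Then

$$|\mathcal{C} \cap \mathcal{D}| = |\mathcal{C}(n) \cap \mathcal{D}(n)| + |\mathcal{C}(\bar{n}) \cap \mathcal{D}(\bar{n})|$$
$$\ge (c_1 d_1 + c_0 d_0)/2^{n-1} \quad \text{(by induction)}$$
$$= (c_0 + c_1)(d_0 + d_1)/2^n + (c_0 - c_1)(d_0 - d_1)/2^n.$$

Using $\mathcal{C}(n) \subset \mathcal{C}(\bar{n})$ and
$\mathcal{D}(n) \subset \mathcal{D}(\bar{n})$, we have
$(c_0 - c_1)(d_0 - d_1) \ge 0$, which completes the proof. □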
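
The inequality can be checked exhaustively on a three-element ground set, where
there are only a handful of hereditary families. The sketch below is purely
illustrative; the helper names are hypothetical and the enumeration is feasible
only for very small $n$.

```python
from itertools import combinations


def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]


def hereditary_families(n):
    """All non-empty hereditary (downward closed) families of subsets of an n-set."""
    sets = subsets(n)
    out = []
    for mask in range(1, 1 << len(sets)):
        fam = {s for i, s in enumerate(sets) if (mask >> i) & 1}
        # Every proper subset of every member must again be a member.
        if all(frozenset(c) in fam
               for s in fam for r in range(len(s)) for c in combinations(s, r)):
            out.append(fam)
    return out


def kleitman_holds(n):
    """Check |C & D| * 2^n >= |C| * |D| for all pairs of hereditary families."""
    fams = hereditary_families(n)
    return all(len(c & d) * 2 ** n >= len(c) * len(d) for c in fams for d in fams)
```

For $n = 3$ the enumeration finds 19 non-empty hereditary families, and the
inequality of Theorem 3.11 holds for every pair of them.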


By now there are many generalizations of Theorem 3.11, some of which are 
discussed in chapter 8. 


4. Basic tools 


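The most useful tool for investigating $s$-wise $t$-intersecting families is an
operation called shifting, which was introduced by Erdős et al. (1961).


Definition 4.1. For $\mathcal{F} \subset 2^X$ and $1 \le i < j \le n$, define the
$(i, j)$-shift $S_{ij}$ by $S_{ij}(\mathcal{F}) = \{S_{ij}(F): F \in \mathcal{F}\}$,
where

$$S_{ij}(F) = \begin{cases}
(F \setminus \{j\}) \cup \{i\} & \text{if } j \in F,\ i \notin F \text{ and } (F \setminus \{j\}) \cup \{i\} \notin \mathcal{F}, \\
F & \text{otherwise.}
\end{cases}$$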


Some of the useful properties of the (i, j)-shift are summarized by the next 
lemma. 


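Lemma 4.2.
(i) $|F| = |S_{ij}(F)|$ and $|\mathcal{F}| = |S_{ij}(\mathcal{F})|$;
(ii) $\sigma_l(S_{ij}(\mathcal{F})) \subset S_{ij}(\sigma_l(\mathcal{F}))$;
(iii) if $\mathcal{F}$ is $s$-wise $t$-intersecting, then so is $S_{ij}(\mathcal{F})$;
(iv) $\nu(S_{ij}(\mathcal{F})) \le \nu(\mathcal{F})$.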

Iterating the $(i, j)$-shift for all $1 \le i < j \le n$ will eventually produce
a family which is invariant with respect to the $(i, j)$-shift.

Definition 4.3. We call $\mathcal{G}$ stable if $S_{ij}(\mathcal{G}) = \mathcal{G}$
for all $1 \le i < j \le n$. The following result is straightforward to show.


Proposition 4.4. $\mathcal{G}$ is stable if and only if for all
$G \in \mathcal{G}$, $1 \le i < j \le n$, with $j \in G$, $i \notin G$, the set
$(G \setminus \{j\}) \cup \{i\}$ is also in $\mathcal{G}$.


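A variation of the $(i, j)$-shift, called down-shift, was defined by Kleitman
(1966).


Definition 4.5. For $\mathcal{G} \subset 2^X$ and $i \in X$, define the
down-shift $D_i$ by $D_i(\mathcal{G}) = \{D_i(G): G \in \mathcal{G}\}$, where

$$D_i(G) = \begin{cases}
G \setminus \{i\} & \text{if } i \in G \in \mathcal{G} \text{ and } (G \setminus \{i\}) \notin \mathcal{G}, \\
G & \text{otherwise.}
\end{cases}$$

Define the trace $\mathcal{F}|_Y = \{F \cap Y: F \in \mathcal{F}\}$.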


Some important properties of the down-shift are summarized in the next
lemma; property (ii) is due to Kleitman (1966), and (iii) to Frankl (1983).


Lemma 4.6.
(i) $|D_i(\mathcal{G})| = |\mathcal{G}|$;
(ii) if $|F \,\triangle\, F'| \le d$ holds for all $F, F' \in \mathcal{F}$, then
the same holds for $D_i(\mathcal{F})$;
(iii) $|D_i(\mathcal{G})|_Y| \le |\mathcal{G}|_Y|$ for all $i \in X$ and
$Y \subset X$.


Iterating the down-shift again produces an invariant family.

Proposition 4.7. $D_i(\mathcal{G}) = \mathcal{G}$ holds for all $i \in X$ if and
only if $\mathcal{G}$ is hereditary.


Let us use this proposition to give a simple proof of the following result which
was discovered independently by three sets of authors: Sauer; Shelah and Perles;
and Vapnik and Chervonenkis.


Theorem 4.8. If $|\mathcal{F}| > \sum_{0 \le i < r} \binom{n}{i}$, then there is
some $R \in \binom{X}{r}$ with $\mathcal{F}|_R = 2^R$.


Proof. Suppose that $|\mathcal{F}|_R| < 2^r$ for all $R \in \binom{X}{r}$. In
view of Lemma 4.6 (iii) we may apply the down-shift to $\mathcal{F}$, and by
Proposition 4.7 obtain a complex $\mathcal{G}$ of the same size, still satisfying
$|\mathcal{G}|_R| < 2^r$ for all $R \in \binom{X}{r}$. However, since
$\mathcal{G}$ is hereditary, this implies $|G| < r$ for all $G \in \mathcal{G}$,
whence $|\mathcal{G}| \le \sum_{0 \le i < r} \binom{n}{i}$ follows. □
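
The theorem is easy to test on small families. In the sketch below, the names
`trace`, `shattered_size` and `sauer_bound` are hypothetical helpers introduced
for this illustration; the search over all $r$-subsets is exhaustive and meant
only for small ground sets.

```python
from itertools import combinations
from math import comb


def trace(family, y):
    """The trace of the family on y: all intersections F & y."""
    y = frozenset(y)
    return {f & y for f in family}


def shattered_size(family, n):
    """Largest r such that some r-subset R of [n] has trace equal to 2^R."""
    best = 0
    for r in range(1, n + 1):
        if any(len(trace(family, R)) == 2 ** r
               for R in combinations(range(1, n + 1), r)):
            best = r
    return best


def sauer_bound(n, r):
    """The threshold sum_{0 <= i < r} C(n, i) of Theorem 4.8."""
    return sum(comb(n, i) for i in range(r))


# All subsets of [4] of size at most 1: exactly sauer_bound(4, 2) = 5 sets.
small = [frozenset()] + [frozenset({x}) for x in range(1, 5)]
# One more set pushes the size above the threshold.
bigger = small + [frozenset({1, 2})]
```

The family `small` meets the threshold exactly and no 2-set has full trace on
it, while `bigger` exceeds the threshold and the set $\{1, 2\}$ has trace
$2^{\{1,2\}}$.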


We point out that the largest $r$ such that there exists a set
$R \in \binom{X}{r}$ with $\mathcal{F}|_R = 2^R$ is called the
Vapnik-Chervonenkis dimension of $\mathcal{F}$. This concept has found
interesting applications in combinatorial and computational geometry, and
learnability theory (e.g., see Blumer et al. 1989, Clarkson et al. 1988, and
Linial et al. 1991).

Another important tool for investigating families of finite sets is provided by
the inclusion matrices.


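Definition 4.9. For $\mathcal{F} \subset 2^X$, the $|\sigma_j(\mathcal{F})|$ by
$|\mathcal{F}|$ matrix $M(j, \mathcal{F})$ has its rows indexed by
$G \in \sigma_j(\mathcal{F})$ and its columns by $F \in \mathcal{F}$, and its
general entry is

$$m(G, F) = \begin{cases} 1 & \text{if } G \subset F, \\ 0 & \text{if } G \not\subset F. \end{cases}$$


Simple computation gives the next result.


Proposition 4.10. (i) $M(j, \mathcal{F})^T M(j, \mathcal{F})$ is an
$|\mathcal{F}|$ by $|\mathcal{F}|$ matrix with general entry

$$n(F, F') = \binom{|F \cap F'|}{j}.$$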


(ii) M(j, F¥)'M(j, F‘) is an |F| by |F| matrix with general entry 
ee 


n(F, F)=( j 


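Definition 4.11. $\mathcal{F} \subset \binom{X}{k}$ is called $k$-partite if
there exists a partition $X = X_1 \cup \cdots \cup X_k$ with $|F \cap X_i| = 1$
for all $F \in \mathcal{F}$, $1 \le i \le k$.


A simple but useful result of Erdős and Kleitman (1968) is the following
lemma.


Lemma 4.12. Every $k$-graph $\mathcal{F}$ contains a $k$-partite $k$-graph
$\mathcal{G}$ with $|\mathcal{G}|/|\mathcal{F}| \ge k!/k^k$.


Definition 4.13. For a $k$-partite $\mathcal{F} \subset \binom{X}{k}$ and
$F \in \mathcal{F}$, define
$\Pi(F, \mathcal{F}) = \{\Pi(F \cap F'): F \ne F' \in \mathcal{F}\}$, where
$\Pi(A) = \{i: A \cap X_i \ne \emptyset\}$. (Thus
$\Pi(F, \mathcal{F}) \subset 2^{[k]}$.)


Definition 4.14. We call $\mathcal{F} \subset 2^X$ $r$-complete if for all
distinct $F, F' \in \mathcal{F}$ there is a sunflower of size $r$ and with center
$F \cap F'$ formed by members of $\mathcal{F}$.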


Füredi (1983) discovered the following lemma which has since proved very
useful.


Lemma 4.15. There exists a positive constant $c = c(k, l)$ such that every
$\mathcal{F} \subset \binom{X}{k}$ has a $k$-partite subfamily $\mathcal{F}^*$
satisfying
(i) $|\mathcal{F}^*| \ge c|\mathcal{F}|$;
(ii) $\mathcal{F}^*$ is $k$-partite with
$\Pi(\mathcal{F}^*) = \Pi(F, \mathcal{F}^*)$ being the same for all
$F \in \mathcal{F}^*$;
(iii) $\mathcal{F}^*$ is $l$-complete.


Proposition 4.16 (Deza). If $l > k$ in Lemma 4.15, then $\Pi(\mathcal{F}^*)$ is
intersection-closed.


Proof. Take $D', D'' \in \Pi(\mathcal{F}^*)$, and choose
$F, F', F'' \in \mathcal{F}^*$ with $D' = \Pi(F \cap F')$,
$D'' = \Pi(F \cap F'')$. Let $G_1, \ldots, G_{k+1}$ and $H_1, \ldots, H_{k+1}$ be
members of $\mathcal{F}^*$ forming sunflowers with centers $C' = F \cap F'$ and
$C'' = F \cap F''$, respectively. The sets
$G_1 \setminus C', G_2 \setminus C', \ldots, G_{k+1} \setminus C'$ are pairwise
disjoint; thus one of them, say $G_1 \setminus C'$, is disjoint from $C''$.
Similarly, $H_1 \setminus C'', \ldots, H_{k+1} \setminus C''$ are pairwise
disjoint, implying that one of them, say $H_1 \setminus C''$, is disjoint from
$G_1$. Now $G_1 \cap H_1 = C' \cap C''$, implying
$\Pi(C' \cap C'') = D' \cap D'' \in \Pi(\mathcal{F}^*)$ (in the last step we used
that $\mathcal{F}^*$ is $k$-partite). □


Having some information about $\Pi(\mathcal{F}^*)$, one can often use it to get
upper bounds on $|\mathcal{F}^*|$ (and thus for $|\mathcal{F}|$).


Proposition 4.17. 


siete 

TIM F*) 
Proof. Let TC [1, k] be a minimal set with 7M ([1, k]\P) #@ for all PE W(F*). 
That is, |T| =7(11(#*)) and TZ P for all P € 11(#*). For each FE ¥*, let T(F) 
be the unique subset of F with [1(7T(F)) = T. Since TZ (FN F') for distinct F, 
F'€ ¥*, all the 7(F) are distinct subsets of X, which concludes the proof. O 


5. Intersecting families 
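Let us define the family $\mathcal{K}(n, t)$ as follows:

$$\mathcal{K}(n, t) = \begin{cases}
\{K \subset X: |K| \ge (n + t)/2\} & \text{if } n + t \text{ is even,} \\
\{K \subset X: |K \cap [2, n]| \ge ((n - 1) + t)/2\} & \text{if } n + t \text{ is odd.}
\end{cases}$$

It is easy to check that $\mathcal{K}(n, t)$ is $t$-intersecting. Let us state
and prove Katona's Theorem.


Theorem 5.1 (Katona 1964). If $\mathcal{H} \subset 2^X$ is $t$-intersecting, then
$|\mathcal{H}| \le |\mathcal{K}(n, t)|$, and moreover, for $t \ge 2$, equality
holds only if $\mathcal{H}$ is (isomorphic to) $\mathcal{K}(n, t)$.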
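
The family $\mathcal{K}(n, t)$ is easy to generate, and for very small $n$ the
extremal statement can be compared with a brute-force search. The sketch below
is an illustration only; `katona_family`, `is_t_intersecting` and
`max_t_intersecting` are hypothetical names, and the exhaustive search is
feasible only for $n \le 4$ or so.

```python
from itertools import combinations


def subsets(n):
    """All subsets of [n] = {1, ..., n} as frozensets."""
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(1, n + 1), r)]


def katona_family(n, t):
    """The family K(n, t) on X = [n], following the two-case definition."""
    if (n + t) % 2 == 0:
        return [K for K in subsets(n) if len(K) >= (n + t) // 2]
    rest = frozenset(range(2, n + 1))
    return [K for K in subsets(n) if len(K & rest) >= (n - 1 + t) // 2]


def is_t_intersecting(family, t):
    return all(len(a & b) >= t for a in family for b in family)


def max_t_intersecting(n, t):
    """Largest t-intersecting family in 2^[n], by exhaustive search."""
    sets = subsets(n)
    best = 0
    for mask in range(1 << len(sets)):
        fam = [s for i, s in enumerate(sets) if (mask >> i) & 1]
        if len(fam) > best and is_t_intersecting(fam, t):
            best = len(fam)
    return best
```

For example, $\mathcal{K}(4, 2)$ consists of the five subsets of $[4]$ of size
at least 3, and $\mathcal{K}(5, 2)$ has ten members.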


Proof. Let us start with a definition. $\mathcal{F} \subset 2^X$ has the
$t$-union property if $|F \cup F'| \le n - t$ for all $F, F' \in \mathcal{F}$.

Now $\mathcal{F} = \mathcal{H}^c$ has the $t$-union property.

We shall deal only with the case $n - t$ odd; the even case is slightly easier.
Set $s = (n + 1 - t)/2$. Recall that $f_i$ is the number of $i$-sets in
$\mathcal{F}$.

1302 P. Frankl 


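Claim 5.2.

$$f_i + \frac{i + t - 1}{i}\, f_{n-i-t+1} \le \binom{n}{i}, \qquad 0 < i \le s.$$


Proof. Let us consider $\sigma_i(\mathcal{H}^{(i+t-1)})$. If $A$ is in this
family, then $A \notin \mathcal{F}$ since otherwise $|A \cup B^c| = n - t + 1$
holds for $B \in \mathcal{H}^{(i+t-1)}$, $A \subset B$, violating the hypothesis.
Thus, $f_i + |\sigma_i(\mathcal{H}^{(i+t-1)})| \le \binom{n}{i}$. Since
$\mathcal{H}$ is $t$-intersecting and $|\mathcal{H}^{(i+t-1)}| = f_{n-i-t+1}$, we
may apply Theorem 3.8 to get

$$|\sigma_i(\mathcal{H}^{(i+t-1)})| \ge f_{n-i-t+1} \binom{2i+t-2}{i} \Big/ \binom{2i+t-2}{i+t-1} = f_{n-i-t+1}\, \frac{i+t-1}{i},$$

which yields Claim 5.2. □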


Proof of Theorem 5.1 (continued). For $i = s$ one has $n - i - t + 1 = i$ and
from Claim 5.2, $f_s \le \binom{n-1}{s-1}$ follows. Adding up this inequality,
together with Claim 5.2 applied to $0 < i < s$ and noting $f_i = 0$ for
$i > n - t$, we obtain

$$|\mathcal{H}| = |\mathcal{F}| \le \binom{n-1}{s-1} + \sum_{0 \le i < s} \binom{n}{i} = 2 \sum_{0 \le i < s} \binom{n-1}{i} = |\mathcal{K}(n, t)|.$$

If $t \ge 2$, then $(i + t - 1)/i > 1$; thus in the case of equality
$\mathcal{F}^{(n-i-t+1)} = \emptyset$ and consequently
$\mathcal{F}^{(i)} = \binom{[n]}{i}$ for $i < s$, which gives already the bulk of
the proof of uniqueness. To conclude the proof one notes that
$\mathcal{F}^{(s)}$ is intersecting, and $f_s = \binom{n-1}{s-1}$, so by the
uniqueness part of the Erdős-Ko-Rado Theorem (which we will discuss
subsequently) $\mathcal{F}^{(s)} = \{F \in \binom{[n]}{s}: 1 \in F\}$. This
implies $\mathcal{H} = \mathcal{K}(n, t)$.


Theorem 5.3 (Kleitman 1966b). Suppose that $\mathcal{F} \subset 2^X$ satisfies
$|F \,\triangle\, F'| \le n - t$ for all $F, F' \in \mathcal{F}$. Then
$|\mathcal{F}| \le |\mathcal{K}(n, t)|$.


Proof. In view of Lemma 4.6 we may repeatedly replace $\mathcal{F}$ by
$D_i(\mathcal{F})$. Thus by Proposition 4.7 we may suppose that $\mathcal{F}$ is
hereditary. Since for arbitrary $G, G' \in \mathcal{F}$ we can take subsets
$F, F' \in \mathcal{F}$ with $F \,\triangle\, F' = G \cup G'$, $\mathcal{F}$ has
the $t$-union property. Thus Theorem 5.3 follows from Theorem 5.1. □
Let us define some intersecting families #(k,s) for 2<s<k: 


Ik, 8) = {1 E (I), LE and [2,8 + 1} +9 


U {1 E (Ml). (2,34 cu} ; 


It is easy to check that for n > 2k, |(k, 3)| <---> <|ae(k, k)| holds. 
Checking the degrees one sees that 


sen (1) C579) 
Theorem 5.4 (Frankl 1987a). Let $\mathcal{F} \subset \binom{[n]}{k}$ be
intersecting, $n > 2k$. If


a ce ese) 

a= (F3)- (Cana 

holds for some 2<s <k, then |¥|<|3(k, s)|. Moreover, equality holds only if ¥ 
is isomorphic to 3(k,s), or s =3 and F is isomorphic to #(k, 2). 


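Let $\mathcal{F} \subset \binom{X}{k}$ be an intersecting family in which the
intersection of all sets satisfies $\bigcap \mathcal{F} = \emptyset$. That is,
for each $i \in X$ there is some $F \in \mathcal{F}$ with $i \notin F$. This
implies

$$\Delta(\mathcal{F}) \le \binom{n-1}{k-1} - \binom{n-k-1}{k-1}.$$

Thus Theorem 5.4 implies:


Theorem 5.5 (Hilton and Milner 1967). If $\mathcal{F} \subset \binom{X}{k}$ is an
intersecting family with $\bigcap \mathcal{F} = \emptyset$, then for $n > 2k$,
$|\mathcal{F}| \le |\mathcal{H}(n, k)|$ with equality holding if and only if
$\mathcal{F} = \mathcal{H}(n, k)$, or $k = 3$ and $\mathcal{F} = \mathcal{H}(n, 2)$.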


Let us mention that the restriction $n > 2k$ is essential because for $n = 2k$ a
family $\mathcal{F} \subset \binom{[2k]}{k}$ is intersecting if and only if it
contains no set together with its complement. Thus there are
$2^{\binom{2k-1}{k-1}}$ distinct intersecting families with $\binom{2k-1}{k-1}$
members in $\binom{[2k]}{k}$. Can they be regular, i.e.,
$d_{\mathcal{F}}(i) = d$ for some $d$ and all $i \in [2k]$? Simple computation
shows that $d = \frac{1}{2}\binom{2k-1}{k-1}$, which is an integer if and only if
$k$ is not a power of 2.


Theorem 5.6 (Brace and Daykin 1972). There exists a regular intersecting family
of maximum size $\binom{2k-1}{k-1}$ in $\binom{[2k]}{k}$ if and only if $k$ is
not a power of 2.


Definition 5.7. Let $A$ denote the set of all even integers $2k$ such that there
exists an intersecting family $\mathcal{F} \subset \binom{[2k]}{k}$ with
$|\mathcal{F}| = \binom{2k-1}{k-1}$ and such that the automorphism group
$\mathrm{Aut}(\mathcal{F})$ is transitive on $[2k]$.

Theorem 5.8 (Cameron et al. 1989).
(i) If $a \in A$ then $ab \in A$ for $b \in A$ and for $b$ odd.
(ii) $4a + 2 \in A$ for all positive integers $a$.
(iii) $3 \cdot 2^k \notin A$ for $k \ge 2$.


Actually, an even number $2k \in A$ if and only if there is a transitive
permutation group on $[2k]$ in which every 2-element has a fixed point.


Conjecture 5.9. $a \cdot 2^d \notin A$ holds for every fixed $a$ and
$d > d_0(a)$.


The maximum size of $t$-intersecting families in $\binom{X}{k}$ is determined by
the Erdős-Ko-Rado Theorem for $n \ge n_0(k, t)$. However, for $t \ge 2$ this
leaves open a whole range of cases $2k - t < n < (k - t + 1)(t + 1)$. Define the
$t$-intersecting families $\mathcal{A}_i = \mathcal{A}_i(n, k, t)$ for
$0 \le i \le k - t$ by:

$$\mathcal{A}_i = \left\{A \in \binom{X}{k}: |A \cap [t + 2i]| \ge t + i\right\}.$$


Conjecture 5.10 (Frankl 1978). If $\mathcal{F} \subset \binom{X}{k}$ is
$t$-intersecting and $n > 2k - t$, $k \ge t \ge 2$, then

$$|\mathcal{F}| \le \max_i |\mathcal{A}_i|.$$
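
To get a feeling for the families $\mathcal{A}_i$, the following sketch lists
them for given $n$, $k$, $t$ and compares their sizes. The function names are
assumptions made for the illustration, and the parameters $n = 8$, $k = 4$,
$t = 2$ are simply one choice inside the open range
$2k - t < n < (k - t + 1)(t + 1)$.

```python
from itertools import combinations


def frankl_family(n, k, t, i):
    """A_i = {A in C([n], k) : |A & [t + 2i]| >= t + i}."""
    head = frozenset(range(1, t + 2 * i + 1))
    return [frozenset(a) for a in combinations(range(1, n + 1), k)
            if len(head & frozenset(a)) >= t + i]


def is_t_intersecting(family, t):
    return all(len(a & b) >= t for a in family for b in family)


n, k, t = 8, 4, 2
families = [frankl_family(n, k, t, i) for i in range(k - t + 1)]
sizes = [len(fam) for fam in families]
```

Here $|\mathcal{A}_0| = \binom{6}{2} = 15$, $|\mathcal{A}_1| = 4 \cdot 4 + 1 = 17$
and $|\mathcal{A}_2| = \binom{6}{4} = 15$, so in this range the maximum is not
attained by the family $\mathcal{A}_0$ of all $k$-sets containing $[t]$.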
Let us prove a weaker statement. 


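Proposition 5.11. If $\mathcal{F} \subset \binom{X}{k}$ is $t$-intersecting and
$n \ge 2k - t$, then

$$|\mathcal{F}| \le \binom{n}{k-t}.$$


Proof. In view of Lemma 4.2 we may assume that $\mathcal{F}$ is stable. The
following lemma is often useful.


Lemma 5.12 (Frankl 1978). If $\mathcal{F} \subset \binom{X}{k}$ is
$t$-intersecting and stable, then $|F \cap F' \cap [2k - t]| \ge t$ for all
$F, F' \in \mathcal{F}$, i.e., $\mathcal{F}|_{[2k-t]}$ is $t$-intersecting.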

Proof. Suppose that Lemma 5.12 is not true and choose a counterexample $(F, F')$
with $|F \cap [2k - t]|$ as large as possible. Fix $j \in F \cap F'$ with
$j > 2k - t$. If $i \notin F \cup F'$ for some $i \in [2k - t]$, then replacing
(by Proposition 4.4) $F$ by $(F \setminus \{j\}) \cup \{i\}$ contradicts the
maximality of $|F \cap [2k - t]|$. Thus $F \cup F' \supset [2k - t]$. However,
\(F U F’) [2k — a] <|FL + |F'}- |FOF’ a [2k —t]] <2k-8, 


a contradiction. □


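Proof of Proposition 5.11 (continued). Apply induction on $k$. The case $k = t$
is trivial. Also, in the case $n = 2k - t$ one has
$|\mathcal{F}| \le \binom{2k-t}{k} = \binom{2k-t}{k-t}$. Let $n > 2k - t$ and
define

$$\mathcal{G}_i = \left\{A \in \binom{[2k-t]}{i}: \exists F \in \mathcal{F},\ A = F \cap [2k - t]\right\}.$$

Then by Lemma 5.12 and induction,

$$|\mathcal{G}_i| \le \binom{2k-t}{i-t}$$

holds. This implies

$$|\mathcal{F}| \le \sum_i |\mathcal{G}_i| \binom{n-2k+t}{k-i} \le \sum_i \binom{2k-t}{i-t} \binom{n-2k+t}{k-i} = \binom{n}{k-t}. \qquad \Box$$
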
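Theorem 5.13 (Kleitman 1966a). Let
$\mathcal{F}_1, \ldots, \mathcal{F}_r \subset 2^X$ be intersecting. Then
$|\mathcal{F}_1 \cup \cdots \cup \mathcal{F}_r| \le 2^n - 2^{n-r}$.


Proof. Apply induction on $r$; the case $r = 1$ is just Theorem 1.1. We can
assume that $\mathcal{F}_1, \ldots, \mathcal{F}_r$ are filters. Consider
$\mathcal{F} = \mathcal{F}_1 \cup \cdots \cup \mathcal{F}_{r-1}$. By the
induction hypothesis, $|\mathcal{F}| \le 2^n - 2^{n-r+1}$. Also
$|\mathcal{F}_r| \le 2^{n-1}$ by Theorem 1.1. Since $\mathcal{F}$ and
$\mathcal{F}_r$ are both filters, using Theorem 3.11 we obtain
$|\mathcal{F} \cap \mathcal{F}_r| \ge |\mathcal{F}||\mathcal{F}_r|/2^n$.
Summarizing,

$$|\mathcal{F}_1 \cup \cdots \cup \mathcal{F}_r| = |\mathcal{F}| + |\mathcal{F}_r| - |\mathcal{F} \cap \mathcal{F}_r| \le |\mathcal{F}| + |\mathcal{F}_r| - \frac{|\mathcal{F}||\mathcal{F}_r|}{2^n}.$$

The right-hand side is monotone increasing in both $|\mathcal{F}|$ and
$|\mathcal{F}_r|$. Thus we get an upper bound by substituting
$|\mathcal{F}_r| = 2^{n-1}$ and $|\mathcal{F}| = 2^n - 2^{n-r+1}$. This completes
the proof. □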


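Another application of Theorem 3.11 is the following result which was proved
originally in a different way by Daykin and Lovász, and Schönheim.


Theorem 5.14. If $\mathcal{F} \subset 2^X$ is intersecting and has the "union
property" ($F \cup F' \ne X$ for $F, F' \in \mathcal{F}$), then
$|\mathcal{F}| \le 2^{n-2}$.


Proof. Define

$$\mathcal{F}^* = \{G \subset X: \exists F \in \mathcal{F},\ F \subset G\}, \quad \text{and} \quad \mathcal{F}_* = \{G: \exists F \in \mathcal{F},\ G \subset F\}.$$

Then $\mathcal{F}^*$ is an intersecting filter and $\mathcal{F}_*$ is hereditary
and has the union property. Using Theorems 1.1 and 3.11 we deduce

$$|\mathcal{F}| \le |\mathcal{F}^* \cap \mathcal{F}_*| \le |\mathcal{F}^*||\mathcal{F}_*|/2^n \le 2^{n-2}. \qquad \Box$$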


It was shown by Frankl (1975) (proving a conjecture of Katona) that the
maximum size of an intersecting family having the $t$-union property is
$|\mathcal{K}(n - 1, t)|$.


Example. Let $t, t' \ge 1$ and suppose $X = Y \cup Y'$ is a partition with
$|Y| \ge t$, $|Y'| \ge t'$. Let $\mathcal{G} \subset 2^Y$ be a copy of
$\mathcal{K}(|Y|, t)$ and $\mathcal{G}' \subset 2^{Y'}$ be a copy of
$\mathcal{K}(|Y'|, t')^c$. Set
$\mathcal{C}(Y) = \{H \subset X: H \cap Y \in \mathcal{G},\ H \cap Y' \in \mathcal{G}'\}$.
Then $\mathcal{C}(Y)$ is $t$-intersecting and has the $t'$-union property.


Conjecture 5.15. If $\mathcal{F} \subset 2^X$ is $t$-intersecting and has the
$t'$-union property, then $|\mathcal{F}| \le |\mathcal{C}(Y)|$ for an appropriate
$Y \subset X$.


This conjecture can be found in Frankl's dissertation of 1976 and first appeared
in English in Bang et al. (1981).
Let us close this section with the following important conjecture of Chvátal.

Conjecture 5.16. If $\mathcal{C}$ is hereditary, $\mathcal{F} \subset \mathcal{C}$,
and $\mathcal{F}$ is intersecting, then $|\mathcal{F}| \le \Delta(\mathcal{C})$.

For some partial results and references on this conjecture see Miklós (1984).

6. Families with prescribed intersection sizes 
Let $L = \{l_0, \ldots, l_{s-1}\} \subset [0, k - 1]$ with
$l_0 < l_1 < \cdots < l_{s-1}$.

Definition 6.1. A family $\mathcal{F} \subset \binom{X}{k}$ is called an
$(n, k, L)$-system, or an $L$-system for short, if $|F \cap F'| \in L$ holds for
all distinct $F, F' \in \mathcal{F}$. For example, a $t$-intersecting family is
an $L$-system with $L = \{t, t + 1, \ldots, k - 1\}$.


Definition 6.2. Let $m(n, k, L)$ denote the maximum size of an
$(n, k, L)$-system.

The next fundamental theorem was first proved by Deza.

Theorem 6.3 (Deza et al. 1978).

$$m(n, k, L) \le \prod_{l \in L} \frac{n - l}{k - l} \qquad \text{for } n > n_0(k, L).$$

We remark that an $L$-system $\mathcal{F} \subset \binom{X}{k}$ with
$L = \{0, 1, \ldots, t - 1\}$ is called a partial $t$-design and clearly
Theorem 6.3 holds for all $n \ge k$ in this case. A celebrated result of Rödl
(1985) is the following.

Theorem 6.4.

$$m(n, k, \{0, 1, \ldots, t - 1\}) = (1 - o(1)) \binom{n}{t} \Big/ \binom{k}{t},$$

where $k > t > 0$ are fixed and $n \to \infty$.


Taking $L = \{t, t + 1, \ldots, k - 1\}$, one sees that for $n > n_0(k, L)$,
Theorem 6.3 extends Theorem 3.2.


Definition 6.5. We say that Theorem 6.3 is asymptotically exact (respectively,
gives the correct exponent) if

$$\limsup_{n \to \infty}\ m(n, k, L) \Big/ \prod_{l \in L} \frac{n - l}{k - l}$$

is equal to one (respectively, is positive). For example, Theorem 6.4 shows that
Theorem 6.3 is asymptotically exact for all $k > t > 0$ and
$L = \{0, 1, \ldots, t - 1\}$.


Definition 6.6. $L - a = \{l - a: l \in L\}$.

In view of the following result of Deza et al. (1978) we may suppose in what
follows that $0 \in L$.


Proposition 6.7. $m(n, k, L) = m(n - l_0, k - l_0, L - l_0)$ for
$n > n_0(k, L)$.


The next result gives some values of k and L for which Theorem 6.3 is 
asymptotically exact. - 


Theorem 6.8 (Frankl and Rödl 1985). Let $d \ge t > 0$ and let $q$ be a prime
power. Then Theorem 6.3 is asymptotically exact for


kee:  Geisag- 4 
and 

$$k = (q^d - 1)/(q - 1), \quad L = \{(q^i - 1)/(q - 1) : i = 0, 1, \ldots, t-1\}.$$

Definition 6.9. $a(k,L) = \sup\{a : \limsup_{n \to \infty} m(n,k,L)\, n^{-a} > 0\}$.


That is, $a(k,L) \le |L|$ with equality if and only if Theorem 6.3 gives the correct exponent. Clearly $a(k,L) \ge 1$ for all $\emptyset \ne L \subseteq [0, k-1]$.


Conjecture 6.10. There exist positive constants $c(k,L)$ and $c'(k,L)$ for all $k$, $L$ such that
$$c(k,L)\, n^{a(k,L)} < m(n,k,L) < c'(k,L)\, n^{a(k,L)}.$$

Theorem 6.11 (Frankl 1986b). For every rational number $a \ge 1$ there are infinitely many choices of $k$ and $L$ for which Conjecture 6.10 holds with $a(k,L) = a$.


One can use Lemma 4.15 and Proposition 4.16 to get upper bounds on a(k, L). 
Let ¥ be an (n,k, L)-system and apply Lemma 4.15 with 7=k +1 to get the 
intersection-closed family of = 11 (#*) c2"!. 


We call a set $B \subseteq [k]$ a base (for $\mathcal{A}$) if $B \not\subseteq A$ for all $A \in \mathcal{A}$ but no proper subset of $B$ has this property. Also, $b(\mathcal{A}) = \min\{|B| : B \text{ is a base}\}$.

For $D \subseteq [k]$ define $\langle D \rangle = \bigcap \{A : D \subseteq A \in (\mathcal{A} \cup \{[k]\})\}$. That is, $\langle D \rangle = [k]$ if and only if $D$ contains some base for $\mathcal{A}$.

Since $\mathcal{F}^*$ is an $L$-system, $|A| \in L$ for all $A \in \mathcal{A}$. By Proposition 4.16, there is at most one $l_0$-element set in $\mathcal{A}$ and one can prove easily that $b(\mathcal{A}) \le |L|$. In fact, more is true. For elementary properties of matroids, we refer the reader to chapter 9.

Theorem 6.12 (Frankl 1982). $b(\mathcal{A}) \le |L| - 1$ unless $\mathcal{A} \cup \{[k]\}$ forms the flats of a matroid of rank $|L|$. In this case $b(\mathcal{A}) = |L|$.


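Proof. We apply induction on $k$; the case $k = 1$ is trivial. Suppose that $b(\mathcal{A}) = |L|$. Define $\mathcal{A}_i = \{A \in \mathcal{A} : |A| = l_i\}$, $0 \le i < s = |L|$. We have to show that for every $A \in \mathcal{A}_i$ and $x \in [k] \setminus A$, there is a unique member of $\mathcal{A}_{i+1}$ containing both $x$ and $A$. Define $\bar{A} = \bigcap \{A' : (A \cup \{x\}) \subseteq A' \in \mathcal{A}\}$. Then $\bar{A} \in \mathcal{A}$. All we have to show is $\bar{A} \in \mathcal{A}_{i+1}$. It is easy to see that there exists a set $D$ with $\langle D \rangle = A$, $|D| \le i$. Also, if $\bar{A} \in \mathcal{A}_j$, then one can find a set $E$ with $|E| \le s - j$ and $\langle \bar{A} \cup E \rangle = [k]$. Thus $\langle D \cup \{x\} \cup E \rangle = [k]$, so $D \cup \{x\} \cup E$ contains a base, giving $i + 1 + s - j \ge s$, i.e., $j \le i + 1$. Since $|\bar{A}| > l_i$, $j = i + 1$ follows. □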


Definition 6.13. Define $b(k,L) = \max b(\mathcal{A})$, where the maximum is taken over all intersection-closed families $\mathcal{A} \subseteq 2^{[k]}$ with $|A| \in L$ for all $A \in \mathcal{A}$.

Conjecture 6.14 (Füredi 1983). $a(k,L) > b(k,L) - 1$ for all $k$ and $L$.

Since $a(k,L) \le b(k,L)$ by Proposition 4.17, this conjecture would mean that $\lceil a(k,L) \rceil = b(k,L)$ holds.

The smallest open cases are $L = \{0,1,3\}$, $k \equiv 1$ or $3 \pmod 6$, $k \ge 13$ [$b(k,L) = 3$ in this case, but $a(k,L) > 2$ is unknown for $k \ne 3^d$ or $2^d - 1$], and $L = \{0,1,2,3,5\}$, $k = 11$ [$b(k,L) = 5$ in this case]. Recently, all exponents for $k \le 10$ were determined by Frankl et al. (1995b).

In Deza et al. (1985), an infinite family of cases where Theorem 6.3 gives the correct exponent is exhibited, e.g., $L = \{0,1,2,q+1\}$, $k = q^2 + 1$, $q$ a prime power.

For $k$ and $L$ with $b(k,L) = 1$, Conjecture 6.14 is obvious, since then $a(k,L) = 1$ follows from $a(k,L) \le b(k,L)$. If $b(k,L) = 2$, then $a(k,L) > 1$ follows using constructions due to Frankl (see Füredi 1983).

A general upper bound, extending earlier results of Ray-Chaudhuri and Wilson (1975) and Babai and Frankl (1980), is the following.

Theorem 6.15 (Frankl and Wilson 1981). Suppose that $p$ is a prime such that $k \not\equiv l \pmod p$ holds for all $l \in L$. Let $r$ be the number of residue classes of $L$ modulo $p$. Then
$$m(n,k,L) \le \binom{n}{r}.$$


7. One missing intersection

An important special case of the problem treated in the preceding section is when $L = [0, k-1] \setminus \{l\}$ for some $l \in [0, k-1]$.

Set $m(n,k,\bar{l}) = m(n,k,[0,k-1] \setminus \{l\})$.

There are two natural constructions for excluding the intersection size $l$. One is by taking all $k$-subsets of $X$ containing a fixed $(l+1)$-element subset. This gives
$$m(n,k,\bar{l}) \ge \binom{n-l-1}{k-l-1}.$$

The other is by taking a partial $l$-design. By Rödl's Theorem 6.4 this gives a lower bound of $(1-o(1))\binom{n}{l}/\binom{k}{l}$. The next result of Frankl and Füredi (1985) shows that one of these constructions always gives the correct exponent.

Theorem 7.1. $m(n,k,\bar{l}) = \Theta(n^{\max\{l,\, k-l-1\}})$.


Proof. Consider « =|] (%*) from the preceding section. We have to show that 
$b(\mathcal{A}) \le \max\{l, k-l-1\}$. Let $B$ be a base for $\mathcal{A}$ and suppose that $|B| > l$. For $x \in B$ consider $A_x = \langle B \setminus \{x\} \rangle \in \mathcal{A}$. Note that $A_x \cap B = B \setminus \{x\}$. Define the family (of not necessarily distinct sets)
$$\mathcal{C} = \{A_x \setminus B : x \in B\} \subseteq 2^{[k] \setminus B}.$$

Claim 7.2. The size of the intersection of $r$ members of $\mathcal{C}$ is never $r - c$, $1 \le r \le |B| = |\mathcal{C}|$, where $c = |B| - l > 0$.

Proof. Since for distinct elements $x_1, \ldots, x_r \in B$ one has $|A_{x_1} \cap \cdots \cap A_{x_r} \cap B| = |B| - r$, $|A_{x_1} \cap \cdots \cap A_{x_r}| \ne l$ implies the claim. □

Proof of Theorem 7.1 (continued). Now a simple result of Frankl and Katona (cf. Frankl and Füredi 1985) says that any family $\mathcal{C}$ of not necessarily distinct subsets of a $b$-element set and satisfying the assertion of Claim 7.2 has $|\mathcal{C}| \le b + c - 1$. Since in our case $b = k - |B|$, $c = |B| - l$, we infer that $|B| = |\mathcal{C}| \le k - l - 1$. Since $B$ was an arbitrary base for $\mathcal{A}$, the result follows. □


For the case $k > 2l + 1$, more is true.

Theorem 7.3 (Frankl and Füredi 1985). $m(n,k,\bar{l}) = \binom{n-l-1}{k-l-1}$ holds for $k \ge 2l + 2$ and $n > n_0(k)$. Moreover, the only optimal family is $\mathcal{F} = \{F \in \binom{[n]}{k} : [l+1] \subseteq F\}$.

For $k \le 2l + 1$ one can improve on the lower bound given by partial $l$-designs.

Proposition 7.4. Let $\mathcal{P} \subseteq \binom{[n]}{2k-l-1}$ be a partial $l$-design. Then $|F \cap F'| \ne l$ for all $F, F' \in \sigma_k(\mathcal{P})$.

Proof. Take $P, P' \in \mathcal{P}$ with $F \subseteq P$, $F' \subseteq P'$. If $P \ne P'$, then $|F \cap F'| \le |P \cap P'| < l$. If $P = P'$, then $|F \cap F'| \ge |F| + |F'| - |P| = l + 1$. □

Using Theorem 6.4 again one obtains
$$m(n,k,\bar{l}) \ge (1 - o(1)) \binom{n}{l} \binom{2k-l-1}{k} \Big/ \binom{2k-l-1}{l}.$$

This inequality is partially complemented by the following result of Frankl (1983). Recall that an $S(n,a,l)$ is a partial $l$-design $\mathcal{S} \subseteq \binom{[n]}{a}$ with $|\mathcal{S}| = \binom{n}{l}/\binom{a}{l}$.

Theorem 7.5.
$$m(n,k,\bar{l}) \le \binom{n}{l} \binom{2k-l-1}{k} \Big/ \binom{2k-l-1}{l}$$
holds if $k \le 2l + 1$ and $k - l$ is a prime power. Moreover, if $k - l$ is a prime, then equality is achieved only for $\sigma_k(\mathcal{S})$ where $\mathcal{S}$ is an $S(n, 2k-l-1, l)$.

Conjecture 7.6. Theorem 7.5 holds even if $k - l$ is not a prime power.


Settling a long-standing open problem of Erdős (cf. Erdős 1981), the following result was proved in Frankl and Rödl (1986).

Theorem 7.7. Let $0 < \alpha < \frac{1}{4}$ and let $l$ be an integer, $\alpha n < l < (\frac{1}{2} - \alpha) n$. Then there exists $\varepsilon = \varepsilon(\alpha) > 0$ such that every family $\mathcal{F} \subseteq 2^{[n]}$ with $|\mathcal{F}| > (2 - \varepsilon)^n$ contains two sets whose intersection has size exactly $l$.

For $l$ fixed and $n$ sufficiently large the problem was solved exactly by Frankl and Füredi (1984a). To avoid intersections of size $l$ one can take $H(n, l+1)$, which is $(l+1)$-intersecting, from Katona's Theorem 5.1 and adjoin all subsets of size less than $l$.

Theorem 7.8. If $\mathcal{F} \subseteq 2^{[n]}$ satisfies $|F \cap F'| \ne l$ for all distinct $F, F' \in \mathcal{F}$, then
$$|\mathcal{F}| \le |H(n, l+1)| + \sum_{i < l} \binom{n}{i}$$
for $n > n_0(l)$.


An important tool in the proof is the following result extending Theorem 3.8 on 
the shadow of t-intersecting families. Recalling the definition of M, we have: 


Theorem 7.9. Suppose that the columns of M(j, ¥) are linearly independent over 
R, where ¥ C(X). Then |o(¥)|/|F| = (A P/)/C E) for all js <k. 


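The following problem was raised by Larman and Rogers (1972). Determine
$$s(n) = \max\{|\mathcal{F}| : \mathcal{F} \subseteq 2^{[n]},\ |F \,\triangle\, F'| \ne n/2 \text{ for all } F, F' \in \mathcal{F}\}.$$

It is easy to see that $s(n) = 2^n$ if $n$ is odd and that $s(n) = 2^{n-1}$ if $n \equiv 2 \pmod 4$. Let $n = 4l$ and consider the following family:
$$\mathcal{R}(l) = \{R,\ [n] \setminus R : R \subseteq [n],\ |R \cap [n-1]| \le l - 1\}.$$

Then $|\mathcal{R}(l)| = 4 \sum_{i < l} \binom{4l-1}{i}$ and $|R \,\triangle\, R'| \ne 2l$ for all $R, R' \in \mathcal{R}(l)$.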
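The two properties claimed for $\mathcal{R}(l)$ can be confirmed by brute force for small $l$. The sketch below is only an illustration; the function name `larman_rogers_family` is an assumption, and the enumeration over all $2^{4l}$ subsets is feasible only for very small $l$.

```python
from itertools import combinations
from math import comb


def larman_rogers_family(l):
    """R(l): all R in [4l] with |R ∩ [4l-1]| <= l-1, plus their complements."""
    n = 4 * l
    ground = frozenset(range(1, n + 1))
    family = set()
    for size in range(n + 1):
        for R in combinations(sorted(ground), size):
            R = frozenset(R)
            if len(R - {n}) <= l - 1:      # R - {n} is R ∩ [n-1]
                family.add(R)
                family.add(ground - R)
    return family


for l in (1, 2):
    fam = larman_rogers_family(l)
    expected = 4 * sum(comb(4 * l - 1, i) for i in range(l))
    ok = all(len(R ^ S) != 2 * l for R, S in combinations(fam, 2))
    print(l, len(fam), expected, ok)
```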
Theorem 7.10 (Frankl 1986a). $s(4l) = 4 \sum_{i < l} \binom{4l-1}{i}$ if $l$ is the power of an odd prime.

Conjecture 7.11. Theorem 7.10 holds for all positive integers $l$.


8. s-wise t-intersecting families 


Let $q(n,s,t)$ denote the maximum size of an $s$-wise $t$-intersecting family $\mathcal{F} \subseteq 2^{[n]}$. For a more complete treatment we refer the reader to Frankl (1987b).

Proposition 8.1. $q(n,s,t)/2^n$ is monotone increasing and therefore $q(s,t) = \lim_{n \to \infty} q(n,s,t)/2^n$ exists.

Proof. If $\mathcal{F} \subseteq 2^{[n]}$ is $s$-wise $t$-intersecting, then so is $\mathcal{F}' = \mathcal{F} \cup \{F \cup \{n+1\} : F \in \mathcal{F}\}$, showing $q(n+1,s,t) \ge 2q(n,s,t)$, as desired. The second part of the proposition is a direct consequence of the first part. □

From the proposition we see that $q(s,t) \le \frac{1}{2}$ for all $s \ge 2$, $t \ge 1$. Since $\lim_{n \to \infty} |H(n,t)|/2^n = \frac{1}{2}$, $q(2,t) = \frac{1}{2}$ for all $t \ge 1$.


In view of Lemma 4.2 (iii), from now on $\mathcal{F} \subseteq 2^{[n]}$ will be a stable, $s$-wise $t$-intersecting family of maximum size. (Consequently, $\mathcal{F}$ is a filter.) Define the sets
$$A_i = [n] \setminus \{t + i + ps : 0 \le p \le (n - t - i)/s\}$$
for $0 \le i < s$ and note that
$$A_0 \cap \cdots \cap A_{s-1} = [t-1]. \qquad (8.2)$$


Lemma 8.3. (i) $A_0 \notin \mathcal{F}$;
(ii) for every $F \in \mathcal{F}$ there exists a $p \ge 0$ with $|F \cap [t + sp]| \ge t + (s-1)p$.

Proof. Since $\mathcal{F}$ is a stable filter, $A_0 \in \mathcal{F}$ would imply by repeated applications of Proposition 4.4 that $A_i \in \mathcal{F}$, $1 \le i \le s - 1$. However, by (8.2) this is impossible, which proves (i). To prove (ii), suppose that $F = [n] \setminus \{a_0, \ldots, a_r\}$ is in $\mathcal{F}$, $1 \le a_0 < \cdots < a_r$. If $a_p \le t + ps$ for $0 \le p \le (n-t)/s$ [in particular, $r \ge \lfloor (n-t)/s \rfloor$], then $A_0 \in \mathcal{F}$ follows from Proposition 4.4, contradicting (i). Thus for some $p$ we have $a_p > t + ps$, i.e., $|F \cap [t + ps]| \ge t + p(s-1)$, as desired. □


Let us consider the polynomial $x^s - 2x + 1$, for $s \ge 3$. It has exactly one root, say $\beta(s)$, in the open interval $(\frac{1}{2}, 1)$. For example, $\beta(3) = (\sqrt{5} - 1)/2$.
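The root $\beta(s)$ is easy to approximate numerically. The following is a minimal bisection sketch (the function name `beta` and the tolerance are illustrative choices); it relies only on the facts that the polynomial is positive at $\frac{1}{2}$ and negative just below $1$ when $s \ge 3$.

```python
def beta(s, tol=1e-13):
    """Approximate the root of x**s - 2x + 1 in the open interval (1/2, 1), s >= 3."""
    if s < 3:
        raise ValueError("the root in (1/2, 1) is only considered for s >= 3")

    def f(x):
        return x ** s - 2 * x + 1

    lo, hi = 0.5, 1.0 - 1e-6       # f(lo) > 0 and f(hi) < 0 for s >= 3
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(beta(3), (5 ** 0.5 - 1) / 2)   # the two values should agree closely
```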


Theorem 8.4 (Frankl 1976). $q(n,s,t) < 2^n \beta(s)^t$.


Proof (sketch). Consider the probability space of all infinite $(0,1)$-sequences with the uniform distribution. Standard computation shows that the probability of the event {there exists $p \ge 0$ such that the number of 1's up to $t + ps$ is $\ge t + p(s-1)$} is $\beta(s)^t$. By Lemma 8.3 this is a (strict) upper bound on $|\mathcal{F}|/2^n$ [we associate with $F \in \mathcal{F}$ all the $(0,1)$-sequences extending its characteristic vector]. □


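Define the families:
$$\mathcal{B}_p = \mathcal{B}_p(n,s,t) = \{B \subseteq [n] : |B \cap [t + sp]| \ge t + (s-1)p\}, \quad p = 0, 1, \ldots.$$

Then $\mathcal{B}_p$ is $s$-wise $t$-intersecting and $|\mathcal{B}_p|/2^n$ is independent of $n$. The following result combines Theorem 8.4 and some computation involving $|\mathcal{B}_p|/2^n$.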

Corollary 8.5. There exists a positive constant $c$ such that $c\beta(s)^t/t < q(s,t) < \beta(s)^t$.

Conjecture 8.6. $q(n,s,t) = \max\{|\mathcal{B}_p| : 0 \le p \le (n-t)/s\}$.

Let us mention that Conjecture 8.6 holds for $s = 2$ (Katona's Theorem) and in general for $t \le s \cdot 2^s/150$ (Frankl 1979). It also holds for $s = t \ge 3$ with $q(n,s,t) = 2^{n-s}$. Next, we show how to use this last result to give a simple proof of an important theorem of Brace and Daykin (1971).


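Theorem 8.7. Let $\mathcal{F} \subseteq 2^{[n]}$ be $s$-wise intersecting with $\bigcap \mathcal{F} = \emptyset$. Then
$$|\mathcal{F}| \le |\mathcal{B}_1(n,s,1)| = (s+2)\,2^{n-s-1}.$$

Proof. We may suppose that $\mathcal{F}$ is a filter and thus, since $\bigcap \mathcal{F} = \emptyset$, it contains $[n] \setminus \{i\}$ for all $1 \le i \le n$. This will not change by shifting. Therefore, we may assume that $\mathcal{F}$ is stable.

We apply induction on $s$. For $s = 2$, one has $|\mathcal{B}_1(n,2,1)| = 2^{n-1}$; thus the statement follows from Theorem 1.1. Let $s \ge 3$ and suppose that Theorem 8.7 has been proved for smaller values of $s$. Consider $\mathcal{F}(1)$ and $\mathcal{F}(\bar{1})$.

Claim 8.8. (i) $|\mathcal{F}(1)| \le (s+1)\,2^{n-s-1}$.
(ii) $|\mathcal{F}(\bar{1})| \le 2^{n-s-1}$.

Now the theorem follows from $|\mathcal{F}| = |\mathcal{F}(1)| + |\mathcal{F}(\bar{1})|$ once we prove the claim.

Proof of Claim 8.8. Note that $\mathcal{F}(1)$ is $(s-1)$-wise intersecting on $[2,n]$, since otherwise $F_1 \cap \cdots \cap F_{s-1} = \{1\}$ for some $F_1, \ldots, F_{s-1} \in \mathcal{F}$, implying $1 \in \bigcap \mathcal{F}$. Also, $([n] \setminus \{i\}) \in \mathcal{F}$ implies $([2,n] \setminus \{i\}) \in \mathcal{F}(1)$ for $2 \le i \le n$. Thus $\bigcap \mathcal{F}(1) = \emptyset$. Hence, (i) follows from the induction assumption. To prove (ii), we only have to show that $\mathcal{F}(\bar{1})$ is $s$-wise $s$-intersecting (on $[2,n]$). Otherwise, since $\mathcal{F}(\bar{1})$ is a stable filter, we can find $F_1, \ldots, F_s \in \mathcal{F}(\bar{1}) \subseteq \mathcal{F}$ with $F_1 \cap \cdots \cap F_s = [2,s]$. Define $G_i = (F_i \setminus \{i\}) \cup \{1\}$ for $i = 2, \ldots, s$. Then $G_i \in \mathcal{F}$ by Proposition 4.4. However, $F_1 \cap G_2 \cap \cdots \cap G_s = \emptyset$, which is a contradiction. □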
9. The covering number

Recall the definition of $\tau(\mathcal{F})$.

Theorem 9.1 (Gyárfás 1977). A $k$-graph $\mathcal{F}$ has at most $k^{\tau(\mathcal{F})}$ covers $T$ of size $\tau(\mathcal{F})$.

Proof. Set $t = \tau(\mathcal{F})$. We prove by backward induction on $l \le t$ that every $l$-element set is contained in at most $k^{t-l}$ covers of $\mathcal{F}$ of size $t$. The case $l = t$ is trivial and the case $l = 0$ will prove the theorem.

Let $0 \le l < t$ and consider an $l$-element set $A$. Since $l < t = \tau(\mathcal{F})$, there exists an $F \in \mathcal{F}$ with $A \cap F = \emptyset$. Every cover of $\mathcal{F}$ containing $A$ must contain at least one of the $(l+1)$-element sets $A \cup \{x\}$, $x \in F$. Each of these sets is (by induction) in at most $k^{t-l-1}$ covers of $\mathcal{F}$ of size $t$. This gives altogether $k \cdot k^{t-l-1} = k^{t-l}$. □
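Theorem 9.1 can be checked by exhaustive search on small examples. The sketch below (the helper `minimum_covers` is an illustrative assumption) computes $\tau$ and all covers of that size, first for three pairwise disjoint 2-sets, where the bound $k^\tau$ is attained, and then for the lines of the Fano plane, where the minimum covers turn out to be the lines themselves.

```python
from itertools import combinations


def minimum_covers(edges):
    """Return (tau, list of all covers of size tau) for a nonempty list of edges."""
    vertices = sorted(set().union(*edges))
    for size in range(len(vertices) + 1):
        covers = [frozenset(T) for T in combinations(vertices, size)
                  if all(E & set(T) for E in edges)]
        if covers:
            return size, covers
    raise ValueError("an empty edge cannot be covered")


disjoint_pairs = [frozenset(p) for p in [(1, 2), (3, 4), (5, 6)]]
fano = [frozenset(s) for s in
        [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
         (2, 5, 7), (3, 4, 7), (3, 5, 6)]]

for family, k in ((disjoint_pairs, 2), (fano, 3)):
    tau, covers = minimum_covers(family)
    print(tau, len(covers), k ** tau)
```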


For a generalization see Tuza (1988).

Considering $\tau$ pairwise disjoint sets of size $k$ shows that Theorem 9.1 is best possible. An important corollary of the theorem is the following.

Theorem 9.2 (Erdős and Lovász 1975). Let $\mathcal{F}$ be an intersecting $k$-graph with $\tau(\mathcal{F}) = k$. Then $|\mathcal{F}| \le k^k$.

Proof. Every $F \in \mathcal{F}$ is a cover of size $k$. □


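Construction (Erdős and Lovász 1975). Let $X_1, \ldots, X_k$ be disjoint sets of size $1, \ldots, k$, respectively. Define
$$\mathcal{G}_i = \{E : |E| = k,\ X_i \subseteq E,\ X_j \cap E \ne \emptyset,\ i < j \le k\}.$$

Set $\mathcal{G} = \mathcal{G}_1 \cup \cdots \cup \mathcal{G}_k$.

Now $\mathcal{G}$ is intersecting with $\tau(\mathcal{G}) = k$ and $|\mathcal{G}| = \sum_{i=1}^{k} k!/i! = \lfloor k!(e-1) \rfloor$. Lovász conjectured that no intersecting $k$-graph with covering number $k$ has more edges, but this is disproved in Frankl et al. (1995b).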
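The construction is concrete enough to generate by machine. The following sketch (the function name and the choice of $0, 1, 2, \ldots$ as vertex labels are assumptions made for illustration) builds $\mathcal{G}$, checks that it is an intersecting $k$-graph for $k = 3$, and compares the edge count $\sum_{i=1}^{k} k!/i!$ with $\lfloor k!(e-1) \rfloor$ for small $k$.

```python
from itertools import combinations, product
from math import e, factorial, floor


def erdos_lovasz_family(k):
    """Edges X_i ∪ {one vertex from each X_j, j > i}, where |X_i| = i."""
    blocks, nxt = [], 0
    for i in range(1, k + 1):            # blocks[i-1] plays the role of X_i
        blocks.append(list(range(nxt, nxt + i)))
        nxt += i
    family = set()
    for i in range(k):
        for choice in product(*blocks[i + 1:]):
            family.add(frozenset(blocks[i]) | frozenset(choice))
    return family


G = erdos_lovasz_family(3)
print(len(G), all(len(E) == 3 for E in G),
      all(E & F for E, F in combinations(G, 2)))

for k in range(1, 9):
    count = sum(factorial(k) // factorial(i) for i in range(1, k + 1))
    print(k, count, floor(factorial(k) * (e - 1)))
```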

How few edges can such a k-graph have? 

Let $g(k)$ denote the minimum size of an intersecting $k$-graph $\mathcal{F}$ with $\tau(\mathcal{F}) = k$. Erdős and Lovász (1975) show that $g(k) \ge 8k/3 - 3$ and they conjecture that $\lim_{k \to \infty} g(k)/k = \infty$. However, using an ingenious construction, Kahn (1992) proved that $g(k) = O(k)$ holds.

Let $\mathcal{P}$ be the set of lines of a projective plane of order $k - 1$. Then $\mathcal{P}$ has the following strong property.

Claim 9.3. If $S$ is a cover of $\mathcal{P}$ with $|S| = k$, then $S \in \mathcal{P}$.

Proof. Suppose that $S$ is not a line and let $L \in \mathcal{P}$ be a line with $|L \cap S| \ge 2$. Choose $x \in L \setminus S$. Then there are $k - 1$ lines besides $L$ through $x$, and each of them has to intersect $S$. Thus $|S| \ge 2 + k - 1 > k$. □


Such an intersecting family is called a maximal intersecting family, i.e., the addition of any new $k$-set destroys the property of being intersecting.

Let $f(k)$ denote the minimum size of a maximal intersecting $k$-graph. Meyer (1974) conjectured that $f(k) \ge k^2 - k + 1$ with equality if a projective plane of order $k - 1$ exists. This was disproved in Füredi (1980) by the following construction.


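Example 9.4. Let $\mathcal{A}$ be the family of lines of an affine plane of order $k$ and let $\mathcal{A} = \mathcal{L}_1 \cup \cdots \cup \mathcal{L}_{k+1}$ be the partition of the lines into parallel classes. Consider three vertex-disjoint copies $\mathcal{A}^1$, $\mathcal{A}^2$, and $\mathcal{A}^3$ of $\mathcal{A}$ and let $L_1^i, \ldots, L_k^i$ be the lines in $\mathcal{L}_{k+1}^i$. Define:
$$\mathcal{F} = \{L_j^i \cup L : L \in \mathcal{L}_j^{i+1},\ i = 1, 2, 3,\ j = 1, \ldots, k\},$$
where the upper index $i + 1$ is taken modulo 3.

Then $|\mathcal{F}| = 3k^2$ and $\mathcal{F}$ is a maximal intersecting family, showing $f(2k) \le 3k^2$ if an affine plane of order $k$ exists.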


Theorem 9.5 (Boros et al. 1989). $f(q+1) \le q^2/2 + O(q)$ for $q \equiv -1 \pmod 6$, $q$ a prime power.

Theorem 9.6 (Blokhuis 1987). $f(k) \le k^5$ for all $k$.

Thus, Theorem 9.6 gives a polynomial upper bound for all $k$. However, it is not even known whether $\lim_{k \to \infty} f(k)/k = \infty$.


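10. $\tau$-critical $k$-graphs

Let us start with the following result of Bollobás (1965).

Theorem 10.1. Let $\{A_1, \ldots, A_m\}$ and $\{B_1, \ldots, B_m\}$ be two families of subsets of $[n]$ satisfying

(i) $A_i \cap B_i = \emptyset$, $1 \le i \le m$,

(ii) $A_i \cap B_j \ne \emptyset$, $1 \le i \ne j \le m$.

Then
$$\sum_{1 \le i \le m} \binom{|A_i| + |B_i|}{|A_i|}^{-1} \le 1.$$

Proof. Apply induction on $n$; the cases $n = 0, 1$ are trivial. For notational convenience we shall speak of the two families as a set-pair family $\{(A_i, B_i) : 1 \le i \le m\}$ satisfying (i) and (ii).

For each $x \in [n]$ consider the set-pair family $\mathcal{P}_x = \{(A_i, B_i \setminus \{x\})\}$, where $i$ runs over all $i$ with $x \notin A_i$. Then $\mathcal{P}_x$ satisfies (i) and (ii). Applying the induction hypothesis to $\mathcal{P}_x$ on $[n] \setminus \{x\}$ and adding up the corresponding inequalities one notes that $\binom{|A_i| + |B_i|}{|A_i|}^{-1}$ occurs $n - |A_i| - |B_i|$ times, and $\binom{|A_i| + |B_i| - 1}{|A_i|}^{-1}$ occurs $|B_i|$ times. Thus we have
$$\sum_{1 \le i \le m} \left[ (n - |A_i| - |B_i|) \binom{|A_i| + |B_i|}{|A_i|}^{-1} + |B_i| \binom{|A_i| + |B_i| - 1}{|A_i|}^{-1} \right] \le n.$$
Since $|B_i| \binom{|A_i| + |B_i| - 1}{|A_i|}^{-1} = (|A_i| + |B_i|) \binom{|A_i| + |B_i|}{|A_i|}^{-1}$, the left-hand side equals $n \sum_i \binom{|A_i| + |B_i|}{|A_i|}^{-1}$. Dividing by $n$, the theorem follows. □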
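A quick computational illustration of Theorem 10.1: for the system consisting of all $a$-subsets of an $(a+b)$-set paired with their complements, conditions (i) and (ii) hold and the sum equals exactly 1. The helper names below are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def satisfies_conditions(pairs):
    """Check (i) A_i ∩ B_i = ∅ and (ii) A_i ∩ B_j ≠ ∅ for i ≠ j."""
    for i, (A, B) in enumerate(pairs):
        if A & B:
            return False
        for j, (_, B_other) in enumerate(pairs):
            if i != j and not (A & B_other):
                return False
    return True


def bollobas_sum(pairs):
    """The sum of 1 / C(|A_i| + |B_i|, |A_i|) over all pairs."""
    return sum(Fraction(1, comb(len(A) + len(B), len(A))) for A, B in pairs)


def complete_system(a, b):
    """All a-subsets of an (a+b)-set, each paired with its complement."""
    ground = frozenset(range(a + b))
    return [(frozenset(A), ground - frozenset(A))
            for A in combinations(sorted(ground), a)]


pairs = complete_system(2, 3)
print(satisfies_conditions(pairs), bollobas_sum(pairs))
```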


Tuza (1984) notes that the inequality of Yamamoto (1954) is a consequence of Theorem 10.1.

Corollary 10.2. Let $\{A_1, \ldots, A_m\}$ be an antichain on $X$. Then
$$\sum_{1 \le i \le m} \binom{|X|}{|A_i|}^{-1} \le 1.$$

Proof (Tuza 1984). Set $B_i = X - A_i$ and note that the hypotheses of Theorem 10.1 are fulfilled. □


Recall the definition of $\tau$-critical families.

Corollary 10.3. If $\mathcal{A}$ is $\tau$-critical with $\tau(\mathcal{A}) = t$, then
$$\sum_{A \in \mathcal{A}} \binom{|A| + t - 1}{t - 1}^{-1} \le 1.$$

Proof. Let $\mathcal{A} = \{A_1, \ldots, A_m\}$ and let $B_i$ be a cover of size $t - 1$ for $\mathcal{A} \setminus \{A_i\}$. Now, apply Theorem 10.1. □

Note that Corollary 10.3 implies that $|\mathcal{A}| \le \binom{k + t - 1}{t - 1}$ for every $\tau$-critical $k$-graph with $\tau(\mathcal{A}) = t$. Considering $\binom{[k + t - 1]}{k}$ shows that this is best possible.

This result was re-proved and extended in several ways. We refer to the survey of Füredi (1988) for a full account. Here we mention only two related results.


Theorem 10.4 (Füredi 1984). Let $(A_1, \ldots, A_m)$ be a collection of $a$-sets and $(B_1, \ldots, B_m)$ a collection of $b$-sets such that $|A_i \cap B_i| \le t$ for all $i$ and $|A_i \cap B_j| > t$ for $1 \le i < j \le m$. Then $m \le \binom{a + b - 2t}{a - t}$.

Theorem 10.5 (Tuza 1985). Let $\{A_1, \ldots, A_m\}$ and $\{B_1, \ldots, B_m\}$ be collections of sets with $A_i \cap B_i = \emptyset$ for all $i$ and $(A_i \cap B_j) \cup (A_j \cap B_i) \ne \emptyset$ for $i \ne j$. Then $\sum_{1 \le i \le m} p^{|A_i|} q^{|B_i|} \le 1$ holds for all positive $p$ and $q$ with $p + q = 1$.


Proof. Let $[n]$ be the union of all the sets $A_i \cup B_i$. Consider all subsets of $[n]$ with a weight function $w(E) = p^{|E|} q^{n - |E|}$. Define $\mathcal{A}_i = \{E \subseteq [n] : A_i \subseteq E,\ B_i \cap E = \emptyset\}$ and note that $\mathcal{A}_1, \ldots, \mathcal{A}_m$ are pairwise disjoint. Also, note that
$$\sum_{E \in \mathcal{A}_i} w(E) = p^{|A_i|} q^{|B_i|}.$$

Now we can deduce the result:
$$\sum_{1 \le i \le m} p^{|A_i|} q^{|B_i|} = \sum_{1 \le i \le m} \sum_{E \in \mathcal{A}_i} w(E) \le \sum_{E \subseteq [n]} p^{|E|} q^{n - |E|} = 1. \qquad \Box$$

For a more general result see Tuza (1988).


11. Matchings

Let $s \ge 2$ be fixed. How large can a family $\mathcal{F} \subseteq 2^X$ be if $\nu(\mathcal{F}) < s$? For $s = 2$, this means that $\mathcal{F}$ is intersecting and the answer $2^{n-1}$ was given in Theorem 1.1.

Let $v(n,s)$ denote $\max |\mathcal{F}|$, where $\mathcal{F} \subseteq 2^{[n]}$, $\nu(\mathcal{F}) < s$. Clearly, $v(n+1, s) \ge 2v(n,s)$ holds for all $n$. Considering $\mathcal{K} = \{K \subseteq X : |K| > n/s\}$ shows that
$$v(n,s) \ge \sum_{i > n/s} \binom{n}{i}.$$

Kleitman (1968a) showed that this is best possible for $n \equiv -1 \pmod s$.

Theorem 11.1.
$$v(bs - 1, s) = \sum_{i \ge b} \binom{bs - 1}{i},$$
$$v(bs, s) = 2v(bs - 1, s).$$


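For $n \not\equiv 0, -1 \pmod s$, the value of $v(n,s)$ is unknown, except for $s = 3$, where Quinn (1987) showed that for $n = 3b + 1$ the best construction is
$$\mathcal{Q} = \mathcal{K} \cup \left\{Q \in \binom{[n]}{b} : 1 \in Q\right\} = \{Q \subseteq [n] : |Q| + |Q \cap [1]| \ge b + 1\}.$$

Conjecture 11.2. For $n = bs + r$, $1 \le r < s$,
$$v(n,s) = |\{K \subseteq [n] : |K| + |K \cap [s - r - 1]| \ge b + 1\}|.$$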


A problem with a similar flavor was solved by Kleitman (1968b) for s = 2 and, 
using the same technique, by Frankl (1977) for all s. 


Theorem 11.3. Let n = bs + 5 — 1 and suppose that ¥ C2" contains no s pairwise 
disjoint sets along with their union. Then 


\F¥|<|GC In}: b<|G| <bs}}. 
Again, the maximum value is unknown for $n \not\equiv -1 \pmod s$. Let $v(n,s,k)$ denote $\max |\mathcal{F}|$, where $\mathcal{F} \subseteq \binom{X}{k}$ and $\nu(\mathcal{F}) < s$. To avoid trivialities, suppose that $n \ge sk$.

Example. $\mathcal{E}_1 = \binom{[sk - 1]}{k}$, $\mathcal{E}_2 = \mathcal{E}_2(n) = \{E \in \binom{[n]}{k} : E \cap [s - 1] \ne \emptyset\}$.

Conjecture 11.4 (Erdős 1965). $v(n,s,k) = \max\{|\mathcal{E}_1|, |\mathcal{E}_2|\}$.
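The two candidate families in Conjecture 11.4 have simple closed-form sizes, $|\mathcal{E}_1| = \binom{sk-1}{k}$ and $|\mathcal{E}_2| = \binom{n}{k} - \binom{n-s+1}{k}$, so it is easy to see which one dominates for given parameters. The function names below are illustrative assumptions.

```python
from math import comb


def size_E1(s, k):
    """|E_1|: all k-subsets of a set of sk - 1 points."""
    return comb(s * k - 1, k)


def size_E2(n, s, k):
    """|E_2|: k-subsets of [n] meeting a fixed set of s - 1 points."""
    return comb(n, k) - comb(n - s + 1, k)


def conjectured_value(n, s, k):
    """The value max(|E_1|, |E_2|) proposed in Conjecture 11.4 (needs n >= sk)."""
    if n < s * k:
        raise ValueError("the conjecture is stated for n >= sk")
    return max(size_E1(s, k), size_E2(n, s, k))


for n in (6, 7, 12, 30):
    print(n, size_E1(3, 2), size_E2(n, 3, 2), conjectured_value(n, 3, 2))
```

For $s = 2$ the second family is the star of the Erdős-Ko-Rado theorem, with $\binom{n-1}{k-1}$ members.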


Erdős (1965) proved that for $n > n_0(s,k)$ the conjecture is true and $\mathcal{E}_2$ is the only extremal example. Bollobás et al. (1976) show that $n_0(s,k) \le 2sk^3$ holds. The next proposition is essentially due to Kleitman (1968a).

Proposition 11.5. $v(ks, s, k) = \binom{ks - 1}{k}$ and, for $s \ge 3$, the only optimal family is $\mathcal{E}_1$.


Proof. Take $\mathcal{F} \subseteq \binom{[ks]}{k}$ with $\nu(\mathcal{F}) \le s - 1$. Consider a random partition $P = (P_1, \ldots, P_s)$ of $X$. That is, $P_1 \cup \cdots \cup P_s = X$, $|P_i| = k$ and all $P$ have the same chance of being chosen. Then the probability of the event $P_i \in \mathcal{F}$ is $|\mathcal{F}|/\binom{ks}{k}$. Thus the expected number of $i$ with $P_i \in \mathcal{F}$ is $s|\mathcal{F}|/\binom{ks}{k}$. On the other hand, $\nu(\mathcal{F}) < s$ implies that this number is always less than $s$. Thus $s|\mathcal{F}|/\binom{ks}{k} \le s - 1$. (One can come to the same conclusion by the double-counting argument of Katona (1974).)

Rearranging gives $|\mathcal{F}| \le \binom{ks - 1}{k}$, with equality holding if and only if out of each partition $P$, exactly $s - 1$ sets are in $\mathcal{F}$. That is, $\binom{X}{k} \setminus \mathcal{F}$ is an intersecting family of size $\binom{ks}{k} - \binom{ks - 1}{k} = \binom{ks - 1}{k - 1}$. Now the uniqueness of $\mathcal{F}$ for $s \ge 3$ follows from the uniqueness part of the Erdős-Ko-Rado Theorem (see Theorem 5.3). □


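Proposition 11.6. $v(n,s,k) \le (s - 1)\binom{n - 1}{k - 1}$ for all $n \ge sk$.

Proof. Use induction on $n$. The case $n = sk$ is covered by Proposition 11.5. Let $\mathcal{F} \subseteq \binom{[n]}{k}$ be a family with $|\mathcal{F}| = v(n,s,k)$, $\nu(\mathcal{F}) < s$. In view of Lemma 4.2 (iv) we may assume that $\mathcal{F}$ is stable. Consider the two families $\mathcal{F}(\bar{n})$, $\mathcal{F}(n)$.

Claim 11.7. $|\mathcal{F}(\bar{n})| \le (s - 1)\binom{n - 2}{k - 1}$, $|\mathcal{F}(n)| \le (s - 1)\binom{n - 2}{k - 2}$.

Since $|\mathcal{F}| = |\mathcal{F}(\bar{n})| + |\mathcal{F}(n)|$, this implies the proposition.

Proof of Claim 11.7. The first inequality is true by induction. To prove the second we have to show $\nu(\mathcal{F}(n)) < s$. Suppose the contrary and let $G_1, \ldots, G_s$ be pairwise disjoint sets in $\mathcal{F}(n)$. Since $|G_1| + \cdots + |G_s| = (k - 1)s$, we can find distinct elements $x_1, \ldots, x_s \in [n] \setminus (G_1 \cup \cdots \cup G_s)$. Since $\mathcal{F}$ is stable, $G_i \cup \{x_i\}$ is in $\mathcal{F}$. That is, $\nu(\mathcal{F}) \ge s$, which is a contradiction. □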

Formulating Proposition 11.5 for the complements $\mathcal{G} = \mathcal{F}^c$, we obtain that an $s$-wise intersecting family $\mathcal{G} \subseteq \binom{[ks]}{k(s-1)}$ can have at most $\binom{ks - 1}{k(s-1) - 1}$ members.

This was generalized by Frankl (1976).




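Theorem 11.8. If $\mathcal{G} \subseteq \binom{[n]}{l}$ is $s$-wise intersecting, $n \ge sl/(s - 1)$, then $|\mathcal{G}| \le \binom{n - 1}{l - 1}$. Moreover, unless $s = 2$, $n = 2l$, equality is achieved only if $\mathcal{G} = \{G \in \binom{[n]}{l} : 1 \in G\}$.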


For a new proof see Frankl (1987b). 


12. The number of vertices in $\tau$- and $\nu$-critical $k$-graphs

Following Tuza (1985), let us call $\mathcal{P} = \{(A_i, B_i) : 1 \le i \le m\}$ an $(a,b)$-system if $|A_i| = a$, $|B_i| = b$ for all $i$, and moreover, Theorem 10.1 (i) and (ii) hold.

Let $n(a,b)$ be $\max |\bigcup_{i=1}^{m} (A_i \cup B_i)|$, where the maximum is over all $(a,b)$-systems. Let $n_1(a,b)$ be $\max |\bigcup_{i=1}^{m} A_i|$, where the maximum is over all $(a,b)$-systems.

As we saw in the proof of Corollary 10.3, to every $\tau$-critical $k$-graph $\mathcal{A}$ with $\tau(\mathcal{A}) = t$ one can associate a $(k, t-1)$-system. This implies:
$$|\textstyle\bigcup \mathcal{A}| \le n_1(k, t-1) \quad \text{if } \mathcal{A} \text{ is a } \tau\text{-critical } k\text{-graph with } \tau(\mathcal{A}) = t.$$

Obviously $n_1(a,b) \le n(a,b) = n(b,a)$. The following surprising symmetry relation holds.

Theorem 12.1 (Tuza 1985). $n_1(a, b-1) = n_1(b, a-1)$ for all $a, b \ge 1$.

Proposition 12.2 (Tuza 1985).
$$n_1(a' + a'', b' + b'') \ge a' + b' + \binom{a' + b'}{a'}\, n_1(a'', b'').$$

Proof. Let $\mathcal{P}$ (respectively, $\mathcal{Q}$) be an $(a', b')$-system ($(a'', b'')$-system). For each $(A_i, B_i) \in \mathcal{P}$, let $\mathcal{Q}_i$ be a copy of $\mathcal{Q}$, where $\mathcal{P}, \mathcal{Q}_1, \ldots, \mathcal{Q}_m$ are all vertex-disjoint. The general element of $\mathcal{Q}_i$ is denoted by $(C_j^{(i)}, D_j^{(i)})$. Define:
$$\mathcal{R} = \{(A_i \cup C_j^{(i)},\ B_i \cup D_j^{(i)}) : (A_i, B_i) \in \mathcal{P},\ (C_j^{(i)}, D_j^{(i)}) \in \mathcal{Q}_i\}.$$
Then $\mathcal{R}$ is an $(a' + a'', b' + b'')$-system, which proves the proposition. □


Tuza (1985) proves the following surprisingly sharp bounds. 


Theorem 12.3. 


: 1fat+b+1 atb+t+1 
(i) =( b+ ) <n(a, b) <( ei ) fora=b,a=1; 
7 l/fatb+l at+b+t 
(ii) rl b+] )<ny(a,b)<( igs fora>1,b20. 


Let us mention that Tuza proves both the upper and lower bounds in a stronger form. In particular, applying Proposition 12.2 with $a' = \lfloor ab/(b+1) \rfloor$, $b' = b$, he obtains
$$n_1(a,b) \ge \lceil a/(b+1) \rceil \binom{\lfloor ab/(b+1) \rfloor + b}{b} + \lfloor ab/(b+1) \rfloor + b$$


and he suggests that equality holds here for a > b + 2. He also conjectures that 
n,(a, b) = n(a, b) holds if and only if a= b. 

Recall the definition of a $\nu$-critical family $\mathcal{F}$. A family $\mathcal{F}$ is said to have rank $k$ if $k = \max_{F \in \mathcal{F}} |F|$ holds.

Improving earlier bounds of Lovász (1975), Tuza (1985) shows:


Theorem 12.4. If ¥ is a v-critical family of rank k, then it has fewer than 
(PORK vertices. 


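Proof. Set $\nu = \nu(\mathcal{F})$ and let $\mathcal{K}$ consist of those sets which are the union of $\nu$ pairwise disjoint edges in $\mathcal{F}$. Let $\mathcal{K}' = \{H_1, \ldots, H_m\} \subseteq \mathcal{K}$ be minimal with respect to $\bigcup \mathcal{K}' = \bigcup \mathcal{K}$. Then for every $H_i \in \mathcal{K}'$ there is a vertex $x_i \in H_i$ such that $x_i \notin H_j$ for $i \ne j$. By $\nu$-criticality there is some $F_i \in \mathcal{F}$ with $F_i \cap H_i = \{x_i\}$ and consequently $(F_i \setminus \{x_i\}) \cap H_j \ne \emptyset$ for all $i \ne j$. Now $\{(H_i, F_i \setminus \{x_i\}) : 1 \le i \le m\}$ is a system satisfying Theorem 10.1 (i) and (ii), and also $|H_i| \le \nu k$, $|F_i \setminus \{x_i\}| \le k - 1$. Thus,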


k+k 
JF =I <n ok, k-<("). 


In the case v = 1 we have the following sharper results. 


Theorem 12.5 (Tuza 1985). Let v(k) denote the maximum order of a v-critical 
intersecting family ¥ with rank k. Then 


Or « aC a) Gey 
2k-4+2\ 4) =mky=\,_, J) tl, _9)}- 


Both bounds improve earlier results of Erdős and Lovász (1975). Tuza conjectures that the lower bound, given by the following construction, is optimal for $k > 4$.

Example. For each partition $[2k - 4] = F \cup F'$ with $|F| = |F'| = k - 2$, take four new vertices $x, x', y, y'$ and form the $k$-element sets $F \cup \{x, y\}$, $F \cup \{x', y'\}$, $F' \cup \{x, y'\}$ and $F' \cup \{x', y\}$. These sets form a $\nu$-critical $k$-graph.

For $k$ fixed and $\nu$ large we have the following.

Conjecture 12.6 (Lovász 1975). There exists a constant $c = c(k)$ such that every $\nu$-critical family $\mathcal{F}$ of rank $k$ has at most $c\,\nu(\mathcal{F})$ vertices. For $k = 2$, the best possible bound $3\nu(\mathcal{F})$ was shown by Gallai (1963).
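13. Excluded configurations I

Let $\mathcal{C} = \{\mathcal{A}_1, \ldots, \mathcal{A}_r\}$ be a collection of $k$-graphs.

Set $\mathrm{ex}(n, \mathcal{C}) = \max |\mathcal{F}|$, where the maximum is taken over all $\mathcal{F} \subseteq \binom{X}{k}$, $\mathcal{F}$ containing no subfamily isomorphic to a family in $\mathcal{C}$. If $\mathcal{C} = \{\mathcal{A}\}$ then we also write $\mathrm{ex}(n, \mathcal{A})$ instead of $\mathrm{ex}(n, \mathcal{C})$. A classical result of Katona et al. (1964) is the following.

Theorem 13.1. $\mathrm{ex}(n, \mathcal{C})/\binom{n}{k}$ is monotone decreasing, and therefore $\pi(\mathcal{C}) = \lim_{n \to \infty} \mathrm{ex}(n, \mathcal{C})/\binom{n}{k}$ exists.

Proof. Let $k \le h < n$ and consider a family $\mathcal{F} \subseteq \binom{X}{k}$ without any subfamily isomorphic to some $\mathcal{A} \in \mathcal{C}$ and such that $|\mathcal{F}| = \mathrm{ex}(n, \mathcal{C})$. Choose a subset $H \in \binom{X}{h}$ at random, with uniform distribution. Then $|\binom{H}{k} \cap \mathcal{F}| \le \mathrm{ex}(h, \mathcal{C})$ holds for all $H$. On the other hand, the expectation of $|\binom{H}{k} \cap \mathcal{F}|$ is $|\mathcal{F}|$ times the probability that a fixed $F \in \binom{X}{k}$ is in $\binom{H}{k}$, i.e., $|\mathcal{F}|\binom{h}{k}/\binom{n}{k}$. Thus $\mathrm{ex}(n, \mathcal{C})\binom{h}{k}/\binom{n}{k} \le \mathrm{ex}(h, \mathcal{C})$.

Dividing by $\binom{h}{k}$ shows the desired result. □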
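The monotonicity in Theorem 13.1 can be observed in the simplest classical case $k = 2$ with the triangle as the excluded configuration. The sketch below assumes Mantel's theorem in the form $\mathrm{ex}(n, K_3) = \lfloor n^2/4 \rfloor$, a standard fact that is not stated in this form in the text, and checks that the densities are non-increasing and approach $\frac{1}{2}$.

```python
from fractions import Fraction
from math import comb


def mantel_ex(n):
    """ex(n, triangle) = floor(n^2 / 4), assumed from Mantel's theorem."""
    return n * n // 4


def density(n):
    """ex(n, triangle) divided by C(n, 2), as an exact fraction."""
    return Fraction(mantel_ex(n), comb(n, 2))


densities = [density(n) for n in range(2, 61)]
non_increasing = all(a >= b for a, b in zip(densities, densities[1:]))
print(non_increasing, float(densities[0]), float(densities[-1]))
```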


It follows from a result of Erdős (1964) that $\pi(\mathcal{C}) = 0$ if $\mathcal{C}$ contains some $k$-partite $k$-graph. Actually, Erdős obtains an upper bound of the form $n^{k - \varepsilon(\mathcal{C})}$ where $\varepsilon(\mathcal{C})$ is a positive constant. The determination of the best possible value of $\varepsilon(\mathcal{C})$ seems to be very difficult even in very simple cases. In this section we suppose that there are no $k$-partite $k$-graphs in $\mathcal{C}$. Let us first state Turán's well-known problem.

Example. Let $[n] = X_0 \cup X_1 \cup X_2$ be a partition with $|X_i| = \lfloor (n + i)/3 \rfloor$. Define:
$$\mathcal{T}(4,3) = \left\{T \in \binom{[n]}{3} : |T \cap X_i| = 1,\ i = 0, 1, 2\right\} \cup \left\{T \in \binom{[n]}{3} : |T \cap X_i| = 2,\ |T \cap X_{i+1}| = 1 \text{ for some } i = 0, 1, 2\right\},$$
where $X_3$ denotes $X_0$.

It is conjectured by Turán that $t(n,4,3) = |\mathcal{T}(4,3)|$. Kostochka (1982) has given exponentially many non-isomorphic 3-graphs with $|\mathcal{T}(4,3)|$ edges and without a $\binom{[4]}{3}$. This suggests that if Turán's conjecture is correct, then it could be very hard to prove. Kalai (1985) has proposed a more general algebraic conjecture.


Example. Let $[n] = X_0 \cup X_1 \cup \cdots \cup X_{t-1}$ be a partition with $|X_i| = \lfloor (n + i)/t \rfloor$. Define:
$$\mathcal{T}(n, t(k-1) + 1, k) = \binom{[n]}{k} - \bigcup_{0 \le i < t} \binom{X_i}{k}.$$

Clearly, $\mathcal{T}(n, t(k-1) + 1, k)$ contains no $\binom{[t(k-1)+1]}{k}$. It is conjectured that for $n > n_0(t,k)$ one has $|\mathcal{T}(n, t(k-1) + 1, k)| = t(n, t(k-1) + 1, k)$, although Brown (1983) has produced other examples with the same cardinality.


The simplest non-3-partite 3-graph is $\mathcal{R}_4 = \binom{[4]}{3} - \{[2,4]\}$. Even for this 3-graph, $\mathrm{ex}(n, \mathcal{R}_4)$ is unknown.

Proposition 13.2. $\frac{2}{7} \le \pi(\mathcal{R}_4) \le \frac{1}{3}$.

Here, the upper bound is due to de Caen (1982), the lower bound to Frankl and Füredi (1984b).


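With every $k$-graph $\mathcal{F}$ let us associate a polynomial $q(\mathcal{F})$ as follows.

Definition 13.3. Define $q(\mathcal{F}, x) = \sum_{F \in \mathcal{F}} \prod_{i \in F} x_i$.

Then $q(\mathcal{F})$ is a homogeneous polynomial of degree $k$ which is linear in each variable.

Define the Lagrange function $\lambda(\mathcal{F}) = \max q(\mathcal{F}, x)$, where the maximum is taken over all $x = (x_1, \ldots, x_n)$ with $x_i \ge 0$, $x_1 + \cdots + x_n = 1$.

Using the theory of Lagrange multipliers one obtains:

Lemma 13.4 (Frankl and Rödl 1984). There exists an $x = (x_1, \ldots, x_n)$ with $x_i \ge 0$, $x_1 + \cdots + x_n = 1$, such that (i)-(iii) (following) hold. Set $Y = \operatorname{supp} x = \{i : x_i > 0\}$.
(i) $\lambda(\mathcal{F}) = q(\mathcal{F}, x)$;
(ii) $\partial q(\mathcal{F}, x)/\partial x_i = k\lambda(\mathcal{F})$ for all $i \in Y$;
(iii) every pair $P \in \binom{Y}{2}$ is contained in some edge $F \in \mathcal{F}$ with $F \subseteq Y$.

Note that $\lambda(\mathcal{F}) \ge |\mathcal{F}|/n^k$. One can use this to show the following simple result:
$$\pi(\mathcal{C}) = k! \sup\{\lambda(\mathcal{F}) : \mathcal{F} \text{ is a } k\text{-graph without a copy of any } \mathcal{A} \in \mathcal{C}\}. \qquad (13.5)$$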
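The inequality $\lambda(\mathcal{F}) \ge |\mathcal{F}|/n^k$ comes from evaluating $q(\mathcal{F}, x)$ at the uniform weighting $x_i = 1/n$. A minimal sketch with exact rational arithmetic follows; the function name `q_value` and the 0-based vertex labels are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def q_value(family, x):
    """Evaluate q(F, x): the sum over edges F of the product of x_i, i in F."""
    total = Fraction(0)
    for edge in family:
        term = Fraction(1)
        for i in edge:
            term *= x[i]
        total += term
    return total


# The complete 3-graph on 4 vertices, labelled 0..3.
n, k = 4, 3
K43 = [frozenset(e) for e in combinations(range(n), k)]
uniform = [Fraction(1, n)] * n

# At the uniform point every edge contributes n**(-k), so q = |F| / n**k.
print(q_value(K43, uniform), Fraction(len(K43), n ** k))
```

Any other feasible weighting gives a further lower bound on $\lambda(\mathcal{F})$; for instance, putting all weight on three of the four vertices yields $q = 1/27$ here, which is smaller than the uniform value $1/16$.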


Katona (1974) asked for the determination of the maximum number $\mathrm{symm}(n,k)$ of $k$-subsets of an $n$-set such that none of them contains the symmetric difference of two others. This problem can be formulated in terms of $\mathrm{ex}(n, \mathcal{C})$, but for $k$ large $\mathcal{C}$ will contain many $k$-graphs (all with three edges).

Conjecture 13.5 (Bollobás 1974). $\mathrm{symm}(n,k) \le \prod_{0 \le i < k} \lfloor (n + i)/k \rfloor$ with equality holding for the complete equipartite $k$-graph.

Bollobás (1974) solves the case $k = 3$ (the case $k = 2$ is very easy and was solved by Mantel in 1906). De Caen (1986) gives a new proof for $k = 3$ and proposes a different problem.


Problem. Determine ex(n, €,) =c(n, k), where €, = {#,, #,,...,%,} with & = 
{HLA} {i,k -1U {kK +}}, [ik +i 1p}. 


Clearly, $\mathrm{symm}(n,k) \le c(n,k)$ for all $k$ and for $k = 2, 3$, the two problems are the same.

Sidorenko (1987) realized the relevance of Lemma 13.4 and proved the following.

Theorem 13.6. $c(n,k) = \prod_{0 \le i < k} \lfloor (n + i)/k \rfloor$ holds for $k = 2, 3, 4$.


Proof. To avoid technical difficulties we shall only prove $c(n,k) \le (n/k)^k$ (which is the same as the theorem if $n$ is a multiple of $k$), i.e., $\lambda(\mathcal{F}) \le 1/k^k$ if $\mathcal{F}$ contains no copy of $\mathcal{A}_i \in \mathcal{C}_k$.

In view of Lemma 13.4, in proving the above inequality we may suppose that $\sigma_2(\mathcal{F}) = \binom{[n]}{2}$, i.e., every pair $P \in \binom{[n]}{2}$ is contained in some $F \in \mathcal{F}$. Now if $F, F' \in \mathcal{F}$ with $|F \cap F'| = k - 1$, then $|F \,\triangle\, F'| = 2$ and therefore we find $F'' \in \mathcal{F}$ with $F \,\triangle\, F' \subseteq F''$, i.e., $\{F, F', F''\}$ is a copy of a member of $\mathcal{C}_k$, which is a contradiction. Thus $|F \cap F'| \le k - 2$ for all distinct $F, F' \in \mathcal{F}$. (Note that this is not true in general for $\mathcal{F}$ containing no copy of $\mathcal{A}_i \in \mathcal{C}_k$; however, Lemma 13.4 ensures the existence of a subfamily with this property and the same value for the Lagrange function.) That is, $\binom{F}{k-1} \cap \binom{F'}{k-1} = \emptyset$ for distinct $F, F' \in \mathcal{F}$. In other words, $\partial q(\mathcal{F}, x)/\partial x_i$ and $\partial q(\mathcal{F}, x)/\partial x_j$ have no common term for $i \ne j$. Let
$$S_{k-1}(x) = \sum_{A \in \binom{[n]}{k-1}} \prod_{i \in A} x_i$$
be the $(k-1)$th elementary symmetric polynomial. Adding up Lemma 13.4 (ii) for $1 \le i \le n$, we obtain
$$k n \lambda(\mathcal{F}) \le S_{k-1}(x) \le \binom{n}{k-1} \left(\frac{1}{n}\right)^{k-1}.$$
Rearranging gives
$$\lambda(\mathcal{F}) \le \frac{(n-1) \cdots (n-k+2)}{k!\, n^{k-1}}. \qquad (13.7)$$
Now for $n \ge k$, the right-hand side of (13.7) is at most $k^{-k}$, both for $k = 2$ and $k = 3$, and also for $k = 4$ unless $n = 5$. However, the case $n = 5$ is impossible, because any two 4-subsets of $[5]$ overlap in three elements. This concludes the proof. □

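Using the same approach, Frankl and Füredi (1989) determined $\pi(\mathcal{C}_k)$ for $k = 5$ and $k = 6$.

Let $\mathcal{W}_{11}$ ($\mathcal{W}_{12}$) be the (unique) $(11, 5, 4)$ ($(12, 6, 5)$) Steiner system. That is, $\mathcal{W}_{11} \subseteq \binom{[11]}{5}$ and for each $A \in \binom{[11]}{4}$ there is a unique set $B \in \mathcal{W}_{11}$ with $A \subseteq B$, and $\mathcal{W}_{11} = \mathcal{W}_{12}(12)$.

Example. For $X = X_1 \cup \cdots \cup X_{11}$, $|X_i| = n/11$, define:
$$\mathcal{B}_5 = \left\{B \in \binom{X}{5} : \{i : B \cap X_i \ne \emptyset\} \in \mathcal{W}_{11}\right\}.$$
$\mathcal{B}_6$ is defined analogously.

Theorem 13.8. (i) $\mathrm{ex}(n, \mathcal{C}_5) \le 66(n/11)^5$ with equality iff $11 \mid n$, in which case $\mathcal{B}_5$ is the only optimal family.

(ii) $\mathrm{ex}(n, \mathcal{C}_6) \le 132(n/12)^6$ with equality iff $12 \mid n$, in which case $\mathcal{B}_6$ is the only optimal family.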


14. Excluded configurations II: $k$-partite $k$-graphs

Many of the problems treated earlier can be formulated in the form: determine $\mathrm{ex}(n, \mathcal{C})$. For example, the determination of $m(n,k,L)$ is such a problem. Let us start with three problems which come up in other contexts.

Call a family $\mathcal{F} \subseteq 2^X$ barely overlapping if $F \not\subseteq F' \cup F''$ holds for all distinct $F, F', F'' \in \mathcal{F}$. Let $h(n,k)$ denote the maximum size of a barely overlapping family $\mathcal{F} \subseteq \binom{X}{k}$.

Theorem 14.1 (Erdős et al. 1982). (i) $h(n, 2l-1) \le \binom{n}{l}/\binom{2l-1}{l}$ with equality holding iff there exists an $S(n, 2l-1, l)$.

(ii) $h(n, 2l) \le \binom{n-1}{l}/\binom{2l-1}{l}$ with equality achieved for some $\mathcal{F}$ if and only if $|\bigcap \mathcal{F}| = 1$ (say $\bigcap \mathcal{F} = \{1\}$) and $\mathcal{F}(1)$ is an $S(n-1, 2l-1, l)$.


Proof. We only prove (i) and even this only for n = 3. Let GC F © ¥. We call G 
a distinguished subset of F if GZ F' for all F# F'€ ¥. Let us define a weight 
function w: ¥ x (MYR, by: 


w(F, G) 


F 
1 if GE (‘) and G is an eigen-subset of F , 


F 
WW if GOF=HE ee ) and H is an eigen-subset of F , 


0 otherwise . 


Claim. )), w(F, G) <1, 0 w(F, G)= (27 '). 


Proof. The first part follows by noting that if $G$ is an eigen-subset of $F$, then no subset of $G$ can be an eigen-subset of some other $F' \in \mathcal{F}$ and $w(F,G) = 1/l$ can hold for a fixed $G$ at most $l$ times, once for each of its $(l-1)$-subsets.

To prove the second part, note that if F= AUB, {A|=J, |B|=/—1, then 
either A or B (or both) are eigen-subsets of F because ¥ is barely overlapping. If 
A is an cigen-subset, it contributes |; if B is, then B contributes (m — (k — 1))/ > 
1. Since there are (*', ') such partitions of /, the inequality follows. LI 


Proof of Theorem 14.1 (continued). Using the claim, it is easy to show that 


a ae ea G)= 2X wl, ay=(7). 


FEF Get) 


»1Fl<(7)/(; '), as desired. In case of Pcbaged equality must hold in (i). 
Thes all G E(/) are eigen-subsets. That is, ¥ is a partial /-design. Consequently, 
[Fl =()/(7') if and only if ¥ is an S(n, 21 — 1,1). O 
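The bound in Theorem 14.1(i) can be checked by hand on the smallest interesting case. The following is a minimal sketch, not part of the original text: it takes l = 2, so k = 2l - 1 = 3, and uses the Fano plane as the Steiner system S(7, 3, 2). The helper name `is_barely_overlapping` and the particular labelling of the Fano plane are illustrative choices.

```python
from itertools import permutations
from math import comb

def is_barely_overlapping(family):
    """True if no member is contained in the union of two other members."""
    sets = [frozenset(F) for F in family]
    # permutations(..., 3) runs over ordered triples of distinct members
    return all(not F <= (G | H) for F, G, H in permutations(sets, 3))

# One labelling of the Fano plane: every pair of points lies in exactly one block.
fano = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
        {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

n, l = 7, 2
bound = comb(n, l) // comb(2 * l - 1, l)   # binom(7,2) / binom(3,2)

print(is_barely_overlapping(fano), len(fano), bound)
```

Here the family is barely overlapping and its size equals the bound, in line with the equality case of part (i).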


For further results and problems on barely overlapping and related families we 
refer to Frankl (1988). 

We call $\mathcal{F} \subset 2^X$ union-free if $F \cup F' = G \cup G'$ implies for $F, F', G, G' \in \mathcal{F}$ that
$\{F, F'\} = \{G, G'\}$. Let $u(n, k)$ denote $\max |\mathcal{F}|$, where $\mathcal{F} \subset \binom{X}{k}$ is union-free.


Theorem 14.2 (Frankl and Füredi 1986a). There are positive constants $c_1, c_2$ such
that


$$c_1 n^{2k/3 + \varepsilon(k)} \le u(n, k) \le c_2 n^{2k/3 + \varepsilon(k)}$$


with $\varepsilon(k) = 0$, $\tfrac{1}{3}$ or $\tfrac{1}{6}$ according to whether $k \equiv 0$, 1 or 2 (mod 3).


Let us mention that the proof of the lower bound is rather involved. The
required family $\mathcal{F}$ is defined via systems of nonlinear equations over finite fields.

Again, for more information on this and related problems we refer to Frankl 
(1988). 

We call $\mathcal{F}$ disjoint-union-free if it contains no four sets $F, G, H, K$ with
$F \cup G = H \cup K$ and $F \cap G = H \cap K = \emptyset$. Let $u_d(n, k)$ denote the maximum size of
$\mathcal{F} \subset \binom{X}{k}$, $\mathcal{F}$ disjoint-union-free.

Clearly, $u_d(n, k) \ge \binom{n-1}{k-1} + 1$ (take $\{G \in \binom{[n]}{k} : 1 \in G\} \cup \{[2, k+1]\}$). It is
possible that for $k \ge 4$, $n > n_0(k)$, equality holds. However, it was unknown for
many years whether $u_d(n, k) = O(n^{k-1})$ held. Füredi (1983) gave an ingenious
argument to show the following.


Theorem 14.3. $u_d(n, k) \le 3\binom{n}{k-1}$ for all $n > k \ge 3$.

Let us note that in the case of graphs ($k = 2$), the condition is that the graph is
$C_4$-free and $u_d(n, 2)$ is of the order $n^{3/2}$ (cf. chapter 23).
The paper by Frankl and Füredi (1987) gives a rather general treatment (and
often solutions) of a class of excluded-configuration-type problems for k-graphs.
We mention just a few of the results of that paper.

Let $\phi(a, b)$ be the maximum size of an a-graph without sunflowers of size b.
Also, let $\phi(n, k, l, s)$ be the maximum size of $\mathcal{F} \subset \binom{X}{k}$, where $\mathcal{F}$ contains no
sunflower of size s whose center has size l.


Theorem 14.4. $\phi(n, k, l, s) = (\phi(l+1, s) + o(1)) \binom{n-l-1}{k-l-1}$ if $k > 2l + 1$.


It is conjectured that the same holds for $k = 2l + 1$ as well; however, this has
only been proved (in Chung and Frankl 1987) for $l = 1$. For $k < 2l + 1$ it follows
from Theorem 7.1 via Lemma 4.15 that $\phi(n, k, l, s)$ has order $n^l$; however, the
correct coefficient of $n^l$ is unknown.


Conjecture 14.5. 


wntshre (EEE) cen) () [EP EP), 


The construction is given by taking ¥=a,(/), where SC(,.,.%u-1) is a 
(partial) i-design. 

Let $\mathcal{A}$ be a k-graph. Set $p = |\bigcap \mathcal{A}|$ and let $q$ be the number of vertices of $\mathcal{A}$ of
degree 2 or more.


Theorem 14.6. If 2p +q+1<k, then ex(n, A) = (yo) — OCL))G 28-1), where 
y(#) is a positive integer depending only on 4. 


In the case $p = 0$, one can define $\gamma(\mathcal{A})$ by taking $\gamma(\mathcal{A}) + 1$ to be the size of the
smallest set $T$ satisfying $|T \cap A| = 1$ for all $A \in \mathcal{A}$. Note that such a $T$ exists if $\mathcal{A}$
is k-partite, which, in turn, follows from $q < k$. In general, $\gamma(\mathcal{A}) \le \phi(p + 1, |\mathcal{A}|)$.


Theorem 14.7. Set $\mathcal{A} = \{\{1,2,3,5,7\}, \{1,2,3,6,8\}, \{1,2,4,5,8\}\}$. Then $\mathrm{ex}(n, \mathcal{A}) = o(n^4)$.
However, $\lim_{n \to \infty} \mathrm{ex}(n, \mathcal{A}) / n^{\alpha} = \infty$ for all $\alpha < 4$.


This result shows that $\mathrm{ex}(n, \mathcal{A})$ does not always have a proper exponent. The
proof extends that of Ruzsa and Szemerédi (1978), where a similar phenomenon
is described.


Another type of extremal problem, considered by Kászonyi and Tuza (1986), is
the following.


Definition 14.8. For a k-graph $\mathcal{A}$, let $\mathrm{sat}(n, \mathcal{A})$ denote $\min |\mathcal{F}|$, where $\mathcal{F} \subset \binom{[n]}{k}$,
and $\mathcal{F}$ contains no copy of $\mathcal{A}$, but adding any new k-subset of $[n]$ produces a copy
of $\mathcal{A}$.


Conjecture 14.9 (Tuza). $\mathrm{sat}(n, \mathcal{A}) = O(n^{k-1})$ for every k-graph $\mathcal{A}$.
There are many aspects of combinatorics which are mirrored by chapters in this
Handbook. But few areas form such a compact body of concepts and results (and 
thus in turn form a theory in the classical sense) as Ramsey theory. When 
presenting it one could follow the axiom—definition—theorem—corollary scheme. 

We decided otherwise and we follow roughly an historical account commented 
from a contemporary point of view. It seems to us that this is the most 
appropriate method for the subject. 

Our aim is to give a survey of the recent development of Ramsey theory and 
include proofs of some of the main results. Recently found techniques enable us 
to do so. 

Very few areas of combinatorics display such a variety of techniques from 
various parts of mathematics. This should not be seen as a surprise. Many results 
of Ramsey theory (including Ramsey’s theorem itself) have a character of a 
combinatorial principle which may be viewed as a structural generalization of the 
pigeon-hole principle. Often such principles mirror profound features of objects 
studied in various areas of mathematics. 

The first version of this paper was written in 1988 and since then it has been
updated several times. It is advantageous that we can build upon several books
and surveys which are devoted either to various aspects of Ramsey theory (such
as Graham 1981, Graham et al. 1980, Nešetřil and Rödl 1978a, 1979b, 1990a,
Rödl 1991, Prömel and Voigt 1990) or contain chapters devoted to our subject
(Bollobás 1985, 1978, 1979, Alon and Spencer 1992). However, no previous
knowledge of Ramsey theory is necessary.

I wish to thank Hillel Furstenberg and Vojtěch Rödl for valuable discussions
and generous help.


1. Ramsey’s theorem 
We start with the result which gave the subject its name. 


Finite Ramsey theorem 1.1 (Ramsey 1930). Let p, t, n be positive integers. Then
there exists a positive integer N with the following property:


If X is a set with at least N elements and $a_1 \cup \cdots \cup a_t$ is any partition of
the set $\binom{X}{p}$ of all p-element subsets of X, then there exists a subset Y of X
with at least n elements such that the set $\binom{Y}{p}$ is a subset of one of the
classes $a_i$ of the partition. (1.1)


Admittedly this is a technical statement and several helpful notions and 
convenient notations will be introduced. Moreover, it is a characteristic of 
Ramsey’s theorem (and Ramsey theory) that it often occurs hidden or in a 
different context as might be suggested by the following notions and notations: 

The set $\{1, \ldots, n\}$ will be denoted by $[n]$ and the set $\{Y : Y \subset X \text{ and } |Y| = p\}$
by $\binom{X}{p}$. This convenient notation originated in the context of Ramsey theory, and
it is due to Leeb (1973). The partition $a_1 \cup \cdots \cup a_t$ (in the above statement of
Ramsey's theorem) is called a coloring; it is sometimes written as $c : \binom{X}{p} \to [t]$. We
shall see that the most important case is $t = 2$. The set Y is called a homogeneous
set, or sometimes a monochromatic set.

The validity of statement (1.1) (for particular values n, p, t and N) is
traditionally denoted by the symbol


$$N \to (n)^p_t$$


which is called the Erdős-Rado partition arrow (see Erdős 1987 or chapter 42).
The smallest value of N for which $N \to (n)^p_t$ holds will be denoted by $r(p, t, n)$ and
is called the Ramsey number.

As a warm up let us list some particular cases: It is easy to see that $r(1, t, n) =
t(n - 1) + 1$. This is the Anglo-American pigeonhole and continental Schubfach
(also Dirichlet's) principle. The case $p = 2$ and $t = 2$ is the most frequent case,
the graph case: every graph G with at least N vertices satisfies either $\alpha(G) \ge n$ or
$\omega(G) \ge n$ (i.e. every sufficiently large party contains either n mutual acquaintances
or n mutual strangers). In this setting $r(2, t, 3) - 1$ is the maximum number
of vertices in a complete graph which can be decomposed into t triangle-free
graphs. Very few Ramsey numbers are known exactly. We shall return to this
aspect, which is perhaps the typical feature for the whole field, in section 3.
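For very small parameters the partition arrow can be tested by exhaustive search. The sketch below is an illustration added here, not part of the original chapter: it handles only the graph case p = 2 and checks every t-coloring of the pairs of an N-element set, so it is usable only for tiny N. The function names `has_homogeneous_set` and `arrow` are hypothetical.

```python
from itertools import combinations, product

def has_homogeneous_set(N, n, colouring):
    """colouring maps each pair (i, j), i < j, of range(N) to a colour."""
    for Y in combinations(range(N), n):
        if len({colouring[e] for e in combinations(Y, 2)}) == 1:
            return True
    return False

def arrow(N, n, t=2):
    """True if every t-colouring of the pairs of an N-set has a homogeneous n-set."""
    pairs = list(combinations(range(N), 2))
    for colours in product(range(t), repeat=len(pairs)):
        if not has_homogeneous_set(N, n, dict(zip(pairs, colours))):
            return False   # found a colouring with no homogeneous n-set
    return True

print(arrow(5, 3), arrow(6, 3))
```

The two calls test the well-known party statement: among six people there are always three mutual acquaintances or three mutual strangers, while five people do not suffice, which in the notation above reads r(2, 2, 3) = 6.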

Contemporary proofs of Ramsey's theorem do not differ much from the 
classical proofs. We give two proofs which are similar; nevertheless the mild 
formalism of the second extends to much more general situations. 


Proof I. We proceed by induction on p (for every choice of t, n). The case
$p = 1$ was discussed above. In the induction step let the existence of the numbers
$r(p-1, t, n)$ be proved for every choice of t and n. Fix $t > 1$ and $n > p$ to avoid
trivial cases. We prove the existence of $r(p, t, n)$ by giving an upper bound for it.


First put $t_0 = r(1, t, n)$ and for $k = 1, \ldots, t_0 - 1$ define numbers $r_k$ by

$$r_{k+1} = r(p-1, t, r_k) + 1$$

with $r_1 = 1$.


We claim that $r(p, t, n) \le r_{t_0} = r$. To prove this let X be a set of size r and let
$c : \binom{X}{p} \to [t]$ be an arbitrary coloring. Without loss of generality assume that
$X = \{1, \ldots, r\}$. Proceeding by backward induction for $k = t_0, t_0 - 1, \ldots, 1$ define
sets $X = X_{t_0} \supset \cdots \supset X_1$, $|X_k| = r_k$, and elements $1 = x_{t_0} < \cdots < x_1$ as follows: If $X_k$
is defined, let $x_k$ be the minimal element of $X_k$. Put $X'_k = X_k \setminus \{x_k\}$ and define
the induced coloring $c_k$ of the set $\binom{X'_k}{p-1}$ by $c_k(A) = c(A \cup \{x_k\})$. By the construction
of the number $r_k$ there exists a subset of $X'_k$ of size $r_{k-1}$ which is
homogeneous (with respect to $c_k$); denote such a set by $X_{k-1}$. This definition
guarantees that $X_1 \ne \emptyset$. Now consider the set $X' = \{x_1, \ldots, x_{t_0}\}$ and define
the (induced) coloring $c' : X' \to \{1, \ldots, t\}$ by

$$c'(x_k) = c(\{x_k\} \cup A)$$

where A is any $(p-1)$-element subset of $\{x_{k-1}, \ldots, x_1\}$. (The above backward
induction just guarantees that the coloring $c'$ is well-defined.) Finally, using the
choice of $t_0$, let $Y \subset X'$ be a $c'$-homogeneous set with n points. We claim that Y is
homogeneous. However, this is clear: for if $P, P' \in \binom{Y}{p}$ then


$$c(P) = c'(\min P) = c'(\min P') = c(P'). \qquad \Box$$
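For p = 2 the induction step of Proof I uses only the pigeonhole principle, and the argument becomes a simple greedy procedure. The following sketch is an illustration under that assumption (p = 2 only) and is not taken from the original text; the name `homogeneous_set` and the test coloring are hypothetical. It repeatedly takes the minimal element x of the current set, keeps the largest color class of the pairs {x, y}, and finally applies the pigeonhole principle to the induced coloring of the selected elements.

```python
from itertools import combinations

def homogeneous_set(vertices, colour, t):
    """Greedy form of Proof I for p = 2.

    colour(x, y) is only ever called with x < y and must return a value in range(t).
    Returns the largest homogeneous set found along the constructed chain.
    """
    pool = sorted(vertices)
    chain = []          # pairs (x, colour shared by all pairs {x, y}, y later in the chain)
    last = None         # final element of the chain; it carries no label
    while pool:
        x, rest = pool[0], pool[1:]
        if not rest:
            last = x
            break
        classes = {}
        for y in rest:
            classes.setdefault(colour(x, y), []).append(y)
        col, pool = max(classes.items(), key=lambda kv: len(kv[1]))
        chain.append((x, col))
    # pigeonhole principle on the induced colouring c'(x) of the chain
    best = []
    for col in range(t):
        Y = [x for x, cx in chain if cx == col]
        if last is not None:
            Y.append(last)      # the last element is compatible with every class
        if len(Y) > len(best):
            best = Y
    return best

# An arbitrary deterministic 2-colouring of the pairs of {0, ..., 63}.
c = lambda x, y: ((7 * x + 13 * y) ^ (x * y)) % 2
Y = homogeneous_set(range(64), c, 2)
print(Y)
```

Starting from 64 points the set is at least halved at each step, so the chain has at least seven elements and the returned set has at least four, whatever the 2-coloring is.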


Proof II. Let us recast the above proof of Ramsey's theorem in different
language: By a set we mean now a set of positive integers. The beginning of a set
is its minimal element. If X is a set then a coloring of $\binom{X}{p}$ is said to be good if any
two sets with the same beginning have the same color.

Ramsey's theorem follows immediately from the following two claims.


Claim A (Sufficiency of good colorings). The following two statements are
equivalent (for a fixed p and t):

(1) For any positive integer n there exists a positive integer N with the following
property: If X is a set with at least N elements together with an arbitrary t-coloring
of $\binom{X}{p}$ then there exists an n-subset Y of X such that the coloring restricted to $\binom{Y}{p}$ is
good; we denote this by

$$N \xrightarrow{\text{good}} (n)^p_t ;$$


(2) For any positive integer n there exists N such that $N \to (n)^p_t$.


Claim B (Existence of good colorings). For every n there exists N such that
$N \xrightarrow{\text{good}} (n)^p_t$.

Proof of Claim A is easy since any set Y of size $t(n-1) + 1$ for which the
coloring of $\binom{Y}{p}$ is good contains an n-subset which is monochromatic.

Claim B can be proved easily by induction on n using Ramsey's theorem for
$p - 1$: Namely if $N \xrightarrow{\text{good}} (n)^p_t$ then


$$N' = 1 + r(p-1, t, N) \xrightarrow{\text{good}} (n+1)^p_t .$$

One simply considers all p-sets which contain 1 and finds a set $X \subset \{2, \ldots, N'\}$ of
size N with all p-sets of the form $\{1\} \cup Z$, $Z \in \binom{X}{p-1}$, monochromatic. □


Admittedly Proofs I and II are similar with a few cosmetic changes. However, 
Ramsey’s theorem is a part of combinatorics where the choice of notions 
(‘language’) matters. Below we shall benefit from this formulation of Ramsey’s 
theorem by giving a strikingly similar proof of the Graham—Rothschild theorem 
(see section 4). 

F.P. Ramsey discovered this theorem in a sound mathematical context. Perhaps
because of this, Ramsey's theorem was never regarded as a puzzle and combinatorial
curiosity. Its beauty and power are now well established. However, it was
largely through the efforts of P. Erdős that the subject enjoys the current high
level of popularity and research activity. Erdős together with G. Szekeres
initiated the applications of Ramsey's theorem in geometry by proving the
following (Erdős and Szekeres 1935).


Theorem 1.2. Let n be a positive integer. Then there exists an N with the following 
property: If X is a set of at least N points in the plane no three of which are 
collinear then X contains an n-tuple Y which forms the vertices of a convex n-gon.


(Classical hint: Prove N =r(4, 5, n).) 

Apart from the problem of a good estimation of the optimal value of N (which 
is a common “difficulty” with all Ramsey-type results) there is a peculiar 
structural problem here: 

Call a set $Y \subset X$ an n-hole in X if Y is the set of vertices of a convex n-gon
which does not contain other points of X.


Problem 1.3. Does there exist $N(n)$ such that if X is any set of at least $N(n)$
points in the plane no three of which are collinear then X contains an n-hole?


It is easy to prove that $N(n)$ exists for $n \le 5$ (see Harborth 1978); Horton
(1983) showed that $N(n)$ does not exist for $n \ge 7$. The existence of $N(6)$ is open.
See Valtr (1992) and Nešetřil and Valtr (1994) for a recent discussion of this
problem.

An easy variant of the above proofs of Ramsey's theorem leads to the bound


$$r(p, k, n) \le 2^{2^{\cdot^{\cdot^{\cdot^{2^{c_k n}}}}}}$$


where $c_k$ is a positive constant and the stack of 2's has height $p - 1$. In Ramsey
theory we frequently meet very large numbers (and the theory is the source of
very large cardinals, called Erdős cardinals; see Erdős et al. 1965, 1984a or
chapter 42). Estimating some of these numbers is sometimes cumbersome, which
is in the present style resolved by saying "assume that N is sufficiently large". The
meaning of the phenomena of these "large" functions recently took an unexpected
turn in the context of mathematical logic, and the notions of primitive
recursive functions and provably (total) functions invaded finite Ramsey theory.
We shall touch on this in section 3.3.

Let us return for a moment to the origins of Ramsey theory. 

It has often been said that although Ramsey discovered the theorem in a
meaningful mathematical context, the later development obscured this motivation
in favor of the combinatorial part of his (unique (!)) mathematical paper. As a
result of this wide belief, the logical part of his paper is even omitted from a
collection of his works (Ramsey 1978). But one never knows. Recently, exactly
this part of Ramsey's research found a very nice application. Let us briefly
mention it.

Ramsey's paper contains a sophisticated application of "Ramsey's theorem" to
the decision problem for a class of first-order formulas. Explicitly, Ramsey proves
that if $\varphi$ is a first-order formula of the form

$$\exists x_1 \exists x_2 \cdots \exists x_n \forall y_1 \cdots \forall y_m \Phi ,$$

where $\Phi$ is quantifier-free (such formulas are called Bernays-Schönfinkel
formulas) then there exists an algorithm which decides whether $\varphi$ holds for every
finite structure. This seemed to be a dead end project as (shortly after Ramsey's
death) it was shown (by Church) that the decision problem is algorithmically
unsolvable, indeed, for first-order formulas (Trakhtenbrot 1950) and even for
first-order formulas of the form


$$\exists x_1 \cdots \exists x_n \forall y_1 \cdots \forall y_m \exists z_1 \cdots \exists z_p \Phi ,$$


see Lewis (1979).

However, the whole area took a new turn when Glebskiĭ et al. (1969) and,
independently, Fagin (1976) showed that for every first-order formula $\varphi$, the
probability $P_n(\varphi)$ that $\varphi$ is satisfied by a structure A with n
points either tends to 1 or tends to 0 as n tends to infinity (see chapter 6).

This result (and similar ones) is called a 0-1 law (here for first-order formulas).
However, the result of Glebskiĭ et al. does not cover all instances. In fact the most
interesting properties are related to the properties described by second-order
formulas, e.g., Hamiltonicity is defined by a second-order formula: 'there exists
an ordering of the vertices which forms a cycle'; P(Hamiltonicity) tends to 1 by a
result of Pósa (1976), see chapter 6 of this volume.

Efforts were made to extend the 0-1 law to classes of special second-order 
sentences. (For second-order formulas in general the problem is unsolvable.) 
Kolaitis and Vardi (1987) proved that the class of second-order formulas derived 
from Bernays—Schénfinkel formulas with only one existential second-order 
quantifier with its first-order part being Bernays—Sch6nfinkel, satisfies the 0-1 law 
and they determined the complexity of the corresponding decision problem. It is 
possible to say that their proof uses ideas of Ramsey’s original paper with only 
one “‘slight” change: one has to replace Ramsey’s theorem with a more modern 
device, namely the Structural Ramsey theorem, proved by Nešetřil and Rödl
(1977a), see section 5.

It is amazing how persistent the original motivations are. 

Since this introductory section is devoted mostly to classical papers, let us
mention that the earliest of the widely known theorems of Ramsey type is the
following result due to Schur.


Theorem 1.4 (Schur 1916). For every t there exists a positive integer N such that 
for every partition of the set {1,...,N} into t classes, one of the classes contains 
numbers x and y together with their sum x + y. 


Proof. Schur's theorem easily follows from Ramsey's theorem. Take $N =
r(2, t, 3)$. Let $c : \{1, \ldots, N\} \to \{1, \ldots, t\}$ be a fixed coloring. Define then the
coloring $c'$ of pairs by $c'(\{i, j\}) = c(|i - j|)$. By the choice of N there exists a
$c'$-monochromatic triangle with vertices $i < j < k$. However, then $c(j - i) = c(k - j)
= c(k - i)$ and $(j - i) + (k - j) = k - i$. □
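For a small number of classes the least N in Schur's theorem can be found by backtracking. The sketch below is added for illustration and is not part of the original text; the function name `largest_sum_free_partition` is hypothetical. It places 1, 2, 3, ... one at a time into t classes, never completing a triple x, y, x + y inside one class (x = y is allowed, as in the proof above), and records how far the process can go.

```python
def largest_sum_free_partition(t):
    """Largest M such that {1,...,M} splits into t classes with no x + y = z in a class."""
    classes = [set() for _ in range(t)]
    best = 0

    def place(m):
        nonlocal best
        best = max(best, m - 1)          # 1, ..., m-1 have been placed successfully
        for cls in classes:
            # m is the largest number so far, so it can only appear as a sum x + y
            if not any((m - x) in cls for x in cls):
                cls.add(m)
                place(m + 1)
                cls.remove(m)
            if not cls:
                break                    # all further classes are empty too (symmetry)

    place(1)
    return best

for t in (1, 2, 3):
    print(t, largest_sum_free_partition(t) + 1)   # least N that works in Theorem 1.4
```

The search terminates precisely because of Schur's theorem: for each t only finitely many initial segments admit such a partition.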


It is fair to say that Schur has been a hidden force behind the development of 
Ramsey theory in the thirties: he was the first one to formulate the conjecture 
which Van der Waerden turned into theorem (see Van der Waerden 1927 and the 
next section). His student R. Rado soon thereafter in 1933 obtained one of the 
deepest results of Ramsey theory (see Rado 1933). 

The motivation for Schur’s research in this area was algebraic (p-adic versions 
of Fermat’s problem and the distribution of quadratic residues). 


2. Adventures of arithmetic progressions 


Besides Ramsey’s theorem itself the following result provided constant motivation 
for the theory. 


Theorem 2.1 (Van der Waerden 1927). For every choice of positive integers t and n
there exists N such that for every partition of the set {1,2,..., N} into t classes one
of the classes contains an arithmetic progression with n terms.
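The smallest N in Theorem 2.1 can be computed directly for tiny parameters. The following backtracking sketch is an illustration added here, with the hypothetical name `vdw_number`; it colors 1, 2, 3, ... in order and abandons a branch as soon as the newest number completes a monochromatic n-term progression. It is practical only for very small t and n.

```python
def vdw_number(t, n):
    """Least N such that every t-colouring of {1,...,N} has a monochromatic n-term AP."""
    colour = {}
    best = 0

    def creates_ap(m, c):
        # is there an n-term progression ending at m whose earlier terms all have colour c?
        d = 1
        while m - (n - 1) * d >= 1:
            if all(colour[m - j * d] == c for j in range(1, n)):
                return True
            d += 1
        return False

    def extend(m):
        nonlocal best
        best = max(best, m - 1)          # {1,...,m-1} is coloured without a progression
        for c in range(t):
            if not creates_ap(m, c):
                colour[m] = c
                extend(m + 1)
                del colour[m]

    extend(1)
    return best + 1

print(vdw_number(2, 2), vdw_number(2, 3))
```

For two classes and progressions of two and three terms this yields 3 and 9.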


The original proof of Van der Waerden (which arose in a discussion with Artin 
and Schreier, see Van der Waerden (1971) for the account of the discovery) and 
which is included in the enchanting and moving book of Chintschin (1951), was 
until recently principally the only known proof. However, interesting modi- 
fications of the proof were also found (see, e.g., Graham and Rothschild 1974, 
Deuber 1982, and Taylor 1982). Most important of these is probably the 
combinatorial formulation of the Van der Waerden result found by Hales and 
Jewett (1963). 


Theorem 2.2 (Cube theorem; Hales-Jewett theorem). Let A be a finite set
(alphabet) and let t be a positive integer. Then there exists N with the following
property: For every partition of the cube $A^N$ into t classes one of the classes
contains a combinatorial line.


Here we think of $A^N$ as the set of all vectors $(x_1, \ldots, x_N)$ with each of its
entries belonging to A (i.e., an N-dimensional cube over A).
A combinatorial line is a set of points of $A^N$ of the following form:


$$\{(x_1, \ldots, x_N) : x_i = x_j \text{ for } i, j \in I, \ x_i = x_i^0 \text{ for } i \notin I\},$$

where $(x_1^0, \ldots, x_N^0)$ is a fixed point of $A^N$ and I is a non-empty subset of
$\{1, \ldots, N\}$.


The Hales-Jewett theorem readily implies Van der Waerden's theorem: If we
put $A = \{0, 1, \ldots, n-1\}$ and with a point $x = (x_1, \ldots, x_N)$ we associate an
integer

$$w(x) = \sum_{i=1}^{N} x_i n^i$$

then the mapping $w : A^N \to \mathbb{N}$ is 1-1 and it is easy to see that every combinatorial
line is mapped to an arithmetic progression of length n.
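The claim that combinatorial lines are mapped to arithmetic progressions can be confirmed mechanically for a small cube. The sketch below is illustrative only; the function names are hypothetical, and the weight $w(x) = \sum_i x_i n^i$ is taken as an assumption (any base-n positional weight behaves the same way). A line is encoded by a template over A together with a wildcard `'*'` marking the moving coordinates I.

```python
from itertools import product

def combinatorial_lines(n, N):
    """All combinatorial lines of the cube {0,...,n-1}^N, each as a list of n points."""
    letters = list(range(n))
    for template in product(letters + ['*'], repeat=N):
        if '*' in template:              # the set I of moving coordinates is non-empty
            yield [tuple(a if s == '*' else s for s in template) for a in letters]

def w(x, n):
    """Base-n weight of a point; coordinates are numbered from 1."""
    return sum(xi * n ** i for i, xi in enumerate(x, start=1))

def is_arithmetic_progression(values):
    diffs = {b - a for a, b in zip(values, values[1:])}
    return len(diffs) == 1 and diffs.pop() > 0

n, N = 3, 3
lines = list(combinatorial_lines(n, N))
print(len(lines), all(is_arithmetic_progression([w(p, n) for p in line]) for line in lines))
```

Each line is sent to a progression whose common difference is the sum of $n^i$ over the moving coordinates, and the number of lines is $(n+1)^N - n^N$.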

On the other hand, as mentioned earlier the original proof of the Hales—Jewett 
theorem is closely related to Van der Waerden’s proof and in fact it may be viewed 
as its combinatorial axiomatization. 

However, the distinctive feature of both proofs is that one has to prove a more
general statement and then to use a double induction. This procedure does not
provide a primitive recursive upper bound for the size of N: Denote by W(n) the
minimal number N which satisfies Van der Waerden's theorem (for t = 2).
Known values are W(2) = 3, W(3) = 9, W(4) = 35, W(5) = 178. The upper bound
for W(n) supplied by the original proofs grows like A(n), where A is the
Ackermann function. The Ackermann function may be defined by the following
procedure.


Ackermann hierarchy 2.3. For each positive integer n, define the function

$$f_n : \mathbb{N} \to \mathbb{N}$$

as follows:

$$f_1(i) = i + 1,$$

$$f_2(i) = 2 \cdot i,$$

$$f_{n+1}(i) = \underbrace{f_n \circ f_n \circ \cdots \circ f_n}_{i}(1) \qquad (n \ge 2).$$


(Thus $f_3(i) = 2^i$ and $f_4(i)$ is a stack of 2's of height i (the tower function).) Thus,
Ramsey numbers $r(p, k, n)$ are bounded by the function $f_4$.
The Ackermann function A is the diagonal function


$$A(n) = f_n(n) .$$


(A fails to be primitive recursive; it cannot be expressed by a combination of
the usual function operations).
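The first levels of the hierarchy are easy to tabulate. The sketch below is an illustration added here; it assumes the base cases $f_1(i) = i + 1$ and $f_2(i) = 2i$ and the rule that $f_{n+1}(i)$ is the i-fold iterate of $f_n$ applied to 1 (for n at least 2). The names `f` and `A` mirror the text, but the code itself is not from the original. Only tiny arguments are feasible, since the values explode immediately.

```python
def f(n, i):
    """Level n of the hierarchy evaluated at i (assumed base cases: i + 1 and 2 * i)."""
    if n == 1:
        return i + 1
    if n == 2:
        return 2 * i
    value = 1
    for _ in range(i):          # apply f_{n-1} exactly i times, starting from 1
        value = f(n - 1, value)
    return value

def A(n):
    """Diagonal (Ackermann-type) function A(n) = f_n(n)."""
    return f(n, n)

print([f(3, i) for i in range(1, 6)])   # powers of two
print([f(4, i) for i in range(1, 5)])   # tower function
print([A(n) for n in range(1, 5)])
```

Already A(5) is out of reach: it is a tower-type iterate far beyond anything that can be written down.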

On the other hand, the best lower bound (for n prime) is (only!) $W(n + 1) >
n 2^n$ (due to Berlekamp 1968). Thus, the question of whether such a huge upper
bound was necessary was one of the main research problems in this area.

This feeling was not new and already Erdős and Turán (1936), for the purposes
of improving the estimates for W(n), got the idea of trying to prove a stronger
(now called a density) statement: Denote by $r_k(n)$ the smallest r such that an
arbitrary sequence


$$1 \le a_1 < \cdots < a_r \le n$$

must contain an arithmetic progression of length k. Clearly a good upper bound
on the function $r_k(n)$ implies Van der Waerden's theorem and it may yield a better
bound on W(n).

However, this approach proved to be very difficult and so far did not contribute
to the bounding of W(n). Yet it proved to be a very rich source of problems and
results. Even the basic question whether


$$r_k(n) = o(n)$$


appeared to be very difficult. Let us remark that because of the (clear)
subadditivity


$$r_k(m + n) \le r_k(m) + r_k(n)$$

the limit

$$\lim_{n \to \infty} r_k(n)/n = r_k$$


exists. It was a stunning achievement that the problem was solved affirmatively by
Szemerédi (1975) almost forty years later. We state his result in the finite form as follows.


Theorem 2.4 (Szemerédi's theorem). For every $\varepsilon > 0$ and positive integer n, there
exists N such that every subset X of $\{1, 2, \ldots, N\}$ of size at least $\varepsilon N$ contains an
arithmetic progression with n terms.
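The Erdős and Turán numbers can be computed for small n by exhaustive search. The sketch below is added for illustration; the names `max_ap_free` and `r` are hypothetical. It finds the largest subset of {1,...,n} without a k-term arithmetic progression by branch and bound, and r_k(n), as defined above, is one more than that size.

```python
def max_ap_free(n, k):
    """Size of a largest subset of {1,...,n} containing no k-term arithmetic progression."""
    best = 0
    chosen = []

    def closes_ap(m):
        members = set(chosen)
        d = 1
        while m - (k - 1) * d >= 1:
            if all(m - j * d in members for j in range(1, k)):
                return True
            d += 1
        return False

    def search(m):
        nonlocal best
        # prune: even taking every remaining number cannot beat the current record
        if len(chosen) + (n - m + 1) <= best:
            return
        if m > n:
            best = len(chosen)
            return
        if not closes_ap(m):
            chosen.append(m)
            search(m + 1)
            chosen.pop()
        search(m + 1)

    search(1)
    return best

def r(k, n):
    """Smallest r such that every r-element subset of {1,...,n} contains a k-term AP."""
    return max_ap_free(n, k) + 1

print([max_ap_free(n, 3) for n in range(1, 11)])
```

The slow growth of these values for k = 3 is a small-scale picture of the statement that a positive proportion of {1,...,N} cannot avoid progressions once N is large.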


Szemerédi's proof is purely combinatorial and very complicated. Several pieces
proved to be useful in other contexts, most notably, his so-called Regularity
lemma (Szemerédi 1976), see chapter 23.

Szemerédi's regularity lemma has many beautiful applications to Ramsey
theory, see, e.g., Nešetřil and Rödl (1979b), Chvátal et al. (1983) or Chen and
Schelp (1993) and the recent Ajtai et al. (1994) or Erdős et al. (1995). We shall
mention these in several places below. Although a great achievement,
Szemerédi's proof is a mathematical tour de force. Thus great excitement was
caused 15 years ago by a new proof of Szemerédi's theorem by means of ergodic
theory. This was found by Furstenberg (1977), see his monograph
(Furstenberg 1981). It is beyond the scope of this chapter to describe these
methods. Let us just add a few remarks.

Ergodic theorems deal with infinite structures (and yield the finite Ramsey type
theorems). In order to start this work one has to properly generalize Ramsey type
statements to infinite sets. This is straightforward for the Ramsey theorem itself
(see Theorem 3.4) but for number theoretical results less so. It seems that
topological dynamics provides a proper non-elementary setting for many Ramsey
type (coloring) questions in combinatorial number theory by systematically using
the global properties of structures (such as automorphisms of Z), see Furstenberg
and Weiss (1978), Hindman (1979), Carlson and Simpson (1990) where this is
made explicit (see also section 4.4). Ergodic methods profoundly extend and use
the topological dynamics results to yield density results. This comment is
strengthened by a very recent development (the density version of the Hales-Jewett
theorem, see section 4.3). The ergodic methods enabled Furstenberg,
Katznelson and Weiss to prove various results which were previously too complex
or even inaccessible for combinatorial methods. For example one can prove both
Van der Waerden's theorem and Szemerédi's theorem for higher-dimensional
lattices. Whereas Van der Waerden's theorem for n-dimensional lattices is not
difficult to derive from the 1-dimensional case (this is known as the Gallai-Witt
theorem, see section 4.4), a similar generalization of Szemerédi's theorem has
been proved first by ergodic theory means and no combinatorial proof is presently
known.

Another example is the so-called iterated density theorems: Observe that
neither for Schur's theorem nor for Ramsey's theorem does an analogue of Szemerédi's
theorem hold (the integers from n/2 to n form a sum-free set, and a complete
bipartite graph with n vertices may have a positive proportion of the edges of the
complete graph with n vertices and yet contains no triangle). However, given a set
X of integers one may consider occurrences of those x for which there are
"many" y such that x + y is also in the set. This led to the notion of iterated
density theorem which was considered and initiated from the point of view of
topological dynamics by Bergelson (1986), and Bergelson and Hindman (1988),
and from the combinatorial point of view in a series of papers by Frankl et al.
(1988). (However, combinatorial methods sometimes seem to yield stronger
results here.)

One last comment on topological dynamics methods: The basic notion in
transforming combinatorial number theory to topological dynamics theory is that
of a dynamical system, which is a compact metric space Y together with a
homeomorphism $T : Y \to Y$. In combinatorial applications Y is induced by (say)
2-colorings of the set Z of all integers and T is the shift operator defined by the
shift of the coloring to the right: $(Tx)(i) = x(i + 1)$.

All topological dynamics versions of combinatorial number-theoretical results
involve this interpretation using shift operators and perhaps this led Bergelson to
formulate the following conjecture: Let $H = (Z, \mathcal{E})$ be a hypergraph such that the
shift $i \mapsto i + 1$ is an automorphism of $(Z, \mathcal{E})$. Suppose the chromatic number is
infinite. Denote by $\alpha_n$ the maximal size of an independent set contained in the
subhypergraph induced by the set $\{1, \ldots, n\}$. Then $\alpha_n / n$ tends to 0 (as n tends to infinity).

If true this conjecture would imply Szemerédi's theorem. However, it was
disproved by Ruzsa and, independently, by Kříž (1987) who disproved it even for
graphs.

All of the above mentioned methods are existential in nature and they do not
produce any bounds for the Van der Waerden numbers W(n). The problem of the
upper bound for the function W(n) was generally felt to be one of the main problems
of the field and the situation raised speculation relating the problem to the
hierarchies of rapidly growing functions (upon which we shall comment in section
3). Thus, the work of Shelah in this area had the effect of a bombshell. Shelah
(1988) found a new proof of Van der Waerden's theorem and even the Hales-Jewett
theorem which avoids use of double induction and yields a primitive
recursive upper bound. The method of the proof is remarkably simple and we
give it here:
In the proof we shall use the following technical lemma.


Shelah's pigeonhole lemma 2.5. For all positive integers n and t, there exists m with
the following property: For every choice of colorings $c_l : [m]^{2n-1} \to [t]$, $l =
1, 2, \ldots, n$, there are integers $1 \le a_l < b_l \le m$ such that for every $l = 1, \ldots, n$, we
have


$$c_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, a_l, a_{l+1}, b_{l+1}, \ldots, a_n, b_n)
= c_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, b_l, a_{l+1}, b_{l+1}, \ldots, a_n, b_n). \qquad (2.1)$$


Denote by $f(n, t)$ the minimal such m.


Proof. We proceed by induction on n. Obviously $f(1, t) = t + 1$. For the inductive
step we prove


$$f(n + 1, t) \le t^{f(n,t)^{2n}} + 1. \qquad (2.2)$$


Put $M = f(n, t)$ and $N = t^{M^{2n}} + 1$. Let $c_l : [N]^{2n+1} \to [t]$, $l = 1, \ldots, n + 1$, be arbitrary colorings.
Consider the induced coloring


$$c' : [N] \to [t]^{[M]^{2n}}$$

defined by

$$c'(a) = \bigl( c_{n+1}(x_1, \ldots, x_{2n}, a) : (x_1, \ldots, x_{2n}) \in [M]^{2n} \bigr).$$

By the pigeonhole principle there are $1 \le a_{n+1} < b_{n+1} \le N$ with

$$c'(a_{n+1}) = c'(b_{n+1}). \qquad (2.3)$$

Now we can define t-colorings $c'_l : [M]^{2n-1} \to [t]$, $l = 1, \ldots, n$, by

$$c'_l(x_1, \ldots, x_{2n-1}) = c_l(x_1, \ldots, x_{2n-1}, a_{n+1}, b_{n+1}).$$

By the induction hypothesis there are numbers

$$a_1 < b_1, \ a_2 < b_2, \ \ldots, \ a_n < b_n$$

such that for every $l = 1, \ldots, n$, we have


$$c'_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, a_l, a_{l+1}, b_{l+1}, \ldots, a_n, b_n)
= c'_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, b_l, a_{l+1}, b_{l+1}, \ldots, a_n, b_n),$$

that is,

$$c_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, a_l, a_{l+1}, b_{l+1}, \ldots, a_n, b_n, a_{n+1}, b_{n+1})
= c_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, b_l, a_{l+1}, b_{l+1}, \ldots, a_n, b_n, a_{n+1}, b_{n+1}).$$


But (2.3) implies


$$c_{n+1}(a_1, b_1, \ldots, a_n, b_n, a_{n+1}) = c_{n+1}(a_1, b_1, \ldots, a_n, b_n, b_{n+1}),$$


which all together give (2.1). □


Proof of the Hales-Jewett theorem. Denote by $\mathrm{HJ}(n, t)$ the minimal number N
for which the statement of the Hales-Jewett theorem (for $A = [n]$ and t colors)
holds.


Obviously $\mathrm{HJ}(1, t) = 1$ and by induction on n we prove

$$\mathrm{HJ}(n + 1, t) \le \mathrm{HJ}(n, t) \cdot f\bigl(\mathrm{HJ}(n, t), t^{(n+1)^{\mathrm{HJ}(n,t)}}\bigr) \qquad (2.4)$$


where f is the function occurring in Shelah's pigeonhole lemma 2.5.

To simplify the notation, set $N = \mathrm{HJ}(n, t)$ and $m = f(N, t^{(n+1)^N})$. (Thus the
desired upper bound (2.4) is $N \cdot m$.) Also set $M_l = \{m(l-1) + 1, \ldots, ml\}$, for
$l = 1, \ldots, N$.

Now let $c : [n + 1]^{Nm} \to [t]$ be a fixed coloring.

In fact, it suffices to consider a coloring of a subset of $[n + 1]^{Nm}$ formed by all
cascade functions: A cascade function f determined by a family $((a_l, b_l) : l =
1, \ldots, N)$, $a_l < b_l$, $a_l, b_l \in M_l$, and a function $g \in [n + 1]^N$ is a function belonging
to $[n + 1]^{Nm}$ which satisfies


$$f(i) = \begin{cases}
n + 1 & \text{for } i < a_l, \ i \in M_l, \\
g(l) & \text{for } a_l \le i < b_l, \\
n & \text{for } b_l \le i \in M_l.
\end{cases}$$


$((a_l, b_l) : l = 1, \ldots, N)$ is called the schema S of the cascade function f.
For a fixed schema S the mapping $g \mapsto f$ is a 1-1 mapping which carries a line in
$[n + 1]^N$ into a line in $[n + 1]^{Nm}$. We put $f = H_S(g)$. (See the schematic fig. 2.1.)
Let us define colorings


$$d_l : [m]^{2N-1} \to \{h : [n + 1]^N \to [t]\}$$

by

$$d_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, a_l = b_l, a_{l+1}, b_{l+1}, \ldots, a_N, b_N)
= \bigl( c(H_{((a_i, b_i))}(g)) : g \in [n + 1]^N \bigr) \qquad (2.5)$$


(here each $M_i$ is identified with $[m]$ in the natural way; if $((a_i, b_i))$ does not correspond to a schema of a cascade function then we define $d_l$
arbitrarily).


Figure 2.1. The mapping $g \mapsto H_S(g)$ (schematic).
By Shelah's pigeonhole lemma there exists a schema S


$$S = (a_1 < b_1, a_2 < b_2, \ldots, a_N < b_N)$$


such that (2.1) holds; explicitly, for every $l = 1, \ldots, N$,

$$d_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, a_l, a_{l+1}, b_{l+1}, \ldots, a_N, b_N)
= d_l(a_1, b_1, \ldots, a_{l-1}, b_{l-1}, b_l, a_{l+1}, b_{l+1}, \ldots, a_N, b_N). \qquad (2.6)$$


Let us consider all cascade functions with schema S, i.e., all functions $H_S(g)$,
$g \in [n + 1]^N$. By the choice $N = \mathrm{HJ}(n, t)$, there exists $I \subset [N]$ and $x^0 =
(x_1^0, \ldots, x_N^0)$ such that if we denote by L the line in $[n]^N$ determined by $x^0$ and I
then $H_S(L)$ is a monochromatic subset of $[n + 1]^{Nm}$.

Let the points (functions) of the line L be denoted by $x^1, \ldots, x^n$. Observe that
all the cascade functions $H_S(x^1), H_S(x^2), \ldots, H_S(x^n)$ have schema S. Define
$x^{n+1} \in [n + 1]^N$ by


$$x_i^{n+1} = \begin{cases}
x_i^0 & \text{for } i \notin I, \\
n + 1 & \text{for } i \in I.
\end{cases}$$


The cascade function $H_S(x^{n+1})$ has schema S. However, both cascade functions
$H_S(x^n)$ and $H_S(x^{n+1})$ may be also thought of as having any of the following
schemas $((a'_l, b'_l) : l = 1, \ldots, N)$ where $a'_l = a_l$, $b'_l = b_l$ for $l \notin I$, and


$$a'_l, b'_l \in \{a_l, b_l\}, \quad a'_l \le b'_l \quad \text{for } l \in I.$$

From this follows (by repeated use of (2.5) and (2.6)) that

$$c(H_S(x^n)) = c(H_S(x^{n+1}))$$


and thus the images $H_S(x^1), H_S(x^2), \ldots, H_S(x^{n+1})$ of the line $x^1, x^2, \ldots, x^{n+1}$ form a monochromatic line in $[n + 1]^{Nm}$. □


Corollary 2.6. 
HJ(a,) <f(c-(n +0), 


where f, is the function introduced in the Ackerman hierarchy. 


Proof. The function f(n,#) has a growth of the tower function which in our 
notation is the function f,. The bound on HJ(n, 1) then involves the iteration of 
J(n,t) as the principal term which gives the function f,. £1 


This corollary justifies the subtitle of Shelah’s preprint: indeed “from the 
Ackerman sphere into the atmosphere”; well, perhaps, stratosphere. 

In the spirit of P. Erdős, R.L. Graham recently offered $1000 for the proof of
an f,-type upper bound for the Van der Waerden function W(n). For a very 
compact write-up of Shelah's proof, see Nilli (1990). Another version appears in
the second edition of Graham et al. (1980).

Let us add one remark. The Hales–Jewett theorem found an effective use in
many proofs of structural extensions of Ramsey's theorem (see section 6). Thus
we have a primitive recursive bound for all these results. Particularly, this is true 
for Ramsey’s theorem for set systems (or structural Ramsey theorem) and 
Ramsey’s theorem for space systems which we state later in section 5. However, 
the strongest results in this area fail to fall presently into this category since their 
proofs use double induction. 

To end this chapter let us return once again to the Van der Waerden theorem from a more number-theoretical point of view:

The most frequently studied case of the Erdős and Turán numbers $r_k(n)$ is, of course, $k = 3$.

Here much stronger results are known; they are reviewed in chapter 20. Let us complement this by the following two remarks.


Remark 2.7. Ruzsa and Szemerédi (1978) related the proof of $r_3(n) = o(n)$ (i.e., the upper bound) to a purely combinatorial problem.


Lemma 2.8. Let $(X, \mathcal{M})$ be a triple system which contains on any set of 6 elements at most 2 triples. Then

$$|\mathcal{M}| = o(|X|^2).$$

From this Ruzsa and Szemerédi (1978) derived an $o(n)$ bound for $r_3(n)$ quite easily (see also Erdős et al. 1986). There is a similar combinatorial statement which is known to imply Szemerédi's theorem. However, this is still a conjecture at present.


Remark 2.9. The best lower bound for $r_3(n)$ is a classical result of Behrend (1946). Let us note that even the earlier, weaker result of Salem and Spencer (1942), $r_3(n) > n^{1-\varepsilon}$ for any positive $\varepsilon$, recently found a surprising application in a least expected area, namely in the fast multiplication of matrices.

Given two $n \times n$ matrices $A = (a_{ij})$, $B = (b_{ij})$, one computes their product

$$A \cdot B = (c_{ij})$$

by applying the definition $c_{ij} = \sum_k a_{ik} b_{kj}$ by means of $O(n^3)$ multiplications and additions. To great surprise, Strassen in his classical paper suggested a new method which consisted in breaking the matrices into blocks and recursively performing operations according to a complex pattern. The basis is his famous table for multiplying $2 \times 2$ matrices. This yielded a total number of operations $O(n^{\log_2 7})$. Since then there have been numerous improvements of this result by Schönhage, Strassen, Coppersmith and Winograd. The best result so far was recently obtained by Coppersmith and Winograd (1990) who used the $n^{1-\varepsilon}$ lower bound for $r_3(n)$ to get an $O(n^{2.376})$ bound. The details are too complex to be presented here. Let us just remark that a dense sequence $S$ without 3-term arithmetic progressions is used for a suitable indexing (hashing) of the very complex schema. Since one needs this for a fixed (very large) $n$, it does not matter whether one has a good algorithm for producing such a set $S$.
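To make the remark about Strassen's table concrete, here is a minimal sketch in Python of the seven products for $2 \times 2$ matrices. The function names `strassen_2x2` and `naive_2x2` are ours (hypothetical, not from any library), and the grouping of the seven products below is the commonly quoted form of the table; we assume this is the form meant in the text.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 multiplications instead of 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # The seven products of the commonly quoted Strassen table.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Recombine the products using additions and subtractions only.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


def naive_2x2(A, B):
    """Multiply two 2x2 matrices directly from the definition (8 multiplications)."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

Since the entries may themselves be matrix blocks, applying the table recursively to $n/2 \times n/2$ blocks gives the recurrence $T(n) = 7\,T(n/2) + O(n^2)$, which is where the exponent $\log_2 7$ comes from.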

From the point of view of combinatorics, computer science has been a blessing. 
Nearly everything which has been studied over the years has found applications in 
some areas of computer science. In a sense, combinatorics has become a sort of 
“set theory” for computer science. 


3. Some bounds 


Perhaps the first question which one is tempted to consider is the problem of the 
actual size of a set which guarantees the validity of Ramsey’s (and Ramsey type) 
theorem. One should try to resist this temptation since it is well known that 
Ramsey numbers are difficult to determine and even good asymptotic estimates 
are difficult to find (and improve). 

We have already mentioned this in connection with the Van der Waerden numbers $W(n)$. For Ramsey numbers the situation is not as dramatic: It follows from Erdős and Rado (1950) and Erdős et al. (1965) that both upper and lower bounds for $r(p,t,n)$ are of the form

$$t_p(c_t \cdot n),$$

where $t_p(x)$ is the two-variable version of the tower function: $t_p(x) = 2^{2^{\cdot^{\cdot^{\cdot^{2^{x}}}}}}$ (there are $p-1$ 2's in the stack. See Duffus et al. (1995) for recent improvements.) However, for the most important case $t = 2$, the Ramsey number $r(p,2,n)$ is bounded from below by the function $t_{p-1}$ only. The question whether
r(3, 2, n) has an exponential lower bound belongs to the outstanding problems of 
the area. The estimates of Ramsey numbers form a rich spectrum of results. 
Below we state some of them. 

Let us mention that this part of this chapter is expository since there are at least two recent surveys devoted to this area: the paper by Chung and Grinstead (1983) which concentrates on the exact results and the extensive paper by Graham and Rödl (1987) which concentrates mainly on the asymptotic bounds for most of the Ramsey-type results.

The progress has been rapid recently, and V. Rödl together with his coauthors systematically investigated asymptotic bounds for various Ramsey-related problems, see, e.g., Lefmann and Rödl (1993, 1995), Rödl and Ruciński (1993, 1995), Alon and Spencer (1992), Chen and Schelp (1993).

Recall that we denoted by $r(p,t,n)$ the minimum $N$ which satisfies $N \to (n)^p_t$. Some special cases have a customary notation, e.g., $r(n) = r(2,2,n)$ and $r(m,n)$ denotes the minimal number $N$ for which every graph $G$ with $N$ vertices either satisfies $\omega(G) \ge m$ or $\alpha(G) \ge n$ (thus $r(n,n) = r(n)$; $r(m,n)$ is the corresponding "off-diagonal number"). All known exact values of $r(m,n)$ are given in the following table:

m        3   3   3   3   3   3   3   4   4
n        3   4   5   6   7   8   9   4   5
r(m,n)   6   9  14  18  23  28  36  18  25


This is great progress in this otherwise seemingly intractable problem: Since the first version of this paper two more numbers, $r(4,5)$ and $r(3,8)$, were determined together with the first "hypergraph" Ramsey number $r(3,2,4)$. $r(3,8) = 28$ was determined by McKay and Zhang (1992), $r(3,2,4) = 13$ by McKay and Radziszowski (1991), and $r(4,5) = 25$ was announced by McKay and Radziszowski in March 1993. They started a massive computer-aided search and obtained strong improvements of bounds. For example for the much studied case $r(5)$ the current bounds are $43 \le r(5) \le 49$ (completed in June 1993). See Radziszowski (1993) for a list of recent results.
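The smallest entry of the table, $r(3,3) = 6$, can still be checked by exhaustive search, which gives some feeling for why the larger entries need massive computation. The following is a minimal sketch in Python; the function names are ours (hypothetical), and the search simply enumerates all $2^{\binom{n}{2}}$ two-colorings of the edges of $K_n$.

```python
from itertools import combinations, product


def has_mono_triangle(n, colour):
    """colour maps each pair (i, j) with i < j to 0 or 1; look for a one-coloured triangle."""
    return any(colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
               for a, b, c in combinations(range(n), 3))


def every_colouring_has_mono_triangle(n):
    """True if every 2-colouring of the edges of K_n contains a monochromatic triangle."""
    edges = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, bits))):
            return False  # found a colouring with no monochromatic triangle
    return True
```

The search fails for $n = 5$ (the pentagon and its complement form such a coloring) and succeeds for $n = 6$. Already for $r(4,4)$ the same approach would have to examine $2^{136}$ colorings of $K_{17}$, so the actual computations rely on far more refined methods.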

However, the problem of determination of exact values of $r(m,n)$ seems to be intractable in general. Perhaps motivated by this, Chvátal and Harary in a series of papers (see, e.g., Chvátal and Harary 1972) suggested studying "generalized Ramsey numbers" defined as follows: $r(G,H)$ is the minimum $N$ with the following property:

If the edges of $K_N$ are colored by blue and red then either there exists a blue subgraph isomorphic to $G$ or a red subgraph isomorphic to $H$. Also set $r(G) = r(G,G)$. Thus $r(K_m, K_n) = r(m,n)$. However, the number $r(G,H)$ seems to be easier to determine especially if one of the graphs $G$ and $H$ is a sparse graph. Let us list some particularly elegant examples:

(1) $r(T_m, K_n) = (m-1)(n-1) + 1$ (Chvátal 1977) (here $T_m$ is a fixed tree with $m$ vertices).

(2) $r(n \cdot K_3) = 5n$ for $n \ge 2$ (Burr et al. 1975) (here $n \cdot K_3$ is the disjoint union of $n$ triangles; of course for $n = 1$ we have a different formula).

(3) $r(G_n) \ge \lfloor (4n-1)/3 \rfloor$ for every connected graph $G_n$ with $n$ vertices and for every $n \ge 3$ there are graphs which prove that this inequality is sharp (Burr and Erdős 1976).
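The smallest non-trivial instance of formula (1) can be verified by brute force. The sketch below, in Python, takes for the tree the star with $m$ vertices (for $m = 3$ this is the only tree) and checks that every blue/red coloring of $K_5$ has a blue star with 3 vertices or a red triangle, while $K_4$ admits a coloring with neither, in agreement with $(3-1)(3-1)+1 = 5$. The function name is ours (hypothetical), and restricting to stars is a simplifying assumption: a general tree would need a proper subgraph search.

```python
from itertools import combinations, product


def always_blue_star_or_red_clique(N, m, n):
    """True if every blue/red colouring of K_N has a blue star on m vertices or a red K_n."""
    vertices = range(N)
    edges = list(combinations(vertices, 2))
    for bits in product((0, 1), repeat=len(edges)):  # 0 = blue, 1 = red
        colour = dict(zip(edges, bits))
        blue_degree = [0] * N
        for (u, v), c in colour.items():
            if c == 0:
                blue_degree[u] += 1
                blue_degree[v] += 1
        # A blue star on m vertices exists iff some vertex has blue degree >= m - 1.
        blue_star = max(blue_degree) >= m - 1
        red_clique = any(all(colour[e] == 1 for e in combinations(S, 2))
                         for S in combinations(vertices, n))
        if not (blue_star or red_clique):
            return False
    return True
```

The extremal coloring on 4 vertices is the expected one: the blue edges form a perfect matching and the red edges form a 4-cycle.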

These results were generalized and a great variety of related results was obtained. See, e.g., Burr and Erdős (1975), Erdős et al. (1985, 1989, 1995), Faudree and Simonovits (1992), Erdős and Faudree (1992), Chvátal et al. (1983), Sidorenko (1991, 1994). We want to mention the following two striking results (both relying heavily on Szemerédi's (1976) regularity lemma).


Theorem (Chvátal et al. 1983). For every $k$ there exists a constant $c_k$ such that $r(G) \le c_k \cdot n$ for every graph $G$ with $n$ vertices and maximal degree $\le k$.


Theorem (Chen and Schelp 1993). There exists a constant $c$ such that $r(G) \le c \cdot n$ for every planar graph $G$ with $n$ vertices.


Very recently Rödl and Thomas (1995) extended this result to graphs not
containing a subdivision of K,. 

These results are partial results towards a solution of the following conjecture of Burr and Erdős (1975).

Conjecture. For every constant $k$ there exists a constant $c(k)$ such that $r(G) \le c(k) \cdot n$ for every graph $G$ with $n$ vertices and at most $k \cdot n$ edges.


Let us mention another recent striking conjecture due to Loebl:

Conjecture. $r(T_n) \le 2n$ for every tree $T_n$ with $n$ vertices.


If one considers the distribution of papers within Ramsey theory then probably it is fair to guess that the number of papers devoted to generalized Ramsey numbers has the highest frequency. It is only fitting that Burr, Faudree and Schelp are planning to prepare a monograph devoted to this subject.

Let us return to the classical Ramsey numbers. We shall list a sample of 
asymptotic results. 


3.1. Upper asymptotics 


Generally, there is a big difference between proofs of lower and upper bounds for Ramsey numbers. The upper bounds yield a proof of Ramsey's theorem itself. There are not many variations here and consequently all upper bounds are inductive proofs. (In particular, there is no probabilistic upper bound.) Let us be more specific.


The classical Erdős and Szekeres (1935) bound

$$r(m+1, n+1) \le \binom{m+n}{m}$$

is a consequence of the recursion

$$r(m+1, n+1) \le r(m, n+1) + r(m+1, n) = \bigl((r(m, n+1) - 1) + (r(m+1, n) - 1) + 1\bigr) + 1$$

and the induction step is indicated in fig. 3.1.

Figure 3.1. (Labels: all blue, all red; $r(m, n+1)$, $r(m+1, n)$.)
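The passage from the recursion to the binomial bound can be checked mechanically. The following minimal sketch in Python iterates the recursion with the trivial boundary values $r(1,n) = r(m,1) = 1$ (our choice of boundary, stated as an assumption) and compares the result with the binomial coefficient. The function name `es_bound` is ours (hypothetical).

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def es_bound(m, n):
    """Upper bound on r(m, n) obtained by iterating r(m, n) <= r(m-1, n) + r(m, n-1)."""
    if m == 1 or n == 1:
        return 1  # a single vertex is both a blue K_1 and a red K_1
    return es_bound(m - 1, n) + es_bound(m, n - 1)
```

For instance `es_bound(3, 3)` gives 6, which is the true value of $r(3,3)$, while `es_bound(4, 4)` gives 20, slightly above the true value 18 from the table of exact values.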


The Erdős–Szekeres bound has been improved several times but only after fifty years was it shown that $r(m+1, n+1) = o\bigl(\binom{m+n}{m}\bigr)$, which was proved independently by Rödl and Thomason. Their bounds are the following:

(1) $r(m+1, n+1) \le c \binom{m+n}{m} / (\log(m+n))^{c'}$ for positive constants $c$ and $c'$ (Rödl 1986a). See Graham and Rödl (1987) for the following weaker result:

$$r(m+1, n+1) \le c \binom{m+n}{m} / \log\log(m+n).$$

(2) $r(n+1, n+1) \le \binom{2n}{n} / \sqrt{n}$ (Thomason 1987, 1988).

Figure 3.2. (Labels: all blue, all red.)

The basis of both proofs is similar in that they systematically use the following 
recursion (Walker 1971). 

We get either a blue $K_{m+1}$ or a red $K_{n+1}$ if either there exists a blue edge contained in $r(m-1, n+1)$ blue triangles or a red edge contained in $r(m+1, n-1)$ red triangles. See fig. 3.2.

This approach is convenient since the number of monochromatic triangles is 
estimated very accurately by the following result of Goodman (1959) which is 
interesting in its own right. 


Theorem 3.1. Denote by $c_3(G)$ the number of triangles which are contained in the graph $G$. Let $G$ be a graph with $n$ vertices and $m = p\binom{n}{2}$ edges. Then

$$c_3(G) + c_3(\bar{G}) \ge (p^3 + (1-p)^3)\binom{n}{3} - p(1-p)\binom{n}{2}, \qquad (3.1)$$

where $\bar{G}$ is the complement of the graph $G$.

Proof. Let $d_1, \ldots, d_n$ be the degrees of the vertices of $G$. We have $\sum_i d_i = 2p\binom{n}{2}$ and also

$$\sum_i d_i^2 \ge \frac{1}{n}\Bigl(\sum_i d_i\Bigr)^2.$$

Observe that $d_i \cdot (n-1-d_i)$ is the number of triples containing a given vertex $x$ of degree $d_i$ which fail to form a triangle in both $G$ and $\bar{G}$ regardless of whether the edge not containing $x$ belongs to $G$ or $\bar{G}$. Summarizing, we get

$$c_3(G) + c_3(\bar{G}) = \binom{n}{3} - \frac{1}{2}\sum_i d_i(n-1-d_i)$$
$$= \binom{n}{3} - \frac{n-1}{2}\sum_i d_i + \frac{1}{2}\sum_i d_i^2$$
$$\ge \frac{1}{6}\bigl[n(n-1)(n-2) - 3pn(n-1)^2 + 3p^2 n(n-1)^2\bigr]$$

which gives (3.1). □

(3.1) implies that every graph $G$ with $n$ vertices satisfies

$$c_3(G) + c_3(\bar{G}) \ge \frac{1}{4}\binom{n}{3} + o\Bigl(\binom{n}{3}\Bigr).$$

This is the expected estimate which we obtain for $c_3(G) + c_3(\bar{G})$ of the random graph $G$ (from 8 graphs on 3 points we are interested in exactly two of them).
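The degree-counting identity behind Goodman's theorem is easy to test numerically. The sketch below, in Python, counts the triangles of a graph and of its complement by brute force and compares the result with $\binom{n}{3} - \frac{1}{2}\sum_i d_i(n-1-d_i)$ and with the lower bound read off from (3.1) as $(p^3 + (1-p)^3)\binom{n}{3} - p(1-p)\binom{n}{2}$. All function names are ours (hypothetical), and that reading of (3.1) is stated here as an assumption.

```python
import random
from fractions import Fraction
from itertools import combinations
from math import comb


def mono_triangles(n, edges):
    """Triangles of G plus triangles of its complement, counted by brute force."""
    E = set(edges)
    count = 0
    for a, b, c in combinations(range(n), 3):
        present = ((a, b) in E) + ((a, c) in E) + ((b, c) in E)
        if present in (0, 3):  # all three pairs are non-edges, or all are edges
            count += 1
    return count


def goodman_count(n, edges):
    """The same number computed from the degree sequence only."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Each non-monochromatic triple is counted at exactly two of its vertices.
    mixed_twice = sum(d * (n - 1 - d) for d in degree)
    return comb(n, 3) - mixed_twice // 2


def goodman_lower_bound(n, m):
    """Right-hand side of (3.1) for a graph with n vertices and m edges."""
    p = Fraction(m, comb(n, 2))
    return (p**3 + (1 - p)**3) * comb(n, 3) - p * (1 - p) * comb(n, 2)


def random_graph(n, p, seed):
    """Edge list (pairs u < v) of a random graph; each edge is present with probability p."""
    rng = random.Random(seed)
    return [e for e in combinations(range(n), 2) if rng.random() < p]
```

For the 5-cycle both counts are 0, and the bound (3.1) is attained with equality, as it must be for a regular graph.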

More generally, for a fixed $l \ge 2$, denote by $c_l(G)$ the number of complete subgraphs of order $l$ in $G$ and put

$$c_l(n) = \min\Bigl\{ (c_l(G) + c_l(\bar{G})) \Big/ \binom{n}{l} \Bigr\},$$

where the minimum is taken over all graphs $G$ with $n$ vertices.

Ramsey's theorem implies that $c_l(n) > 0$ provided $n$ is sufficiently large. One can also easily observe that $c_l(n)$ is an increasing function and thus let $c_l$ denote the limit $\lim_{n \to \infty} c_l(n)$. Erdős (1962) conjectured that $c_l = 2 / 2^{\binom{l}{2}}$ for every $l \ge 2$.

This is true for $l = 2$ (trivially) and for $l = 3$ by virtue of the above result of Goodman.

Let us remark that this conjecture reflects again the value expected from a random coloring of $K_n$. However, the Erdős conjecture has been disproved for every $l > 3$ by Thomason (1987).

Let us remark that some off-diagonal numbers have been estimated rather efficiently. Perhaps the most important result is the following bound due to Shearer (1983) who improved and exploited a fundamental method of Ajtai et al. (1980).


Theorem 3.2.

$$r(n, 3) \le \frac{n^2}{\log(n/e)}.$$

This bound and the method of its proof have found many applications (see, e.g., Ajtai et al. 1981a,b and Komlós et al. 1981).


3.2. Lower asymptotics 


Lower bounds use different methods. 

In proving $r(n) > N$ we have to exhibit a graph $G$ with $N$ vertices (and $N > n$) such that neither $G$ nor its complement contains $K_n$. The search for $G$ is difficult and the best results are usually obtained by probabilistic methods. In fact it is possible to say that the problem of estimating lower bounds for Ramsey numbers has been the cradle of the probabilistic method in combinatorics. Since this is covered in chapter 33, we only add a few remarks. The best available exponential lower bound for $r(n)$ is

$$r(n) > (1 + o(1)) \frac{\sqrt{2}}{e}\, n\, 2^{n/2}.$$


However, the search for an explicit graph of size (say) 2° which would 
demonstrate this lower bound has been so far unsuccessful. This is not an entirely satisfactory situation since it is believed that such graphs share many properties with random graphs and thus they could be good candidates for various lower bounds, for example, in theoretical computer science for lower bounds for various measures of complexity. (See the papers Chung et al. 1989 and Thomason 1987 which discuss properties of pseudo- and quasi-random graphs; see also chapter 6.) This use of Ramsey's theorem (and Ramsey-type statements) is documented, e.g., by papers of Vilfan (1976), Pudlák (1984), Alon (1986). An interesting application of Ramsey's theorem is in a paper of Grincuk (1989) where he uses the infinite Ramsey theorem to prove a nonlinear lower bound to the size of switching circuits computing symmetric Boolean functions.

The quasi-random property of graphs involved in the lower bounds for Ramsey numbers was studied in the broader context by a number of authors (Erdős and Hajnal 1977, Nešetřil and Rödl 1979b, Rödl 1986b, Thomason 1987, Chung et al. 1989, Chung and Graham 1991). This may be viewed also as a part of the recent trend to derandomize (see Alon and Spencer 1992) the non-constructive proofs. As mentioned earlier, apart from the classical counting techniques mostly due to P. Erdős, in Ramsey theory other useful tools involve Lovász' local lemma (Erdős and Lovász 1975) and Szemerédi's regularity lemma (Szemerédi 1976), both of nonconstructive nature. These results recently found constructive versions (Alon et al. 1994, Beck 1991); see also an ad hoc argument in a Ramsey geometrical setting (Alon et al. 1995).

The best constructive lower bound for Ramsey numbers $r(n)$ is due to Frankl and Wilson. This improves on an earlier construction of Frankl (1977) who found a first constructive superpolynomial lower bound.

The construction of Frankl–Wilson graphs is simple:

Let $p$ be a prime number, put $q = p^3$. Define the graph $G_p = (V, E)$ as follows:

$$V = \binom{[q]}{p^2-1} = \{F \subseteq \{1, \ldots, p^3\}: |F| = p^2 - 1\},$$

$$\{F, F'\} \in E \text{ iff } |F \cap F'| \equiv -1 \pmod{p}.$$

The graph $G_p$ has $\binom{p^3}{p^2-1}$ vertices. However, the Ramsey properties of the graph $G_p$ are not trivial to prove: It follows only from deep extremal set theory results due to Ray-Chaudhuri and Wilson (1975) and Frankl and Wilson (1981) that neither $G_p$ nor its complement contain $K_n$ for $n > \binom{p^3}{p-1}$. (Compare also a recent article by Frankl 1990 and chapter 24.)
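For the smallest prime the Frankl–Wilson graph is small enough to inspect by computer. The sketch below, in Python, assumes the usual reading of the construction: the vertices are the $(p^2-1)$-element subsets of a $p^3$-element set, and two of them are adjacent iff their intersection has size congruent to $-1$ modulo $p$. This reading, the function names, and the use of a plain Bron–Kerbosch search are all our assumptions, not taken from the papers cited above.

```python
from itertools import combinations
from math import comb


def frankl_wilson_graph(p):
    """Vertices: (p^2 - 1)-subsets of a p^3-set; edges: |F & G| congruent to -1 mod p."""
    V = [frozenset(F) for F in combinations(range(p**3), p * p - 1)]
    adj = {F: set() for F in V}
    for F, G in combinations(V, 2):
        if len(F & G) % p == p - 1:
            adj[F].add(G)
            adj[G].add(F)
    return V, adj


def complement(V, adj):
    """Adjacency sets of the complementary graph."""
    all_vertices = set(V)
    return {F: all_vertices - adj[F] - {F} for F in V}


def clique_number(V, adj):
    """Size of a largest clique, by Bron-Kerbosch with pivoting."""
    best = 0

    def expand(size, P, X):
        nonlocal best
        if not P and not X:
            best = max(best, size)  # the current clique is maximal
            return
        pivot = max(P | X, key=lambda v: len(adj[v] & P))
        for v in list(P - adj[pivot]):
            expand(size + 1, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}

    expand(0, set(V), set())
    return best
```

For $p = 2$ the graph has $\binom{8}{3} = 56$ vertices, and under the stated reading neither the graph nor its complement should contain a clique on more than $\binom{8}{1} = 8$ vertices.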

Let us finally remark that until very recently the best lower bound for $r(n,3)$ was of order $n^2/(\log n)^2$ (Spencer 1977). Somewhat surprisingly Kim (1995) proved that the upper bound in Theorem 3.2 is tight up to a constant factor:


Theorem 3.3.

$$r(n, 3) \ge c\, \frac{n^2}{\log n}.$$

$r(n,3)$ is the only non-trivial infinite family of (classical) Ramsey numbers with known asymptotics.

The best constructive lower bound for Ramsey numbers $r(3,n)$ is of order $n^{3/2}$. This follows from a recent elegant construction given by Alon (1994), see also Chung et al. (1993).


3.3. Even larger numbers 


The bounds on Ramsey numbers indicate that Ramsey’s theorem is a rather 
“non-effective” statement which involves a study of very large sets. In fact, this 
non-effective character of Ramsey’s theorem has been known for a long time and 
especially the following infinite Ramsey theorem (Ramsey 1930) has been studied 
from this point of view. 


Infinite Ramsey theorem 3.4. For all positive integers $t$ and $p$ and every partition of the set $\binom{X}{p}$ of all $p$-element subsets of an infinite set $X$ into $t$ classes, one of the classes contains all $p$-element subsets of an infinite set $Y$.


Symbolically we write $\omega \to (\omega)^p_t$, $\omega$ standing for the order type of the set $\mathbb{N}$ of all natural numbers.


Proof. The proof of the Infinite Ramsey theorem is similar to the proof of the finite Ramsey theorem and, in fact, conceptually simpler:

We proceed by induction on $p$. The case $p = 1$ is the infinite pigeonhole principle: One of the classes of a finite partition of an infinite set has to be infinite.

In the induction step let $c: \binom{\mathbb{N}}{p} \to [t]$ be a given coloring. Define a coloring $c_0: \binom{\mathbb{N} \setminus \{0\}}{p-1} \to [t]$ by $c_0(A) = c(A \cup \{0\})$. Using the induction hypothesis there exists an infinite set $X_0 \subseteq \mathbb{N}$ such that $c_0$ restricted to $\binom{X_0}{p-1}$ is a constant mapping; let $i_0$ denote the constant. Set $X'_0 = X_0 - \{\min X_0\}$, set $x_1 = \min X_0$, and define the coloring

$$c_1: \binom{X'_0}{p-1} \to [t] \quad \text{by} \quad c_1(A) = c(A \cup \{x_1\}).$$

There exists an infinite $X_1 \subseteq X'_0$ such that $c_1$ restricted to $\binom{X_1}{p-1}$ is a constant denoted by $i_1$.

Continuing this way we construct elements $x_0 = 0 < x_1 < x_2 < \cdots$ and numbers $i_0, i_1, i_2, \ldots$. Define the coloring $c': \{x_0, x_1, x_2, \ldots\} \to [t]$ by

$$c'(x_j) = i_j.$$

Find an infinite $Y$ such that $c'$ restricted to $Y$ is a constant mapping and check that $Y$ is a $c$-homogeneous set. □


Only recently has the impact of the Infinite Ramsey theorem on finite combinatorics been recognized. Let us be more precise and let us briefly describe this interesting recent development on the border line of combinatorics and logic.

The theory of finite sets is related to the most common axiomatization of the natural numbers known as Peano arithmetic (PA).

The Peano axioms consist of the following formulas which express the basic properties of arithmetic:

(i) $\forall x\,(x + 0 = x)$

(ii) $\forall x \forall y\,(x + (y+1) = (x+y) + 1)$

(iii) $\forall x\,(x \cdot 0 = 0)$

(iv) $\forall x \forall y\,(x \cdot (y+1) = (x \cdot y) + x)$

(v) $\forall x \forall y\,(x < y \leftrightarrow \exists z\,((x+z)+1 = y))$

(vi) For any arithmetic formula $\Phi(x,y)$, it holds that

$$\forall x\,\bigl((\Phi(x,0) \wedge \forall y\,(\Phi(x,y) \to \Phi(x,y+1))) \to \forall y\,\Phi(x,y)\bigr).$$

(Here, a formula is arithmetic if it is built up from binary predicates $+, \cdot, <$ and constant symbols 0, 1 by logical connectives and quantifiers.)

These axioms express properties of natural numbers and many (in fact, at the time of the creation of the system it was believed all) true statements about natural numbers seem to be deducible from them.

Strictly speaking the formulas about natural numbers do not seem to capture properties of finite sets. However, one can encode finite sets by natural numbers and the finite set theory may be transformed into arithmetic. It appears that PA is equivalent to the theory of finite sets. This means "usual" set theory together with the negation of the axiom of infinity. (This axiom may be expressed as follows: There is no set $X$ which satisfies $x \in X \Rightarrow \{x\} \in X$.)

Most of finite combinatorics is carried out within the theory of finite sets. But not all, as is illustrated by the following.


Theorem 3.5 (Paris and Harrington 1977). (1) Let $p, t, n$ be positive integers. Then there exists an $N$ such that $N \xrightarrow{*} (n)^p_t$. Here the (modified) partition arrow $\xrightarrow{*}$ has the following meaning: For every partition of $\binom{[N]}{p}$ into $t$ classes there exists a set $Y \subseteq [N]$ with the following properties:

(i) $\binom{Y}{p}$ belongs to one of the classes of the partition;

(ii) $|Y| \ge n$;

(iii) $|Y| \ge \min Y$.

(2) The statement (1) cannot be proved within the theory of finite sets (or deduced from the Peano axioms).


Proof of (1). Suppose that (1) fails to be true for a particular choice of $p, t, n$. This means that for every $N$ there exists a "bad" coloring $c_N: \binom{[N]}{p} \to [t]$ such that for every $Y \subseteq [N]$ at least one of the conditions (i), (ii) and (iii) is violated. Observe that a restriction of a bad coloring to a subset is again a bad coloring. Thus for every $N$ we may find a bad coloring $c^N$ such that $c^i$ is a restriction of $c^j$ for $i < j$. Denote by $c$ the coloring of $\binom{\mathbb{N}}{p}$ which extends all the colorings $c^i$. Now invoke the Infinite Ramsey theorem: There exists an infinite set $X = \{x_1 < x_2 < \cdots\}$ such that all $p$-element subsets of $X$ are monochromatic under $c$. Put $r = \max\{x_1, n\}$. Then the set $Y = \{x_1, \ldots, x_r\}$ satisfies conditions (i), (ii) and (iii), a contradiction. □

Of course, (2) is the important part of the theorem of Paris and Harrington. Their original proof was motivated by a careful study of models of Peano arithmetic.


More combinatorial insight was gained by a different approach due to Ketonen and Solovay (1981). They proceeded as follows:

First, define a hierarchy of rapidly growing functions $f_\alpha: \omega \to \omega$, $\alpha < \varepsilon_0$. Here $\alpha$ is an ordinal number and $\varepsilon_0$ is the first ordinal number $\lambda$ which satisfies $\omega^\lambda = \lambda$; thus $\varepsilon_0$ is the tower of $\omega$'s of "height $\omega$":

$$\varepsilon_0 = \omega^{\omega^{\omega^{\cdot^{\cdot^{\cdot}}}}}.$$

The functions $f_\alpha$ are defined by transfinite induction on $\alpha$:


fi(n)=nt+1 ’ 
f\@)=2-a, 
Ein =he* 2 G)s 


n 


for a limit ordinal $\alpha$ we put

$$f_\alpha(n) = f_{\alpha_n}(n).$$


Here (for a limit ordinal number $\alpha$) $\{\alpha_n\}$ is the sequence with the quickest convergence to $\alpha$. It is here where one uses properties of ordinals $\alpha < \varepsilon_0$: Every such ordinal can be written uniquely in the Cantor normal form

$$\alpha = \omega^{\alpha_1} k_1 + \omega^{\alpha_2} k_2 + \cdots + \omega^{\alpha_s} k_s,$$

where $\alpha_1 > \cdots > \alpha_s \ge 0$ are ordinal numbers strictly less than $\alpha$ and $k_1, \ldots, k_s$ are positive integers. $\alpha_n$ is then defined by means of the last term $\omega^{\alpha_s} k_s$. For
example for a =w” +w°**5 we define 


3 +3 
a, =o" +0” 4+a°**n 


2 hd 
and fora =w° +w°”5 we define 


os : 


Besides these elements of ordinal arithmetic we shall make use of only one result of mathematical logic.


Theorem 3.6 (Wainer 1970). If $f: \omega \to \omega$ is recursive and provably total (in PA), then there exists $\alpha < \varepsilon_0$ and a positive integer $n_0$ such that $f(n) < f_\alpha(n)$ for all $n > n_0$ (i.e., $f$ is eventually dominated by a function from the hierarchy).
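To get a feeling for how quickly such a hierarchy grows, one may compute its finite levels. The sketch below, in Python, assumes the common convention $f_0(n) = n+1$ and $f_{k+1}(n) = f_k(f_k(\cdots f_k(n)\cdots))$ with $n$ iterations, and for the first limit ordinal $\omega$ the fundamental sequence $\omega_n = n$. These conventions and the function names are our assumptions and may differ in details (such as the indexing) from those of Ketonen and Solovay.

```python
def f(k, n):
    """Finite level k of the hierarchy: f_0(n) = n + 1, f_{k+1}(n) = f_k iterated n times on n."""
    if k == 0:
        return n + 1
    value = n
    for _ in range(n):
        value = f(k - 1, value)
    return value


def f_omega(n):
    """First limit level, using the fundamental sequence omega_n = n."""
    return f(n, n)
```

Under these conventions $f_1(n) = 2n$ and $f_2(n) = n \cdot 2^n$, while $f_3$ already has tower-type growth. Only very small arguments can be evaluated in practice, which is of course the point.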


Now finally consider the function $r^*(p,t,n)$ defined as follows:

$$r^*(p,t,n) = \min\{N: N \xrightarrow{*} (n)^p_t\}.$$

It follows from the first part of the Paris–Harrington theorem that $r^*(p,t,n)$ is a total function. Ketonen and Solovay (1981) now prove by induction on $\alpha$ that the diagonal function

$$r^*(p, p, p+1)$$

eventually dominates every function $f_\alpha$, $\alpha < \varepsilon_0$. Combining this with the Wainer theorem, this shows that $r^*(p,p,p+1)$ fails to be provably total recursive.

The Ketonen–Solovay approach is very interesting from a combinatorial point of view since it provides a combinatorial reason why the Paris–Harrington theorem fails to be provable: the Ramsey function $r^*(p,t,n)$ grows so fast that one cannot express it by standard operations.

In other words, the standard combinatorial approach to a problem "estimate the numbers $r^*(p,t,n)$" does not in general have a solution, at least in the sense in which it has been asked.

This is a new phenomenon in our old finite (safe and paradox-free) combinatorics.

This is not to say that one cannot obtain interesting results in special cases. Indeed, this has been done for numbers $r^*(n) = r^*(2,2,n)$ by Erdős and Mills (1981), where, in particular, it was proved that


ne" <n) <n 
for convenient constants a and B. 

Recently, Loebl and Nešetřil (1992) found a different and simple proof of the Ketonen–Solovay result. This has been found in a broader context and the author cannot resist mentioning one more result which follows the program of the combinatorial study of undecidability. (See Nešetřil and Thomas 1987, Paris 1990, and Loebl and Nešetřil 1991 for surveys of this development.)

One such example may be derived from Kruskal's theorem (see chapter 5 of this book): Let us denote by KT($c$) the following statement: "For every $k \ge 0$ there exists $n(k)$ with the following property: If $T_1, \ldots, T_{n(k)}$ are trees which satisfy $|T_i| \le k + c \log(i)$ for all $i = 1, \ldots, n(k)$, then there are two distinct indices $i$ and $j$ such that $T_i$ contains a subtree homeomorphic to $T_j$."

Now we have a result obtained by Loebl and Matoušek (1987).


Theorem 3.7. 
(i) The statement KT(c) is true for every c. 
(ii) The statement KT(1/2) is provable in the theory of finite sets. 
(iii) The statement KT(2) fails to be provable in the theory of finite sets. 


Sketch of Proof. (i) In fact much more is true. If $f: \mathbb{N} \to \mathbb{N}$ is any function then there exists $n(f)$ such that the following holds: If $T_1, \ldots, T_{n(f)}$ are trees satisfying $|T_i| \le f(i)$ then there are two distinct values $i$ and $j$ such that $T_i$ contains a subtree homeomorphic to $T_j$.

Suppose that this fails to be true for a particular function $f$. Thus, for every $n$, there exists a family $T^n_1, \ldots, T^n_n$ of trees which violates (i). But since there are only finitely many possibilities for $T^n_1$ we find an infinite set $I_1$ such that $T^n_1 = T_1$ for all $n \in I_1$. Then we find an infinite set $I_2 \subseteq I_1$ such that $T^n_2 = T_2$ for all $n \in I_2$, etc. For the resulting sequence $T_1, T_2, \ldots, T_n, \ldots$ we apply Kruskal's theorem and obtain two trees $T_i$ and $T_j$ one of which contains a homeomorphic copy of the other. However, $T_i = T^n_i$, $T_j = T^n_j$ for some $n$, which is a contradiction.

(ii) Follows from the fact that there are $\le 4^n$ pairwise non-isomorphic trees with $n$ vertices.

(iii) is based on the analysis of the so-called Hercules–Hydra game which was analysed in the papers Kirby and Paris (1982), Loebl (1988, 1992) and Nešetřil (1984). □


Theorem 3.7 strengthens an earlier result of Friedman (see Simpson 1985) and it is perhaps the most striking current combinatorial example of unprovability.

Let us end this chapter by the following remark: However surprising it may be at first glance, it is only fitting to mention in a chapter on Ramsey's theorem a result related to well-quasi-ordering theory (WQO). Both theories share many similarities (see, e.g., Leeb 1973) and several results have proved to be mutually fruitful. For example Nash-Williams proved the following strengthening of Ramsey's theorem in the context of WQO-theory.


Theorem 3.8 (Nash-Williams 1965). Let $M$ be a Sperner system of subsets of an infinite set $X$ (i.e., no two distinct elements of $M$ are inclusion-related). Then for every finite partition $M = M_1 \cup \cdots \cup M_t$ there exists an infinite subset $Y$ of $X$ such that all members of $M$ which are subsets of $Y$ belong to one of the classes of the partition.


(This generalizes the Infinite Ramsey theorem, as $\binom{X}{p}$ is a Sperner system.)
It is interesting to note (as observed already by Erdős and Rado) that Ramsey's theorem does not hold for partitions of infinite subsets in a very strict sense.


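Theorem 3.9 (Erdős, Rado). For every infinite set $X$ of cardinality $\kappa$ holds $\kappa \not\to (\omega)^\omega_2$.


Sketch of a proof (Nešetřil and Rödl 1985). For any well ordered set $X$ define a graph $G = (V, E)$ as follows:

$$V = \binom{X}{\omega}, \qquad \{A, B\} \in E \text{ iff } A = \{a_1 < a_2 < \cdots\} \text{ and } B = \{a_2 < a_3 < \cdots\}.$$

The graph does not contain a circuit, thus $\chi(G) \le 2$, which in turn induces a 2-coloring of $\binom{X}{\omega}$ violating $\kappa \to (\omega)^\omega_2$. □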


Although the argument is simple, it involves the axiom of choice. This nonconstructive feature of Ramsey-type theorems has been studied in great detail in the context of mathematical logic (Jockusch 1972). Also theorems 3.5 and 3.7 can be interpreted in this framework. In another important direction many results were obtained about the validity of $\omega \to (\omega)^\omega_2$ for constructive partitions. The constructive partitions are usually defined in topological terms as open, Borel or Baire partitions. Theorem 3.8 played an important role in this development of "topological Ramsey theorems" by Galvin, Prikry and others. We shall comment on this later in section 4.4. See the survey article by Carlson and Simpson (1990).

This development has a finite analogy: if we restrict partitions by a structural condition it is expected that a (much) larger homogeneous set will be found. Various types of such restrictions are provided by the following papers: Alon (1990) (solving a conjecture of Babai; partitions defined by polynomials); Erdős et al. (1983), Babai (1985), Alon et al. (1991) (various local conditions such as "anti-Ramsey" theorems: no intersecting edges get the same color); Sparks (1993) (definable colorings); Larman et al. (1992) (graphs defined by the intersection of $n$ convex sets have either $\omega$ or $\alpha \ge n^{1/5}$). The paper by Erdős and Hajnal (1977) is one of the papers which initiated the development of quasi-random graphs and it contains the following problem which may be viewed as a root question for this kind of problem.


Problem (Erdős and Hajnal 1977). Suppose $H$ is a fixed graph and a graph $G$ with $n$ vertices contains no induced subgraph isomorphic to $H$. Is it then necessarily true that either $\alpha(G) \ge n^\varepsilon$ or $\omega(G) \ge n^\varepsilon$ for some fixed $\varepsilon > 0$?


Finally, returning to our main theme, let us remark that recently Kříž and Thomas (1990) have applied the ordinal-type techniques (known in WQO) to countable Ramsey theory.


4. Full Ramsey-type theorems 


Ramsey’s and Van der Waerden’s theorems together with the companion results 
of Schur and Hales and Jewett are milestones of Ramsey theory. We call them full 
Ramsey-type theorems as they assert that every sufficiently large structure for every 
partition of a given type contains a given homogeneous structure (graph, cube, 
segment of integers, etc.). 

In other words, if we have a "bad partition" of a structure $S$ then there is an absolute bound on the size of $S$.

Natural examples of full Ramsey-type theorems are difficult to find and the list 
seems to be finite. With a few variations these are the theorems listed in this 
section. 


4.1. Rado sets of linear equations 


Both Schur's theorem and Van der Waerden's theorem fit the following more general schema: Let $A = (a_{ij})$ be an integer $m$ by $n$ matrix. Then for every $t \ge 1$, there exists an integer $N = N(A,t)$ which has the following property: If the set $\{1, 2, \ldots, N\}$ is partitioned into $t$ classes then in one class of the partition there is a solution of a system of equations:

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= 0 \\ a_{21}x_1 + \cdots + a_{2n}x_n &= 0 \\ &\ \,\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= 0 \end{aligned} \qquad (4.1)$$

For the Schur theorem consider the matrix A consisting of the single row 
A=(1,1,-1). 


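For the Van der Waerden theorem consider, e.g., the following $(n-1)$ by $(n+1)$
matrix, whose $i$th row encodes the equation $x_{i+1} - x_i - x_{n+1} = 0$:

$$
\begin{pmatrix}
-1 & 1 & 0 & \cdots & 0 & 0 & -1\\
0 & -1 & 1 & \cdots & 0 & 0 & -1\\
\vdots & & \ddots & \ddots & & \vdots & \vdots\\
0 & 0 & 0 & \cdots & -1 & 1 & -1
\end{pmatrix}
$$

which corresponds to a stronger statement (also conjectured by Schur, see the
introduction of Schur 1973, and proved by his student Brauer 1928): in an
arbitrary finite partition of the integers we can find in one of the classes an arithmetic
progression $a_0, a_0 + d, \ldots, a_0 + (n-1)d$ together with the difference $d$.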

We can abbreviate (4.1) by writing

$$Ax = 0, \qquad x = (x_1, \ldots, x_n)^T,$$

and, even more concisely, we can denote the system (4.1) of equations by
F(A). 

The basic (and at first glance perhaps too ambitious) problem is to characterize 
those integral matrices A for which a result analogous to Schur’s and Van der 
Waerden’s theorems holds. This leads to the following notions: 

The set of equations $Ax = 0$ is said to be partition regular if for any finite
partition of the positive integers $\mathbb{N}$ there is always a solution of the system (4.1) in
one of the classes. (By compactness, it does not matter whether we formulate the
finite or infinite version.)

Note that obviously not every set of equations is partition regular, e.g.,
consider $x = 2y - 1$ and a parity argument. However, it is perhaps
surprising that one can characterize all partition regular systems as follows:

An $m$ by $n$ matrix $A = (a_{ij})$ is said to satisfy the columns condition if it is
possible to order its column vectors $a_1, \ldots, a_n$ so that for some choice of indices
$1 \le n_1 < n_2 < \cdots < n_t = n$, if we set (with $n_0 = 0$)

$$b_i = \sum_{j = n_{i-1}+1}^{n_i} a_j, \qquad i = 1, \ldots, t,$$

then

(i) $b_1 = 0$,

(ii) for $1 < i \le t$, the vector $b_i$ can be expressed as a rational linear combination
of columns $a_j$, $1 \le j \le n_{i-1}$.

Now we can formulate the following. 


Theorem 4.1 (Rado 1933). The system $Ax = 0$ is partition regular if and only if $A$
satisfies the columns condition.
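
For small matrices the columns condition can be tested mechanically. The following Python sketch is only an illustration, not part of Rado's proof: it searches all ordered partitions of the columns by brute force, using exact rational arithmetic. The function names (`rank`, `in_span`, `satisfies_columns_condition`) and the sample matrices in the usage note are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations


def rank(vectors):
    """Rank over Q of a list of equal-length vectors (Gaussian elimination)."""
    rows = [list(v) for v in vectors]
    r = 0
    width = len(rows[0]) if rows else 0
    for col in range(width):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def in_span(vectors, target):
    """True if target is a rational linear combination of the given vectors."""
    return rank(vectors + [target]) == rank(vectors)


def satisfies_columns_condition(A):
    """Brute-force test of the columns condition for an integer matrix A."""
    m, n = len(A), len(A[0])
    cols = [[Fraction(A[i][j]) for i in range(m)] for j in range(n)]

    def search(used, remaining):
        # used: columns already placed in earlier blocks; remaining: the rest.
        if not remaining:
            return True
        rest = sorted(remaining)
        for size in range(1, len(rest) + 1):
            for block in combinations(rest, size):
                total = [sum(cols[j][i] for j in block) for i in range(m)]
                # The first block must sum to 0 (the span of no vectors is {0});
                # later block sums must lie in the span of earlier columns.
                if in_span([cols[j] for j in used], total):
                    if search(used + list(block), remaining - set(block)):
                        return True
        return False

    return search([], set(range(n)))
```

For instance, the Schur matrix `[[1, 1, -1]]` passes the test, while `[[1, 1, -3]]` (the equation x + y = 3z) does not. The search is exponential in the number of columns, so it is meant only for experimenting with small systems.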


In neither direction is this a trivial result. For a modern write-up see Graham et
al. (1980).

Perhaps it is interesting to note that in one direction (necessity) it is sufficient to
use partitions of integers modulo a prime. This is the easier part. In order to
prove sufficiency one generalizes tricks which are involved in a proof of the
Hales–Jewett theorem. However, most fitting in this context is to use Deuber’s
axiomatization of subsets of positive integers for which the analogue of Rado’s
theorem holds. More precisely: a set $X$ of positive integers is called large if for
any partition regular system $Ax = 0$ and any finite partition of $X$, one of the
classes of the partition contains a solution of $Ax = 0$.

Using this terminology Rado’s theorem reduces to saying that the set of positive 
integers is large. The following proved a long-standing conjecture of Rado (1933). 


Theorem 4.2 (Deuber 1973). If X is a large set then for every finite partition of X, 
one of the classes is again a large set. 


In order to obtain this result, Deuber provided the axiomatization of large sets 
by means of the following concept: 

A set $X$ of positive integers is called an $(m, p, c)$-set if there are positive
integers $y_1, \ldots, y_m$ such that $X$ is the set of all linear combinations of the form

$$\sum_{i=1}^{m} \lambda_i y_i,$$

where the vector $(\lambda_1, \ldots, \lambda_m)$ itself has the form (for some $i \le m$)

$$(0, \ldots, 0, c, \lambda_{i+1}, \ldots, \lambda_m)$$

and $|\lambda_j| \le p$, $j = i+1, \ldots, m$.
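
To make the definition concrete, here is a small Python sketch that lists the elements generated by given $y_1, \ldots, y_m$. It assumes the standard reading of the definition (coefficient vector $(0, \ldots, 0, c, \lambda_{i+1}, \ldots, \lambda_m)$ with $|\lambda_j| \le p$); the function name `mpc_set` and the sample values are hypothetical. Note that the generators must be chosen so that all combinations are positive, which the sketch does not enforce.

```python
from itertools import product


def mpc_set(ys, p, c):
    """All sums lambda_1*y_1 + ... + lambda_m*y_m with coefficient vector
    (0, ..., 0, c, lambda_{i+1}, ..., lambda_m) and |lambda_j| <= p."""
    m = len(ys)
    result = set()
    for i in range(m):  # position of the leading coefficient c
        for tail in product(range(-p, p + 1), repeat=m - i - 1):
            lam = (0,) * i + (c,) + tail
            result.add(sum(l * y for l, y in zip(lam, ys)))
    return result
```

With `ys = (10, 1)`, `p = 1`, `c = 1` the sketch produces a set containing 1, 10 and 11, that is, a solution of Schur's equation x + y = z built from the generators.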


It has been shown by Deuber (1973) (see Leeb 1973 for a simplified proof) that
for every choice of positive integers $m$, $p$, $c$ and $t$ there exist $M$, $P$, $C$ such that
any $t$-coloring of an $(M, P, C)$-set always contains a monochromatic $(m, p, c)$-set.

This implies Rado’s theorem and Deuber’s theorem since large sets may be
characterized by $(m, p, c)$-sets: a set $X$ is large if and only if it contains an $(m, p, c)$-set for
every choice of positive integers $m$, $p$, $c$.

We do not prove Theorem 4.2 here, see, e.g., Graham et al. (1980). Let us
indicate at least how the notion of an $(m, p, c)$-set naturally emerges: We prove
that a partition regular system of equations always has solutions which form an
$(m, p, c)$-set.

This is a consequence of the columns condition. Let us be more specific. Let $A$
be an $m$ by $n$ matrix satisfying the columns condition. Let $1 \le n_1 < n_2 < \cdots <
n_t = n$ be the corresponding partition of the columns of $A$ (denoted by $A_1, \ldots, A_t$).
Since we are interested in solutions of a system of equations $Ax = 0$, we may
assume that the rows of $A$ are linearly independent. Using the columns condition,
a set of $n - m$ linearly independent solutions of $Ax = 0$ has the following form:

$$
\begin{aligned}
&(\underbrace{1, \ldots, 1}_{n_1},\ 0, \ldots, 0,\ \ldots,\ 0, \ldots, 0)^T\\
&(\kappa_{2,1}, \ldots, \kappa_{2,n_1},\ \underbrace{1, \ldots, 1}_{n_2 - n_1},\ 0, \ldots, 0)^T\\
&\qquad\vdots\\
&(\kappa_{t,1}, \ldots, \kappa_{t,n_{t-1}},\ \underbrace{1, \ldots, 1}_{n_t - n_{t-1}})^T\\
&(\kappa_{t+1,1}, \ldots, \kappa_{t+1,n})^T\\
&\qquad\vdots\\
&(\kappa_{n-m,1}, \ldots, \kappa_{n-m,n})^T
\end{aligned}
$$

with rational entries. Thus we may multiply by a sufficiently large integer, say $c$,
to obtain linearly independent integer vectors of the form

$$
\begin{aligned}
x_1 &= (c, \ldots, c,\ 0, \ldots, 0,\ \ldots,\ 0, \ldots, 0)^T\\
x_2 &= (\lambda_{2,1}, \ldots, \lambda_{2,n_1},\ c, \ldots, c,\ 0, \ldots, 0)^T\\
&\;\;\vdots\\
x_{n-m} &= (\lambda_{n-m,1}, \ldots, \lambda_{n-m,n})^T.
\end{aligned}
$$

Set $p = \max |\lambda_{ij}|$. Now let $x$ be a solution of $Ax = 0$. Then there are $y_1, \ldots, y_{n-m}$
such that

$$x = \sum_{i=1}^{n-m} y_i x_i$$

and thus each entry of $x$ belongs to an $(n - m, p, c)$-set. □


The above results were generalized in several directions, some of which we shall
describe below. Other generalizations are covered by Deuber (1975b), Voigt
(1980), Bergelson et al. (1991, 1995) (non-homogeneous systems of equations,
infinite systems, equations in general algebraic structures).


4.2. Parameter sets (the Graham–Rothschild theorem)


The partition theorem for parameter sets generalizes the Hales–Jewett theorem in
the same way that Ramsey’s theorem generalizes the pigeonhole principle.
Parameter sets are higher-dimensional analogues of combinatorial lines. We state
the theorem after introducing the necessary notions.

Throughout this section let $A = \{1, 2, \ldots, a\}$ be fixed. We call $A$ the alphabet.

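Consider the $N$-dimensional cube $A^N$ over $A$. An $n$-parameter set in $A^N$ is a set
$\mathcal{A}$ of the following form: there are non-empty disjoint subsets $\Lambda_1, \Lambda_2, \ldots, \Lambda_n$ of
$\{1, 2, \ldots, N\}$ and a point $f^0 = (f^0_1, \ldots, f^0_N)$ of $A^N$ such that $\mathcal{A}$ is the set of all
points $f \in A^N$ which satisfy

$$
f_i = \begin{cases}
f^0_i & \text{for } i \notin \bigcup_{j=1}^{n} \Lambda_j,\\
f_{i'} & \text{for } i, i' \in \Lambda_j,\ j = 1, \ldots, n.
\end{cases}
$$

We also say that the parameter set $\mathcal{A}$ has been determined by $(f^0, \Lambda_1, \ldots, \Lambda_n)$.
We always choose the order of $\Lambda_1, \ldots, \Lambda_n$ such that

$$\min \Lambda_1 < \min \Lambda_2 < \cdots < \min \Lambda_n.$$

$\Lambda_i$ is called the $i$th moving coordinate. It is clear that $|\mathcal{A}| = a^n$ and we can
schematically visualize $\mathcal{A}$ by a picture like fig. 4.1.

$A^N$ itself is an $N$-parameter set. If $\mathcal{A}$ is an $n$-parameter set and $p \le n$, denote by
$\binom{\mathcal{A}}{p}$ the set of all $p$-parameter subsets of $\mathcal{A}$.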
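
The definition can be illustrated by enumerating the points of a parameter set directly. The Python sketch below is an illustration under assumed conventions: coordinates are 0-based, the fixed point is given as a tuple with `None` at the moving positions, and the name `parameter_set` is hypothetical.

```python
from itertools import product


def parameter_set(alphabet, f0, blocks):
    """Points of the parameter set determined by (f0, blocks).

    alphabet: the letters of A
    f0:       tuple of length N, with None at the moving positions
    blocks:   list of disjoint non-empty sets of 0-based coordinates
              (the moving coordinates)
    """
    points = []
    for values in product(alphabet, repeat=len(blocks)):
        point = list(f0)
        for block, value in zip(blocks, values):
            for i in block:
                point[i] = value  # all coordinates of a block move together
        points.append(tuple(point))
    return points
```

For example, over the alphabet `(1, 2, 3)` with `N = 4`, the call `parameter_set((1, 2, 3), (None, 2, None, None), [{0, 3}, {2}])` returns the $3^2 = 9$ points of a 2-parameter set, in agreement with the count $a^n$.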

It is fair to say that one of the turning points in Ramsey theory was the 
following theorem established by Graham and Rothschild (1971). 


Theorem 4.3 (Ramsey’s theorem for parameter sets). Let $n$, $t$ be positive integers,
let $p$ be a nonnegative integer, and let $A$ be a finite alphabet. Then there exists
$N = GR(A, p, t, n)$ with the following property: If $C$ is an $N$-parameter set and $\binom{C}{p}$
is colored by $t$ colors then there exists an $n$-parameter set $\mathcal{B} \in \binom{C}{n}$ for which $\binom{\mathcal{B}}{p}$ is a
monochromatic set.


We denote the validity of this statement again by $N \to (n)^p_t$, having in mind that
this arrow is shorthand notation which should be interpreted for parameter sets.
The proof of the Ramsey property of parameter sets was originally rather difficult 
(Graham and Rothschild 1971); see Leeb (1973) and Deuber and Voigt (1982) for 
simplified proofs. We give here a proof which proceeds by complete analogy with 
Proof II of Ramsey’s theorem stated in section 1. 


Figure 4.1. (Schematic picture of a parameter set with moving coordinates $\Lambda_1$, $\Lambda_2$, $\Lambda_3$.)

Proof. The beginning of a $p$-parameter set $\mathcal{A}$ which is determined by
$(f^0, \Lambda_1, \ldots, \Lambda_p)$ is the function $(f^0_1, \ldots, f^0_{\min \Lambda_1 - 1})$; $\min \Lambda_1$ is the length of the
beginning. We say that a coloring of $\binom{C}{p}$ is good, or that $\binom{C}{p}$ is well-colored, if any
two $p$-parameter sets with the same beginning have the same color.

Our proof proceeds by induction on $p$. The theorem for $p = 0$ reduces to the
Hales–Jewett theorem. In the induction step let us assume the validity of the
theorem for $p - 1$ and arbitrary $A$, $n$, $t$. Ramsey’s theorem for parameter sets
then follows from the following two claims.


Claim A (Sufficiency of well-coloring). The following two statements are equivalent:
(1) For every $n$, $t$, there exists $N$ such that if $C$ is an $N$-parameter set and $\binom{C}{p}$ is
colored by $t$ colors then there exists $\mathcal{B} \in \binom{C}{n}$ which is well-colored; we denote this by
$N \xrightarrow{\text{good}} (n)^p_t$.
(2) $N \to (n)^p_t$ (for parameter sets).


Claim B (Good well-coloring lemma). For any $n$, $m$, $t$ there exists an $N$ such that if
$C$ is an $N$-parameter set then the following holds: For every $t$-coloring of $\binom{C}{p}$ there
exists an $n$-parameter set $\mathcal{B} \in \binom{C}{n}$ determined by $(f^0, \Lambda_1, \ldots, \Lambda_n)$ such that if $\mathcal{A}$
and $\mathcal{A}'$ are $p$-parameter subsets of $\mathcal{B}$ with the same beginning of length at most
$\min \Lambda_m$, then $\mathcal{A}$ and $\mathcal{A}'$ have the same color.
We denote this statement by $N \xrightarrow{\text{good},\,m} (n)^p_t$. Clearly, setting $n = m$ we obtain
$N \xrightarrow{\text{good}} (n)^p_t$.


Proof of Claim B. We use the induction assumption for the Graham–Rothschild
theorem for $p - 1$ (for arbitrary choice of $A$, $n$, $m$, $t$) and then (for given $p$, $t$, and
$n$ arbitrary) we proceed by induction on $m$. We allow $m = 0$, in which case the
statement is trivial ($N = n$). Set $|A| = a$.

In the induction step, let $t$, $n$ be given. Put

$$N_1 = GR(a+1,\ p-1,\ t^{a^m},\ n-m), \qquad N_2 = N_1 + m + 1, \qquad N \xrightarrow{\text{good},\,m} (N_2)^p_t.$$

We prove

$$N \xrightarrow{\text{good},\,m+1} (n)^p_t.$$

Without loss of generality, set $C = A^N$ and let $c$ be a fixed $t$-coloring of $\binom{C}{p}$.
First apply the definition of $N$ and let $\mathcal{B}'$ be an $N_2$-parameter subset of $C$
determined by $(f^0, \Lambda_1, \ldots, \Lambda_{N_2})$ such that any two $p$-parameter sets in $\mathcal{B}'$ with the
same beginning of length at most $\min \Lambda_m$ have the same color (cf. the definition of
$\xrightarrow{\text{good},\,m}$). Set $\lambda = \min \Lambda_{m+1}$.

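Now consider the set $\mathcal{Q}$ of all $p$-parameter subsets $(g^0, \Pi_1, \ldots, \Pi_p)$ of $\mathcal{B}'$ for
which $\min \Pi_1 = \lambda$. Every such $p$-parameter set $\mathcal{A}$ (over $A$) may be considered as a
$(p-1)$-parameter set $\bar{\mathcal{A}}$ determined by $(\bar{g}^0, \Pi_2, \ldots, \Pi_p)$ over (the enriched
alphabet) $A \cup \{a+1\} = \bar{A}$ by simply setting:

$$
\bar{g}^0_i = \begin{cases}
g^0_i & \text{if } g^0_i \text{ is defined},\\
a + 1 & \text{if } i \in \Pi_1 - \{\lambda\}.
\end{cases}
$$

Now define a coloring $\bar{c}$ of $\binom{\bar{A}^{N_1}}{p-1}$ by

$$\bar{c}(\mathcal{A}') = (c(\mathcal{A});\ \mathcal{A} \in \mathcal{Q},\ \bar{\mathcal{A}} = \mathcal{A}').$$

Given $\mathcal{A}' \in \binom{\bar{A}^{N_1}}{p-1}$ there are (exactly) $a^m$ parameter sets $\mathcal{A}$ for which $\bar{\mathcal{A}} = \mathcal{A}'$, and
thus we apply the definition of $N_1$ to obtain an $(n - m)$-parameter set $\bar{\mathcal{B}}$ in $\binom{\bar{A}^{N_1}}{n-m}$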
which is ¢-monochromatic. Let B be determined by (f , A,,.,,---,A,)- Define 
the n-parameter set B given by (A", ,,..., 0) by 


0 


h° =f! if f° is defined, 
» =A, fori<m, 


> =f fl =at uy, 


> =A, fori=mt,...,n. 
It can be easily checked (mainly from the definition of $\bar{c}$) that any two
$p$-parameter subsets of $\mathcal{B}$ with beginning of length at most $\min \Lambda_{m+1}$ have the same color
(in the coloring $c$). □


Proof of Claim A. Clearly (2) ⇒ (1). The reverse implication follows from
the following truncated version of the Hales–Jewett theorem (isolated first by
Voigt 1980), which thus plays the role of the pigeonhole principle in Proof II of
Ramsey’s theorem. This we state as Claim C.

First let us introduce the following: Let $A^{\le N}$ denote the set $\bigcup_{i \le N} A^i$. If $\mathcal{B}$ is
an $n$-parameter set determined by $(f^0, \Lambda_1, \ldots, \Lambda_n)$ then a partial point of $\mathcal{B}$ is a
point $x$ of $A^m$ for an $m \le N$ which would become a point of $\mathcal{B}$ if we define its
coordinates for $i > m$ by suitable constants (either $f^0_i$ or, say, 1). Now we can
formulate the following.


Claim C (Hales–Jewett’s pigeonhole). Let $A$, $t$, $n$ be fixed. Then there exists
$N = HJ^*(n, t)$ such that for every $t$-coloring of $A^{\le N}$ there exists an $n$-parameter
subset $\mathcal{B}$ of $A^N$ such that the set of all partial points in $\mathcal{B}$ is monochromatic.


Clearly, in order to finish the proof of Theorem 4.3 it suffices to prove Claim C.

Proof of Claim C. Without loss of generality assume $t = 2$ and denote by $N =
N(n_1, n_2)$ a positive integer such that for every 2-coloring of $A^{\le N}$, there exists an
$i \in \{1, 2\}$ and an $n_i$-parameter subset $\mathcal{B}$ of $A^N$ whose partial points are
monochromatic. We proceed by induction on $n_1 + n_2$. In the induction step we
prove

$$N(n_1, n_2) \le HJ(A, 2^x, 1) + N = M + N,$$

where we put $x = 1 + |A^{\le N}|$ and $N = \max\{N(n_1 - 1, n_2),\ N(n_1, n_2 - 1)\}$.
Towards this end, let $c$ be a 2-coloring of $A^{\le M+N}$. Define a $(2^x)$-coloring $c'$ of $A^M$ by

$$c'(f) = (c(f, g);\ g \in A^{\le N}) \cup (c(f)).$$

Let $\mathcal{B}'$ be a $c'$-monochromatic line in $A^M$. In particular, $\mathcal{B}'$ is a $c$-monochromatic
line, e.g., let $c(f) = 1$ for any $f \in \mathcal{B}'$. Now define the coloring $c''$ by

$$c''(g) = c(f, g)$$

for some (any) $f \in \mathcal{B}'$.

By assumption there exists $\mathcal{B}'' \in \binom{A^N}{m}$ for which $\mathcal{B}''^{\le N}$ is $c''$-monochromatic. If
$m = n_2$ and $\mathcal{B}''^{\le N}$ is colored by 2, we are done since all partial points of $\mathcal{B}''$ have to
be colored by 2.

If $m = n_1 - 1$ and $\mathcal{B}''^{\le N}$ is colored by 1 then we combine $\mathcal{B}'$ and $\mathcal{B}''$ to a $\mathcal{B} \in \binom{A^{M+N}}{n_1}$
with $\mathcal{B}^{\le M+N}$ colored by 1. □


Let us remark at the end of this section that we could define $n$-parameter sets in
a slightly more technical way: Let $A$ be a non-empty set and $B$ a (possibly empty)
subset of $A$.

An $n$-parameter set (with respect to $B$) in $A^N$ is a non-empty set $\mathcal{A}$ of the
following form: there are non-empty disjoint subsets $\Lambda_1, \ldots, \Lambda_n$ of $\{1, 2, \ldots, N\}$
and a point $f^0 = (f^0_1, \ldots, f^0_N)$ of $B^N$ such that $\mathcal{A}$ is just the set of all points
$f \in A^N$ which satisfy

$$
f_i = \begin{cases}
f^0_i & \text{for } i \notin \bigcup_{j=1}^{n} \Lambda_j,\\
f_{i'} & \text{for } i, i' \in \Lambda_j,\ j = 1, \ldots, n.
\end{cases}
$$

(Thus we just assume that the “affine shift” belongs to $B^N$.) Note that if $B = \emptyset$
then this definition implies that $(\Lambda_1, \ldots, \Lambda_n)$ forms a partition of $\{1, \ldots, N\}$.
For fixed $B$, $\mathcal{A}$, denote again by $\binom{\mathcal{A}}{p}$ the set of all $p$-parameter sets (with respect
to $B$) in $\mathcal{A}$.
Note that the above proof actually gives the following slightly stronger result 
(also proved by Graham and Rothschild 1971). 


Theorem 4.4. Let $n$, $t$, $p$ be positive integers, $A$ a finite alphabet, $B$ a subset of $A$.
Then there exists $N = GR(A, B, p, t, n)$ with the following property: If $C$ is an
$N$-parameter set (w.r.t. $B$) and $\binom{C}{p}$ is colored by $t$ colors then there exists an
$n$-parameter set $\mathcal{B} \in \binom{C}{n}$ for which $\binom{\mathcal{B}}{p}$ is a monochromatic set.


We shall make use of this refinement in section 4.4. 
In fact, Graham and Rothschild actually prove a version of the parameter set
theorem in which a finite permutation group is allowed to act on the entries of the
elements (see Graham and Rothschild 1971 for details).


4.3. Vector and affine spaces 


The present author thinks that the Graham–Rothschild Theorem 4.3 on parameter
sets was the first theorem of the “new Ramsey theory” age: the first theorem
with a proof which displayed the richness of the field of structural extensions of
Ramsey theory and attracted so much attention to it. Moreover, it led directly,
using a convenient formalism, to a solution of the following problem which has
been responsible for much of the development in the sixties:


Rota’s conjecture. The analogue of Ramsey’s theorem holds for finite vector
spaces.


This was a nice and natural problem since there are many similarities between 
the structure of subsets of a set and the structure of subspaces of a finite vector 
space. To be more specific, let $F = GF(q)$ be a finite field with $q$ elements, $V$ an
$n$-dimensional vector space over $F$. Denote by $\binom{V}{p}$ the set of all $p$-dimensional
vector subspaces of $V$.

Using this, Rota’s conjecture looks formally similar to Ramsey’s theorem: For
every finite field $F$ and every choice of positive integers $p$, $t$, $n$ there exists
$N = N(p, t, n)$, such that for every $t$-coloring of $\binom{V}{p}$ for an $N$-dimensional vector
space $V$ over $F$, there exists an $n$-dimensional subspace $U$ such that $\binom{U}{p}$ is
monochromatic.

Let us remark that although the points of $V$ (i.e., $\binom{V}{0}$) form just the set $F^n$ (or $\binom{F^n}{0}$,
i.e., the set of all 0-parameter sets) this analogy does not hold for $p > 0$: Every
$p$-parameter set may be regarded as a $p$-dimensional vector subspace of $V$;
however, this does not hold in the other direction. (Figure 4.2 schematically
illustrates a 2-dimensional subspace of $V$; compare it with fig. 4.1.) Motivated by
this analogy 1-parameter sets are called combinatorial lines, etc.

Note that the size of $\binom{V}{p}$, denoted by $\begin{bmatrix} n \\ p \end{bmatrix}_q$ (a Gaussian coefficient), shares many
properties with binomial coefficients. In particular, a formally similar definition is:

$$
\begin{bmatrix} n \\ p \end{bmatrix}_q = \frac{(q^n - 1)(q^n - q) \cdots (q^n - q^{p-1})}{(q^p - 1)(q^p - q) \cdots (q^p - q^{p-1})}.
$$

Figure 4.2.

However, these analogies were not of much help in solving the problem and it 
took nearly 10 years before Rota’s problem was solved. 


Theorem 4.5 (Graham et al. 1972). For every finite field $F$ and any choice of
nonnegative integers $p$, $n$, $t$ there exists $N = GLR(p, n, t)$ such that for every
$t$-coloring of $\binom{V}{p}$ for an $N$-dimensional vector space $V$ over $F$, there exists an
$n$-dimensional subspace $U$ such that the set $\binom{U}{p}$ is monochromatic.


The original proof relied on categorical formalism and was not easy. A simpler
proof was given by Spencer (1979), while a proof along the lines given in the
preceding section (via good colorings) was given by Deuber and Voigt (1982).

A companion result deals with affine spaces and is also due to Graham et al. 
(1972). 

The mutual interplay of parameter sets and of vector and affine spaces, or to
put it in other words, of combinatorial and “geometrical” lines, is very
interesting. For the coloring results (Ramsey-type theorems) the algebraic
structure involved in Rota’s conjecture presented difficulty, and thus this
conjecture was solved later than its combinatorial counterpart (i.e., parameter sets);
for density theorems it has been the other way around. The density version of
Theorem 4.5 was established by the following:


Theorem 4.6 (Furstenberg and Katznelson 1985). For every finite field $F$, positive
integer $n$ and positive real $\varepsilon > 0$ there exists a positive integer $N = KF(\varepsilon, F, n)$
with the following property: If $X$ is a set in an $N$-dimensional vector space of size at
least $\varepsilon \cdot |F|^N$ then $X$ contains an $n$-dimensional affine subspace.


But the density version of the Hales–Jewett theorem appeared much harder. Although
the Boolean case was settled (Brown and Buhler 1982, Rödl 1982) and also some
reductions were known (Brown and Buhler 1984), the lack of symmetries of cubes
$A^N$ left the problem open for several years (well after the first version of this
paper was written). The problem was solved at the beginning of 1990:


Theorem 4.7 (Density of Hales–Jewett’s theorem; Furstenberg and Katznelson
1991). For every alphabet $A$, positive integer $n$ and positive real $\varepsilon > 0$ there exists a
positive integer $N = N(\varepsilon, A, n)$ with the following property: if $X$ is a subset of $A^N$
of size at least $\varepsilon \cdot |A|^N$ then $X$ contains an $n$-parameter set.


The difficulty of handling combinatorial lines in density questions is a bit
surprising since in most Ramsey-type questions combinatorial and vector
(affine) structures are usually closely related. In fact there exist just a few
instances of problems where these two structures differ.

One of them is so-called selectivity, see, e.g., Erdős et al. (1984b). Namely, if
the points of a large dimensional vector-space are colored by an arbitrary number 
of colors then one of the n-dimensional subspaces is either monochromatic or 
totally multicolored (meaning that no two points have the same color). 

A similar selective property holds for Van der Waerden’s theorem (as shown by
Erdős and Graham 1980) and for Rado’s theorem (as shown by Lefmann 1986).

However, this does not hold for Hales–Jewett cubes: Consider an arbitrary
Hales–Jewett cube $A^N$, $A = \{1, 2, \ldots, a\}$, $a \ge 3$. Put $\varepsilon(1) = 1$ and let
$\varepsilon(2) = \varepsilon(3) = \cdots = \varepsilon(a)$ be a common integer value different from 1. Define a
coloring $c$ of $A^N$ by

$$c(x_1, \ldots, x_N) = \sum_{i=1}^{N} \varepsilon(x_i)\, 3^i.$$

One sees easily that while there is no monochromatic line, in every line two points
get the same color. This instance also indicates the lack of symmetries of
Hales–Jewett cubes.
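
The two claimed properties of this coloring can be confirmed by exhaustive search on small cubes. The Python sketch below does so; it assumes, purely for illustration, that the common value of the letter weights on $2, \ldots, a$ is 0, and the names `eps`, `color`, `combinatorial_lines` and `neither_mono_nor_rainbow` are hypothetical.

```python
from itertools import product


def eps(letter):
    # Assumption for this sketch: weight 1 for the letter 1, weight 0 otherwise.
    return 1 if letter == 1 else 0


def color(point):
    """The coloring c(x_1, ..., x_N) = sum of eps(x_i) * 3^i (i counted from 1)."""
    return sum(eps(x) * 3 ** (i + 1) for i, x in enumerate(point))


def combinatorial_lines(a, N):
    """Yield every combinatorial line of {1..a}^N as a list of a points.
    A line is given by a template with at least one moving position '*'."""
    for template in product(list(range(1, a + 1)) + ["*"], repeat=N):
        if "*" in template:
            yield [tuple(v if t == "*" else t for t in template)
                   for v in range(1, a + 1)]


def neither_mono_nor_rainbow(a, N):
    """True if no line is monochromatic and no line is totally multicolored."""
    for line in combinatorial_lines(a, N):
        colors = {color(p) for p in line}
        if len(colors) == 1 or len(colors) == len(line):
            return False
    return True
```

Calling `neither_mono_nor_rainbow(3, 3)` checks all 37 lines of the cube $\{1, 2, 3\}^3$. On each line the point with moving value 1 differs in color from the others, while the points with moving values 2 and 3 agree.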

The full discussion of coloring patterns which may occur in unrestricted
colorings is the subject of canonical Ramsey theory, named after the Canonical
partition lemma due to Erdős and Rado (1952).


Theorem 4.8 (Canonical Lemma, Erdős and Rado 1952). For every choice of
positive integers $p$, $n$ there exists $N(p, n)$ such that for every set $X$, $|X| \ge N(p, n)$,
and for every coloring $c: \binom{X}{p} \to \mathbb{N}$ (i.e., coloring by arbitrarily many colors) there
exists a set $Y \subseteq X$, $|Y| = n$, such that the coloring $c$ restricted to the set $\binom{Y}{p}$ is
canonical. Here a coloring $c$ is said to be canonical if there exists a set $\omega \subseteq [p]$ such
that

$$c(\{x_1 < \cdots < x_p\}) = c(\{x'_1 < \cdots < x'_p\}) \quad \text{if and only if} \quad x_i = x'_i \text{ for } i \in \omega.$$

(Thus there are $2^p$ canonical coloring patterns.)
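
The count of $2^p$ patterns can be made tangible with a short enumeration. In the Python sketch below, a canonical coloring for a given $\omega$ colors a $p$-set by the tuple of its elements at the positions in $\omega$; the induced partitions of the $p$-subsets of a small ordered set are then compared. The names `canonical_color`, `pattern` and `all_patterns` are assumptions made for this illustration.

```python
from itertools import combinations


def canonical_color(subset, omega):
    """Color of a p-set: its elements at the (1-based) positions listed in omega."""
    xs = sorted(subset)
    return tuple(xs[i - 1] for i in sorted(omega))


def pattern(Y, p, omega):
    """Partition of the p-subsets of Y into color classes under omega."""
    classes = {}
    for s in combinations(sorted(Y), p):
        classes.setdefault(canonical_color(s, omega), []).append(s)
    return frozenset(frozenset(c) for c in classes.values())


def all_patterns(Y, p):
    """The set of distinct canonical patterns on the p-subsets of Y."""
    positions = range(1, p + 1)
    omegas = [set(o) for k in range(p + 1) for o in combinations(positions, k)]
    return {pattern(Y, p, o) for o in omegas}
```

For pairs ($p = 2$) the four patterns are the constant coloring, coloring by the minimum, coloring by the maximum, and the one-to-one coloring; `len(all_patterns(range(5), 2))` evaluates to 4.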


Coloring patterns for Hales–Jewett cubes were determined by Prömel and
Voigt (1983), see also the survey Prömel and Voigt (1985).

Erdés and Rado’s proof yiclds the upper bound of order £,, (while obviously 
$N(p, n) \ge r(p, n-1, n)$). Recently Lefmann and Rödl attacked the problem of
estimating the numbers $N(p, n)$. They solved the problem for pairs (Lefmann and
Rödl 1993, 1995) and very recently Shelah (1994) solved the problem for every $p$
by showing that both upper and lower bounds use $p - 1$ exponentiations. For an
elegant proof of the Erdős–Rado Canonization lemma see also Rado (1986).

Due to space limitations we cannot describe these interesting results in greater
detail. This chapter belongs to the “Aspects” part of this Handbook of
Combinatorics but we are unable to cover even one aspect of a single theory.


4.4. Finite union theorem and other variations


Several particular cases of partition regular systems of equations were studied for
their own sake. Apart from Van der Waerden’s theorem this is particularly true of
the Finite union theorem.


Theorem 4.9 (Rado 1933, Folkman, Sanders 1968). For every choice of positive 
integers n, t, there exists N=FU(n,t) with the following property: For every 
partition of {1,...,N} into t classes one of the classes contains n distinct numbers 
together with all their sums. 
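
For very small parameters the number $FU(n, t)$ can be found by exhaustive search. The Python sketch below is a brute-force illustration only (it is exponential in $N$), and the names `subset_sums`, `has_mono_sum_set` and `smallest_FU`, as well as the search limit, are assumptions made for this example.

```python
from itertools import combinations, product


def subset_sums(numbers):
    """All sums of non-empty subsets of the given numbers."""
    sums = set()
    for k in range(1, len(numbers) + 1):
        for combo in combinations(numbers, k):
            sums.add(sum(combo))
    return sums


def has_mono_sum_set(coloring, n):
    """coloring maps each of 1..N to a class. True if some class contains
    n distinct numbers together with all their sums."""
    N = len(coloring)
    for numbers in combinations(range(1, N + 1), n):
        sums = subset_sums(numbers)
        if max(sums) <= N and len({coloring[s] for s in sums}) == 1:
            return True
    return False


def smallest_FU(n, t, limit=12):
    """Smallest N <= limit such that every t-coloring of {1..N} has a
    monochromatic n-set with all its sums; None if no such N <= limit."""
    for N in range(1, limit + 1):
        if all(has_mono_sum_set(dict(zip(range(1, N + 1), cols)), n)
               for cols in product(range(t), repeat=N)):
            return N
    return None
```

For $n = 2$ and $t = 2$ the search asks for distinct $x$, $y$ with $x$, $y$, $x + y$ in one class. The partition $\{1, 2, 4, 8\} \cup \{3, 5, 6, 7\}$ of $\{1, \ldots, 8\}$ avoids this, and a short case analysis shows that no 2-coloring of $\{1, \ldots, 9\}$ does, so the search returns 9.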


The theorem gets its name after an equivalent formulation in terms of partitions 
of subsets. 


Theorem 4.10 (Finite union theorem). For every choice of positive integers $n$, $t$,
there exists $N$ with the following property: If $X$ is a set of size at least $N$ and if the
power set $2^X$ is partitioned into $t$ classes then in one of the classes we can find $n$
disjoint sets, together with all their (non-empty) unions.


Let us remark that a weaker version of this result is perhaps chronologically the
earliest instance of a valid Ramsey-type statement (not counting Dirichlet’s
pigeonhole principle): Hilbert (1892) published the following result (which we
formulate here in modern terms).


Theorem 4.11. For any positive integers $n$, $t$, there exists $N = H(n, t)$ such that for
every partition of the power set lattice $2^X$ for $|X| = N$ into $t$ classes, one of the
classes contains a sublattice isomorphic to $2^n$.


Proof. Apply the Hales–Jewett theorem for $n$ and $t$ to show that $H(n, t) \le
HJ(n, t)$. □


(In fact a direct approach here yields a much better result, see Brown et al. 
1985 and references given there.) 

Let us remark that one can apply the Hales–Jewett theorem to prove a similar
theorem not only for lattices in general but for all varieties of lattices, e.g.,
modular lattices, see Ježek and Nešetřil (1983) and Prömel and Voigt (1981) for
details.

The Finite union theorem may also be proved from the Graham–Rothschild
theorem for parameter sets. Let us give a simple proof as an application of Van
der Waerden’s theorem only.


Proof of Theorem 4.9. For simplicity we prove the statement for $t = 2$ in the
following form: For any pair $n_1$, $n_2$ of positive integers there exists $FU(n_1, n_2) =
N$ with the following property: For every partition $\{1, \ldots, N\} = \mathcal{A}_1 \cup \mathcal{A}_2$ either $\mathcal{A}_1$
contains an $n_1$-set together with all sums or $\mathcal{A}_2$ contains an $n_2$-set together with all
sums.


We proceed by induction on $n_1 + n_2$, and in the inductive step we show that

$$FU(n_1, n_2) \le VW(1 + FU(n_1, n_2 - 1), 2).$$

To this end, set $FU(n_1, n_2 - 1) = FU(n_1 - 1, n_2) = M$, $VW(1 + M, 2) = N$ and let
$\{1, \ldots, N\} = \mathcal{A}_1 \cup \mathcal{A}_2$ be a fixed partition. Then there exist $a_0$ and $d > 0$ with all
$a_0, a_0 + d, \ldots, a_0 + Md$ in one of the classes, say $\mathcal{A}_1$.

Now consider the set $\{d, 2d, \ldots, Md\}$ under the given partition. Then either
there exists an $n_2$-set together with all sums in $\mathcal{A}_2$ and we are done, or there exists
an $(n_1 - 1)$-set $dx_1, \ldots, dx_{n_1 - 1}$ such that all its sums are in $\mathcal{A}_1$. But then
$a_0, dx_1, \ldots, dx_{n_1 - 1}$ is a desired $n_1$-set, since every sum involving $a_0$ is a term of the
progression. □


The Finite union theorem has always been related to (a version of) the 
Hales—Jewett theorem and thus the existence of a good (even of a primitive 
recursive) bound presented a problem. Using Shelah’s proof one now has such a 
bound (of the order of the sixth-function of the hierarchy). However, using direct 
methods one can get bounds of order f, (the tower function), see Taylor (1981), 
Nešetřil and Rödl (1983b) and Graham and Rödl (1987).

The Finite union theorem has a countable analogue, namely Hindman’s (1974)
theorem. This is one of the few known partition regular infinite systems of
equations, and in fact for infinite systems of equations one cannot hope for a
statement similar to Rado’s theorem, as shown recently by Deuber et al. (1995).
Hindman’s theorem and its ultrafilter proof due to Glazer led to an important line
of research. Summarizing, there are at least three possible approaches to these
coloring problems: using topological dynamics (e.g., the linear Van der Waerden
theorem proved in Furstenberg and Weiss 1978), using the Stone–Čech
compactification of the corresponding structures (Bergelson and Hindman 1988), and using
the theory of ultrafilters (see, e.g., Carlson 1988). Although these approaches are
related, several of these results were obtained independently and in different
contexts. Several “master theorems” were found. One of them is
certainly the following very recent result, which was conjectured by Furstenberg
and proved by Bergelson and Leibman (1995).


Theorem 4.12 (Polynomial Van der Waerden Theorem). Let $p_1, \ldots, p_k$ be
polynomials with rational coefficients taking integer values on integers and
satisfying $p_i(0) = 0$ for $i = 1, \ldots, k$. Then for every finite partition of $\mathbb{Z}$ and for
every choice of numbers $v_1, \ldots, v_k$ there exist integers $u$ and $d \ne 0$ such that
$u + p_i(d) v_i$, for $i = 1, \ldots, k$, belong to the same class of the partition.
(Choose $p_1(n) = \cdots = p_k(n) = n$, $v_i = i$ to get Van der Waerden’s theorem.)


This theorem is proved by a transfinite induction over countable ordinals, thus
sharing some similarities with section 3.3. Perhaps there is a deeper connection
here yet to be discovered.

The Graham–Rothschild theorem has many applications. For example, if $A =
\{1, 2\}$ then we obtain the Ramsey property of finite Boolean algebras since the
$n$-parameter word corresponds to a Boolean subalgebra isomorphic to $2^n$. In the
terminology of section 5 this proves that the finite Boolean algebras form a
Ramsey class.

We get another interesting case if we consider $B = \emptyset$, $A = \{1, 2\}$ (in Theorem
4.4). As remarked earlier, in this case the moving coordinates of an $n$-parameter
word form a partition into $n$ classes and it is not difficult to check that $\mathcal{A}$ is a
$p$-parameter subset of an $n$-parameter set $\mathcal{B}$ iff the partition which corresponds to
$\mathcal{B}$ is a refinement of the partition which corresponds to $\mathcal{A}$. Thus in this case we
obtain the following result which is of independent interest: Let $Eq(X)$ denote the
set of all equivalences on $X$. We set $E \le E'$ if the equivalence $E'$ (as a relation)
contains $E$ (i.e., if $E$ is a refinement of $E'$). Given an equivalence $E$ and a
positive integer $p$, denote by $\binom{E}{p}$ the set of all equivalences with $p$ classes which
are coarser than $E$. Then we have the following.


Theorem 4.13 (Dual Ramsey theorem). Let $p$, $n$, $t$ be positive integers. Then there
exists $N = DR(p, t, n)$ with the following property: If $X$ is a set with at least $N$
elements and $c$ is a $t$-coloring of $\binom{Eq(X)}{p}$ then there exists an equivalence $E$ with $n$
classes such that $\binom{E}{p}$ is monochromatic.


Proof. Apply the Graham–Rothschild theorem 4.4 for $B = \emptyset$, $A = \{1, 2\}$ for $p$
and $n$ as above. □
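
The objects colored in the Dual Ramsey theorem, namely the equivalences with $p$ classes that are coarser than a given equivalence $E$, can be listed explicitly for small cases. The Python sketch below does this by merging the classes of $E$ in all possible ways; the names `set_partitions` and `coarsenings` are assumptions made for this illustration.

```python
def set_partitions(items):
    """Yield all partitions of the given items as lists of blocks."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def coarsenings(E, p):
    """All equivalences with exactly p classes that are coarser than E.
    E is given as a list of its classes (lists of elements)."""
    result = []
    for grouping in set_partitions(range(len(E))):
        if len(grouping) == p:
            result.append([sorted(x for idx in group for x in E[idx])
                           for group in grouping])
    return result
```

If $E$ has $n$ classes, the number of such coarsenings depends only on $n$ and $p$ and equals the Stirling number of the second kind $S(n, p)$; for example, an equivalence with 4 classes has $S(4, 2) = 7$ coarsenings with 2 classes.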


Although this is given and proved in Graham and Rothschild (1971) and
although one leitmotiv of Leeb’s (1973) work was his interpretation of parameter
sets as the dual category to the category of all finite sets, the tempting idea of a
dual Ramsey theorem has been rediscovered several times (see Nešetřil and Rödl
1980 and Carlson and Simpson 1984). Carlson and Simpson (1984) proved an
important infinite topological dual Ramsey theorem which may be viewed as a
culminating result for parameter-set-type theorems; see Carlson and Simpson
(1990) for a survey of this development.

There are other applications of the Graham–Rothschild theorem. For example,
we already obtain from the Hales–Jewett theorem the following (known as the
Gallai–Witt theorem).


Theorem 4.14. For every $t$, $n$ and $m$ and for every $t$-coloring of the $m$-dimensional
integer lattice points, there exists a monochromatic homothetic copy of
$\{1, \ldots, n\}^m$. Explicitly there exist $(a_1, a_2, \ldots, a_m)$ and $d > 0$ such that all points
of the form

$$(a_1 + i_1 d,\ a_2 + i_2 d,\ \ldots,\ a_m + i_m d), \qquad i_1, \ldots, i_m \in \{0, \ldots, n-1\},$$

are monochromatic.

Proof. Let $t$, $m$, $n$ be given. Set $A = \{0, 1, \ldots, n-1\}$, $\Delta = A^m$. Thus $\Delta \subseteq \mathbb{Z}^m$.
Using the Hales–Jewett theorem set $N = HJ(A, t, n^m)$. Choose positive integers
$k_1, \ldots, k_N$ so that the mapping

$$f: \Delta^N \to \mathbb{Z}^m$$

defined by

$$f(x_1, \ldots, x_N) = \sum_{i=1}^{N} k_i x_i$$

is injective (this can be easily guaranteed; e.g., set $k_i = (N+1)^i$). Now any
$t$-coloring of $\mathbb{Z}^m$ induces a $t$-coloring of $\Delta^N$ which contains a monochromatic line
$L$ determined by $g^0 \in \Delta^N$ and $I \subseteq \{1, \ldots, N\}$. The image $f(L)$ of $L$ induces a
configuration in $\mathbb{Z}^m$ which is homothetic to $\{0, 1, \ldots, n-1\}^m$. Explicitly, this
configuration is determined by $a_1, a_2, \ldots, a_m$ and $d$ where

$$(a_1, \ldots, a_m) = \sum_{i \notin I} k_i g^0_i \qquad \text{and} \qquad d = \sum_{i \in I} k_i. \qquad \square$$

It follows from this that for every finite subconfiguration $K$ of $\mathbb{Z}^m$ and for every
finite coloring of $\mathbb{Z}^m$ one can find a monochromatic configuration which is
homothetic to $K$.
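
The key step of this proof, that the linear map $f$ sends a combinatorial line to a homothetic copy of the cube, can be checked on a small example. The Python sketch below uses 0-based coordinates, takes $k_i = (N+1)^i$ as in the proof, and assumes $N + 1 \ge n$ so that $f$ is injective; the names `f` and `line_image` and the sample data are hypothetical.

```python
from itertools import product


def f(point, ks):
    """The map (x_1, ..., x_N) -> sum of k_i * x_i, where each x_i lies in Z^m."""
    m = len(point[0])
    return tuple(sum(k * x[j] for k, x in zip(ks, point)) for j in range(m))


def line_image(n, m, N, g, moving):
    """Image under f of the combinatorial line with fixed part g and moving
    coordinates `moving` (0-based). Returns (image, a, d), where
    a = sum over fixed i of k_i * g_i and d = sum over moving i of k_i."""
    ks = [(N + 1) ** (i + 1) for i in range(N)]
    delta = list(product(range(n), repeat=m))  # the alphabet {0..n-1}^m
    image = set()
    for v in delta:
        point = tuple(v if i in moving else g[i] for i in range(N))
        image.add(f(point, ks))
    a = tuple(sum(ks[i] * g[i][j] for i in range(N) if i not in moving)
              for j in range(m))
    d = sum(ks[i] for i in moving)
    return image, a, d
```

By linearity, the image of the line is exactly the set of points $a + d \cdot v$ with $v \in \{0, \ldots, n-1\}^m$, which is the homothetic copy described in the proof.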

However, the situation drastically changes if we insist that we find a
monochromatic configuration which is congruent to a given one. This was the starting
point of Euclidean Ramsey theory, which deals with the following problem: Which
finite configurations (up to a Euclidean motion) can we always find
monochromatically in any finite coloring of the points in $E^n$, where $n$ is sufficiently large
depending on the number of colors? (Such configurations are called Ramsey.)

This problem is far from being solved and several partial results were obtained
in a series of papers by Erdős et al., see, e.g., Erdős et al. (1975a). Recently the
progress has been quick: Frankl and Rödl (1986) proved that every triangle and,
more generally, every nondegenerate simplex (Frankl and Rödl 1990) is
Ramsey. This has been quickly generalized to trapezoids, pentagons and other
notoriously difficult small configurations by Kříž (1991, 1992a). Kříž also applied
deep methods of algebraic topology to this problem (Kříž 1991, 1992b). Other
recent striking results are contained in Bourgain (1986) and Matoušek and Rödl
(1995), and a description of this recent development is given in Graham (1990,
1995). See also chapter 17 in this Handbook.

Another problem related to the multidimensional generalization of Van der 
Waerden’s theorem is the density problem. 


Theorem 4.15 (Furstenberg and Katznelson 1978). Let $\varepsilon > 0$, let $d$ be a positive
integer and $X$ a subset of $\mathbb{Z}^d$ with a positive upper density. Then $X$ contains a
homothetic copy of $\{1, \ldots, n\}^d$ for every positive integer $n$.

The proof of Furstenberg and Katznelson used infinitary methods of ergodic
theory. No combinatorial proof is known (even for this “simple” result).

Other geometric density results motivated by Van der Waerden’s theorem were 
proved by Pomerance (1980) and Bourgain (1986). 

A final comment on density problems: One may ask whether a density theorem 
is valid for all other partition regular systems of equations Ax = 0. 

The answer is negative as has been observed by Frankl et al. (1988). 


Theorem 4.16. For a partition regular system Ax = 0 which has a solution
consisting of pairwise distinct entries, the following two statements are equivalent.
(1) $A \cdot \mathbf{1} = 0$ where $\mathbf{1} = (1,\ldots,1)^T$.
(2) For every set X of positive integers with a positive upper density the system

$$Ax = 0$$

has a solution in X which consists of pairwise distinct entries.


Proof. This follows from Szemerédi’s theorem:

(1) ⇒ (2) Let $x = (x_1,\ldots,x_n)$ be a solution of Ax = 0 with all $x_i$ distinct. Put
$N = \max x_i$ and let $c + jd$, $j = 0,\ldots,N$, be an arithmetic progression in X (by
virtue of Szemerédi’s theorem). Now if $A\mathbf{1} = 0$ then also (by linearity)

$$Ay = 0,$$

where $y = c \cdot \mathbf{1} + d \cdot x$. All entries of y have the form $c + x_i d$, are distinct, and
belong to X.

(2) ⇒ (1) Set $N > \sum_{i,j} |a_{ij}|$ (where $a_{ij}$ is a general term of matrix A). Consider
the set $X = \{Ny + 1 : y = 1, 2, \ldots\}$. X has upper density 1/N. Applying (2), let
$x = (x_1,\ldots,x_n)$ be a solution of Ax = 0 in X. Thus each $x_j = N y_j + 1$ and

$$0 = \sum_{j=1}^{n} a_{ij} x_j = \sum_{j=1}^{n} a_{ij}(N y_j + 1) = N \sum_{j=1}^{n} a_{ij} y_j + \sum_{j=1}^{n} a_{ij}$$

for every $i = 1,\ldots,m$.
Hence N divides $\sum_{j} a_{ij}$, and since $|\sum_{j} a_{ij}| < N$ this implies
$\sum_{j=1}^{n} a_{ij} = 0$ for every i, which in turn is just $A\mathbf{1} = 0$. □
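Both directions of the proof can be checked on small instances. The following is a minimal sketch, not part of the original text; the function names and the two example systems (three-term progressions and Schur's equation) are chosen here purely for illustration.

```python
from itertools import product

def solves(A, x):
    """True if A x = 0 for an integer matrix A (a list of rows) and a vector x."""
    return all(sum(a * xi for a, xi in zip(row, x)) == 0 for row in A)

def shifted_solution(x, c, d):
    """The vector y = c*1 + d*x used in the step (1) => (2)."""
    return [c + d * xi for xi in x]

def has_solution_in_residue_class(A, N, bound):
    """Search X = {N*y + 1 : 1 <= y <= bound} exhaustively for a solution of A x = 0."""
    X = [N * y + 1 for y in range(1, bound + 1)]
    n = len(A[0])
    return any(solves(A, x) for x in product(X, repeat=n))

# x1 + x2 - 2*x3 = 0 (three-term progressions): the row sum is 0, so A*1 = 0.
AP3 = [[1, 1, -2]]
# x1 + x2 - x3 = 0 (Schur's equation): the row sum is 1, so A*1 != 0.
SCHUR = [[1, 1, -1]]

x = [1, 3, 2]                        # a solution of AP3 with distinct entries
y = shifted_solution(x, c=10, d=7)   # entries lie in the progression 10 + 7j
```

With N = 4 > 3 (the sum of the absolute values of the coefficients of Schur's equation), the set {4y + 1} contains no solution at all, while the shifted vector y remains a solution of the system with zero row sum.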


5. Ramsey classes 


5.1. Induced theorems


The examples of full Ramsey theorems which we gave in the previous section
share a common “Ramsey pattern”. As we remarked earlier it is difficult to
enrich this list by essentially new examples. It seems that there somehow is a
limited supply of finite structures (from “real life”) with the full Ramsey
property. Nevertheless, there is a rich theory of structural extensions of Ramsey
theory which is presently perhaps one of the most active areas of the subject.
By now we all tend to consider this as a natural state of the art but the situation
in the early 70s was far less transparent and the new concepts emerged slowly.
These concepts then influenced the further development in a profound way.

It seems that one of the significant turns appeared in the late 60s when Erdős,
Hajnal and Galvin started to ask questions such as which graphs arrow, say, a
triangle. Perhaps the essential parts of the whole development can be illustrated
with this particular example.

We say that a graph G = (V, E) is t-Ramsey for the triangle (i.e., $K_3$) if for
every t-coloring of E, one of the classes contains a triangle. Symbolically we
denote this by $G \to (K_3)^2_t$. Ramsey’s theorem gives us $K_6 \to (K_3)^2_2$ (and
$K_{r(2,t,3)} \to (K_3)^2_t$).

But there are other essentially different examples. For example, a 2-Ramsey
graph for $K_3$ need not contain $K_6$.

This was shown by Graham (1968) who constructed an optimal graph with this
property: The graph $K_3 + C_5$ (depicted in fig. 5.1) is the smallest graph G for
which $G \to (K_3)^2_2$ and which does not contain $K_6$.

Yet $K_3 + C_5$ contains a $K_5$ and subsequently, Van Lint, Graham and Spencer
constructed a graph $G_5$ not containing $K_5$, with $G_5 \to (K_3)^2_2$. Until recently, the
smallest such example was due to Irving (1973) and had 18 vertices. (It arises
from a graph on 17 vertices which proves $17 \not\to (4)^2_2$, by adding one vertex of
degree 17.) Very recently two more constructions appeared by Erickson (1993)
and Bukor (1994) who found examples with 17 and even 16 vertices (both of them
using properties of Graham's graph).

The next question which was asked is whether there exists a $K_4$-free graph $G_4$
for which $G_4 \to (K_3)^2_2$. This question proved to be considerably harder and it is
possible to say that it has not yet been solved satisfactorily.

The existence of a $K_4$-free graph G which is 2-Ramsey for $K_3$ was settled by
Folkman (1970). Folkman’s proof is complicated and the constructed graph is
astronomically large. The same is true for simpler constructions of these graphs as
provided by Nešetřil and Rödl (1975a, 1981).

Perhaps just to be explicit Erdős (1975) asked whether there exists a graph $G_4$
with $< 10^{10}$ vertices. This question proved to be very accurate and, building on
earlier partial results of Szemerédi, Frankl and Rödl, it was shown by Spencer
(1988) that there exists such a graph G, with 300-10” vertices. 

Figure 5.1.

The proof of this statement is probabilistic. But the probabilistic methods were
not only applied to get various bounds for Ramsey numbers. Recently also the
Ramsey properties of the random graph $G_{n,p}$ were analysed and the problem has
been recently solved by Rödl and Ruciński (1993, 1995). More precisely, Rödl
and Ruciński determined (up to a multiplicative constant) for every graph G the
threshold probability p = p(G, n, t) for the property that almost every graph $G_{n,p}$
is Ramsey for G: $G_{n,p} \to (G)^2_t$. These results have further consequences which
will be mentioned later.

Let us return to the beginning of the 70s. The situation was of course much less 
clear. On one side there were several deep and difficult results (Szemerédi’s on
$r_k(n)$; the Graham–Rothschild theorem, Folkman’s theorem), on the other side
the proofs were difficult and provided no insight. There were also strange 
difficulties: Rota’s conjecture could be solved only for points, Folkman’s proof 
did not work for 3-colorings, etc. 

By means of the intensive research in the beginning of the 70s all these 
obstacles were resolved. An important part of this development was the fact that 
the theory got the proper footing in the work of Leeb (1973), Nešetřil and Rödl
(1975a, 1978a), Deuber and Rothschild (1976) and others.

We shall give the main definitions. 

Let K be a class of objects which will be denoted by A, B, C,... (such as 
graphs, subsets of integers, spaces). 

Let K be endowed with isomorphisms and subobjects (such as subgraphs, 
subconfigurations, subspaces). 

Given objects A and B, denote by $\binom{B}{A}$ the set of all subobjects of B which are
isomorphic to A.

We say that object C is (t, A)-Ramsey for object B if for every t-coloring of the
set $\binom{C}{A}$ there exists a subobject B' of C which is isomorphic to B such that the set
$\binom{B'}{A}$ is monochromatic.

We denote this by $C \to (B)^A_t$. (It will be understood from the context in which
class K we work.)

For $A \in K$ we say that the class K has the A-Ramsey property if for every
object B of K and every positive integer t there exists C of K such that

$$C \to (B)^A_t .$$


In the extreme case where K has the A-Ramsey property for each of its objects 
A we say that K is a Ramsey class. 

The Ramsey problem for a class K (“prototype theorem” in Nešetřil and Rödl
1975a) consists of describing all those objects A for which K has the
A-Ramsey property.

These definitions proved to be useful mainly because they provide a concise 
language for various extensions of Ramsey's theorem. 

After reading the initial sections of this chapter one perhaps sees that these 
definitions are natural for most of the theorems which were introduced. 

Every full Ramsey-type theorem induces a particular A-Ramsey property since
it assures that every sufficiently large object is Ramsey. Thus we have:

(1) Set: The class of all finite sets together with subset inclusion forms a
Ramsey class.

(This is just Ramsey's theorem.)

(2) Δ (simplicial class): The class of totally-ordered finite sets together with 1-1
monotone mappings as subobjects is a Ramsey class.

(This is again another formulation of Ramsey’s theorem.)

(3) Compl: The class of all complete graphs is a Ramsey class.

(This is yet another reformulation of Ramsey's theorem.)

(4) Par(A): The class of all parameter sets over A is a Ramsey class (the
Graham–Rothschild theorem 4.3).

(5) AProg: The class of all finite subsets of integers and affine embeddings has 
the A-Ramsey property iff |A|=1. (Here if A = {a,<a,<--+ <a,} and B= 
{b, <b, <--- <b,} are two sets of integers then f: A— B is an affine embed- 
ding if there exists d>0 such that 6, = f(a,) + j-d iff b; =f(a;), 1<j<m.) 

This is the induced form of Van der Waerden’s theorem, see Spencer (1988)
and Nešetřil and Rödl (1976).

(6) Vect F: The class of all finite vector spaces over F is a Ramsey class.

(This is the Graham–Leeb–Rothschild theorem 4.10.)

One can formulate in this language further results which we obtained earlier. 
However, one does not have to look for full Ramsey theorems only. The 
weakening of the assumption to an existence of a Ramsey object proved to be 
both natural and useful: The Ramsey classes form a much richer variety than the 
(rather solitary) full Ramsey theorems. 

Let us list some examples of Ramsey classes together with their (by now 
canonical) names. 

Denote by Gra the class of all finite graphs together with induced subgraphs (as 
subobjects). Then we have the following solution of the Ramsey problem for 
graphs. 


Theorem 5.1 (Deuber 1975a, Nešetřil and Rödl 1975b). Gra has the A-Ramsey
property iff A is either a complete graph or the complement of a complete graph
(which will be called discrete).


This extends results of Erdős et al. (1975b) and Rödl (1976) (the case $A = K_2$).


Proof. First we prove that Gra has the A-Ramsey property whenever A is
complete or discrete. This is nontrivial and original proofs were difficult.
However, one can derive this particular result from the Graham–Rothschild
theorem on parameter sets (we follow the proof in Nešetřil and Rödl 1985).

Clearly it suffices to consider the case when A is complete. Set |A| = p.

We use the easy and well-known fact that every graph G = (V, E) may be
represented as a subgraph of the graph $(\mathcal{P}(X), \{[Y, Z] : Y \cap Z = \emptyset\})$ for
sufficiently large X (it obviously suffices to assume |X| = |E|), say |X| = n. Moreover,
this representation may be chosen such that a given order on V coincides with
minimal elements of the sets which represent vertices.

However, it now suffices to apply the Graham–Rothschild theorem 4.4 for
B = {1}, A = {1,2} and p = 2: Every n-parameter set in $A^N$ gives a monochromatic
subgraph (monotone) isomorphic to $(\mathcal{P}(X), \{[Y, Z] : Y \cap Z = \emptyset\})$.
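The representation of a graph by pairwise disjoint sets can be made concrete. The following is a minimal sketch under assumptions made here for illustration only: vertices are the integers 0, ..., n-1 in their given order, each vertex receives a private element (which also serves as the minimum of its set), and one extra element is added per non-adjacent pair. This is one possible construction and is not claimed to be the one used in the cited proof; in particular it does not aim at the smallest ground set.

```python
from itertools import combinations

def disjointness_representation(n, edges):
    """Assign to each vertex v of a graph on {0, ..., n-1} a set S[v] of integers
    such that u, v are adjacent iff S[u] and S[v] are disjoint, and min(S[v]) == v."""
    edge_set = {frozenset(e) for e in edges}
    sets = {v: {v} for v in range(n)}          # private element v, also the minimum
    next_label = n                             # fresh labels are larger than every vertex
    for u, v in combinations(range(n), 2):
        if frozenset((u, v)) not in edge_set:  # non-adjacent pair: force an intersection
            sets[u].add(next_label)
            sets[v].add(next_label)
            next_label += 1
    return sets

# Example: the 5-cycle 0-1-2-3-4-0.
C5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
S = disjointness_representation(5, C5)
```

Since the private elements are smaller than all shared labels, the order of the vertices agrees with the order of the minimal elements of the representing sets, as required in the text.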

The proof of the reverse implication uses the following technique developed in
Nešetřil and Rödl (1975b).


Ordering lemma 5.2. Let G = (V, E) be a graph, $\le$ an arbitrary linear ordering of
the set V. Then there exists a graph H = (W, F) with the following property: For
every linear ordering $\preceq$ of W, there exists an embedding $f: G \to H$ which is a
monotone mapping with respect to $\le$ and $\preceq$. Explicitly: $[x, y] \in E$ iff
$[f(x), f(y)] \in F$ and $x \le y$ iff $f(x) \preceq f(y)$.


Using the Ordering lemma we can prove the reverse implication in Theorem 
5.1. 

Let A be a graph which is noncomplete and nondiscrete. It follows that the
vertex-set V(A) of A may be ordered in two different ways, say $\le_1$ and $\le_2$, so that
$(A, \le_1)$ and $(A, \le_2)$ are monotone non-isomorphic. Let $(A', \le)$ be the disjoint
union of $(A, \le_1)$ and $(A, \le_2)$, ordered in such a way that $\le$ extends both $\le_1$ and
$\le_2$.

Let $(B, \le)$ be a graph guaranteed by the Ordering lemma for $(A', \le)$. We claim
that there is no C with $C \to (B)^A_2$: Let C be a graph. Fix an arbitrary linear
ordering $\preceq$ of V(C) and define a partition $\binom{C}{A} = a_1 \cup a_2$ as follows:

$\tilde A \in a_1$ iff $(\tilde A, \preceq|_{V(\tilde A)})$ is monotone isomorphic to $(A, \le_1)$,

$\tilde A \in a_2$ otherwise.

Using the ordering property of B we find that there is no homogeneous copy of B
in C. □


Proof of the Ordering lemma. We use the edge-Ramsey property of Gra which we
established in the first part of the proof.

Let G be any graph with a fixed linear ordering $\le$ of its vertices. Let $\hat G$ be any
graph with a linear ordering $\le$ of its vertices such that both $(\hat G, \le)$ and $(\hat G, \ge)$
contain an induced copy of G, monotone isomorphic to $(G, \le)$. Put $\hat G = (V, E)$,
$V = \{v_1 < \cdots < v_n\}$. We may assume without loss of generality that $[v_i, v_{i+1}] \in E$
for $i = 1,\ldots,n-1$ (for otherwise we may add to $\hat G$ a suitably ordered path from
$v_i$ to $v_{i+1}$).

Now according to the first part of the above proof of Theorem 5.1 there exists a
graph H = (W, F), $W = \{w_1 < \cdots < w_N\}$, such that for every partition $F = F_1 \cup F_2$
one of the classes contains an induced copy of $\hat G$ which is, moreover, monotone
isomorphic to $\hat G$.

We claim that H has the ordering property for $(G, \le)$. Towards this end let $\le$
be the fixed linear ordering of W, $W = \{w_1 < \cdots < w_N\}$. Let $\preceq$ be an arbitrary linear
ordering of W. Define a partition $a_1 \cup a_2$ of edges of H as follows:

$[w_i, w_j] \in a_1$ iff $w_i \preceq w_j$ and $i < j$,

$[w_i, w_j] \in a_2$ otherwise.

We get a monochromatic copy of $\hat G$ in one of the classes. From this it follows that
for all edges $[w_i, w_j]$ of this copy, either the orderings $\le$ and $\preceq$ coincide or they
are inverse. By the definition of $\hat G$ this implies that in either case we get a
monotone copy of G. □


Thus the ordering property may be deduced from the nonsingleton Ramsey
property. However it is important that we are able to prove the ordering property
independently of the Ramsey property for “almost” every structure, see Nešetřil
and Rödl (1978b).

This is also mirrored by the fact that one can find small example graphs H with 
the ordering property (for G with a vertices one can choose H of order a’ - log 1; 
see Rödl and Winkler 1989; see also Brightwell and Kohayakawa 1993 and
Nešetřil 1994 for recent related results).

For certain structures the validity of an induced Ramsey-type theorem was not 
easy to prove. 

So it was for hypergraphs and (more generally) relational systems. This
development culminated in the proof of Ramsey’s Theorem for structures, proved
independently by Abramson and Harrington (1978) and Nešetřil and Rödl
(1977b). We shall state this theorem after introducing the necessary notions:

A type is a sequence $(n_\delta; \delta \in \Delta)$ of positive integers. A type will be fixed. A
structure (set system) of type Δ is a pair $(X, \mathcal{M})$ where:

(1) X is a linearly ordered set (this ordering we call standard);

(2) $\mathcal{M} = (\mathcal{M}_\delta; \delta \in \Delta)$ and $\mathcal{M}_\delta \subseteq \binom{X}{n_\delta}$ for each $\delta \in \Delta$.

Given two structures $(X, \mathcal{M})$ and $(Y, \mathcal{N})$, $\mathcal{N} = (\mathcal{N}_\delta; \delta \in \Delta)$, a mapping $f: X \to Y$
is said to be an embedding if

(1') f is 1-1 and monotone with respect to standard orderings;

(2') For every $\delta \in \Delta$ and each subset $M \subseteq X$ we have

$$M \in \mathcal{M}_\delta \ \text{ iff } \ f(M) \in \mathcal{N}_\delta .$$

An isomorphism is an invertible embedding.

$(X, \mathcal{M})$ is a substructure of $(Y, \mathcal{N})$ iff the inclusion $X \subseteq Y$ is an embedding.
Given two structures A and B denote by $\binom{B}{A}$ the set of all substructures of B
which are isomorphic to A.

Denote by Soc(Δ) the class of all structures of type Δ together with
substructures. Then we have the following result obtained by Nešetřil and Rödl (1977b,
1983a) and, independently, by Abramson and Harrington (1978).


Theorem 5.3 (Structural Ramsey theorem). Soc(Δ) is a Ramsey class.


Another way to view a structure $(X, \mathcal{M})$, $\mathcal{M} = (\mathcal{M}_\delta; \delta \in \Delta)$, is to consider a
structural coloring $s_X$ of the power set $\mathcal{P}(X)$ (i.e., a mapping $s_X: \mathcal{P}(X) \to S$)
defined as follows: the color of a set M is the set of all δ for which $M \in \mathcal{M}_\delta$.

Embeddings are then just monotone mappings which preserve structural
coloring. Explicitly, if $s_X$ is a structural coloring of an ordered set X and $s_Y$ is a
structural coloring of an ordered set Y then $f: X \to Y$ is an embedding of $(X, s_X)$
into $(Y, s_Y)$ if f is monotone and 1-1 and for every set $M \subseteq X$, (2') holds, i.e.,
$s_Y(f(M)) = s_X(M)$. The following diagram shows the embedding:

$$\begin{array}{ccc} X & \xrightarrow{\ f\ } & Y \\ \mathcal{P}(X) & \xrightarrow{\ f\ } & \mathcal{P}(Y) \\ & {\scriptstyle s_X}\searrow \quad \swarrow{\scriptstyle s_Y} & \\ & S & \end{array}$$

If two structures (specified by structural colorings) are isomorphic then we say
that they have the same pattern.

This definition of a structure by means of the structural coloring has some
advantages, one of them being that a structure and its “complement” induce (up
to a permutation of labels) the same pattern.
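The description of embeddings via structural colorings translates directly into a finite check. The following is a minimal sketch, assuming (for illustration only) that ground sets are sets of integers with their usual order, that a structure is given as a dictionary from labels δ to families of subsets, and that a mapping is a dictionary; none of these encodings comes from the original text.

```python
from itertools import chain, combinations

def subsets(X):
    """All subsets of X, as tuples in increasing order."""
    X = sorted(X)
    return chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))

def structural_coloring(X, families):
    """Map each subset M of X to the set of labels delta with M in families[delta]."""
    return {
        frozenset(M): frozenset(d for d, fam in families.items() if frozenset(M) in fam)
        for M in subsets(X)
    }

def is_embedding(f, X, sX, sY):
    """f is monotone (hence 1-1) and s_Y(f(M)) == s_X(M) for every subset M of X."""
    xs = sorted(X)
    monotone = all(f[a] < f[b] for a, b in zip(xs, xs[1:]))
    preserves = all(
        sY[frozenset(f[x] for x in M)] == sX[frozenset(M)] for M in subsets(X)
    )
    return monotone and preserves

# Two ordered graphs (a single label "E" with n_E = 2): paths on 3 and 4 vertices.
X, Y = {0, 1, 2}, {0, 1, 2, 3}
sX = structural_coloring(X, {"E": {frozenset({0, 1}), frozenset({1, 2})}})
sY = structural_coloring(Y, {"E": {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}})
```

Here the shift 0→1, 1→2, 2→3 is an embedding, while 0→0, 1→1, 2→3 is not, because it sends the edge {1, 2} to the non-edge {1, 3}.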

With this we can formulate the above theorem as follows. 


Theorem 5.4 (Structural Ramsey theorem). For every set X endowed with a
structural coloring $s_X: \mathcal{P}(X) \to S$ and for every positive integer t there exists a set Y
endowed with a structural coloring $s_Y: \mathcal{P}(Y) \to S$ such that for every t-coloring of
$\mathcal{P}(Y)$ there exists a set $X' \subseteq Y$ such that $s_Y$ induces on X' the same pattern as the pattern
of $(X, s_X)$, and moreover, any two subsets of X' with the same pattern have the same
color.


Let us compare Theorem 5.4 with Ramsey’s theorem 1.1. By iterating
Ramsey’s theorem we can easily get the following result: For every t and n there
exists N such that for every coloring of the power set $\mathcal{P}([N])$ by t colors there
exists a subset X of [N] of size n such that the color of a subset of X depends only
on its size. Such a homogeneous set may be visualized by fig. 5.2.

Figure 5.2.

Figure 5.3.

The structural Ramsey theorem 5.4 may then be visualized by the following
more modern, somewhat Klee-type (Klee 1923) picture shown in fig. 5.3.

Theorems 5.1, 5.3 and 5.4 are called induced Ramsey-type theorems. For every
structure with a valid Ramsey property we may consider an induced version (by
enriching the objects by the addition of a structural coloring). All these problems
have now been solved. For example, the induced theorem for Hales–Jewett has
been established in Deuber et al. (1982) and the induced theorem
for parameter sets, vector and affine spaces has been established by Prömel
(1985).

The original proofs of these theorems were not easy. On the other hand as we 
shall see in the next section, the induced theorems present only the first step on 
the ladder of difficulty of Ramsey-type theorems. The general methods imply 
them all.... 


5.2. Restricted theorems 


We motivate this section by (the chronologically first) example of a restricted
Ramsey problem (i.e., $K_4$-free Ramsey graphs for the triangle).

This problem was partially solved by Folkman (1970) and the full solution was
obtained by Nešetřil and Rödl (1975a).


Theorem 5.5. Let G be a graph not containing a complete graph $K_k$. Let t be a
positive integer. Then there exists a graph H not containing a complete graph $K_k$
such that for every t-coloring of edges of H we get a subgraph isomorphic to G with
all its edges in one of the classes of the partition.


Results of this type are called restricted Ramsey-type theorems. One should 
note that this result implies the induced Ramsey theorem for graphs. This is a 
general comment: Restricted theorems imply induced theorems. 

We shall justify this claim by proving it for Theorem 5.5, for simplicity, say for
k = 3: Thus, let G be a graph without triangles. Find a graph I without triangles,
with two vertices a, b of I such that [a, b] does not form an edge in I, and if we
add the edge [a, b] to I then we get a triangle.

It is easy to find such an I, e.g., a path of length 2 will do.

Now build a graph G' by amalgamating a copy of I to every pair x, y of
independent points of G in such a way that x is identified with a and y with b.

Figure 5.4.

See the schematic in fig. 5.4. Obviously G' does not contain a triangle.

It is easy to see that if H is a triangle-free graph which for every t-coloring of its
edges contains a monochromatic subgraph isomorphic to G' then also

$$H \to (G)^2_t$$

in the class of all graphs and (induced) subgraphs. (In a copy of G' in H no edge of
H can join two independent points x, y of G, since together with the amalgamated
copy of I it would form a triangle in H; hence the copy of G is induced.)
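The amalgamation step for k = 3 is easy to experiment with. The following is a minimal sketch, assuming graphs on the vertex set {0, ..., n-1} given by edge lists and taking I to be a path of length 2 as suggested above; the function names are illustrative and not from the original text.

```python
from itertools import combinations

def amalgamate_paths(n, edges):
    """Build G' from G: for every non-adjacent pair x, y of G add a new vertex
    adjacent to both, i.e. amalgamate a path of length 2 onto the pair."""
    edge_set = {frozenset(e) for e in edges}
    new_edges = set(edge_set)
    next_vertex = n
    for x, y in combinations(range(n), 2):
        if frozenset((x, y)) not in edge_set:
            new_edges.add(frozenset((x, next_vertex)))
            new_edges.add(frozenset((next_vertex, y)))
            next_vertex += 1
    return next_vertex, new_edges

def has_triangle(n, edge_set):
    """Brute-force test for a triangle in a graph on {0, ..., n-1}."""
    return any(
        frozenset((a, b)) in edge_set
        and frozenset((b, c)) in edge_set
        and frozenset((a, c)) in edge_set
        for a, b, c in combinations(range(n), 3)
    )

# Example: G is the 5-cycle, which is triangle-free.
C5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
n_prime, E_prime = amalgamate_paths(5, C5)
```

For the 5-cycle this yields a triangle-free graph G' on 10 vertices in which adding any non-edge of G creates a triangle, which is exactly what forces copies of G inside a triangle-free H to be induced.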

This leads to the following definition: Let Forb($K_k$) be the class of all finite
graphs which do not contain $K_k$ with induced subgraphs as subobjects. (Thus
Forb($K_k$) is a subclass of Gra.)

For classes Forb($K_k$), the solution of the Ramsey problem is provided by the
following result of Nešetřil and Rödl (1975a).


Theorem 5.6. For every k > 1, the class Forb($K_k$) has the A-Ramsey property iff A
is either discrete or a complete graph with < k vertices.


In one direction (necessity) one uses an appropriate version of the Ordering 
lemma. The sufficiency is harder of course. We shall discuss the relevant proof 
methods in section 7. 

But the development has been rapid and it follows from Rödl and Ruciński
(1995) that for any choice k, t, almost all graphs $G_n$ with n vertices which do not
contain $K_{k+1}$ are Ramsey for $K_k$: $G_n \to (K_k)^2_t$.

In view of the previous section it is expected that one can prove a more general
statement related to subclasses of Soc(Δ). We shall do so by means of the
following definitions.

A structure $A = (X, \mathcal{M})$, $\mathcal{M} = (\mathcal{M}_\delta; \delta \in \Delta)$, (of type Δ), is said to be irreducible
if for every pair $x, y \in X$, there exists $\delta \in \Delta$ and $M \in \mathcal{M}_\delta$ such that $x, y \in M$.

Let $\mathcal{F}$ be a (possibly infinite) set of structures (of type Δ). Denote by Forb$_\Delta(\mathcal{F})$
the class of all structures A (of type Δ) which do not contain any member of $\mathcal{F}$ as
a substructure.


Now we can formulate the principal result for set structures due to Nešetřil and
Rödl (1977b, 1983a).


Theorem 5.7 (Ramsey classes of structures). Let Δ be a type. Let $\mathcal{F}$ be a (possibly
infinite) set of irreducible structures (of type Δ). Then Forb$_\Delta(\mathcal{F})$ is a Ramsey class.


This is a generalization of the previous theorems. For example, by taking
Δ = {2}, $\mathcal{F} = \{K_k\}$ we get (a nontrivial) generalization of Theorem 5.1 to
partitions of ordered subgraphs.

As we shall see now, this is in fact as far as we can go on this level of generality. 
Let us be more specific and let us analyse the Ramsey classes of graphs in more 
detail. 

First, in discussing Ramsey classes we can restrict ourselves to hereditary 
classes. For nonhereditary classes we have in general no hope for a characteriza- 
tion: any sequence G,, Gp, G,... with G,,, > (G,){*"' will form a Ramsey class. 
Also in view of the Ordering lemma we have to deal with ordered graphs.

If K is a Ramsey class (of structures of type Δ) and we permute the colors of
structural colorings we get a Ramsey class again. Particularly, if K is a Ramsey
class of graphs and $\bar K$ denotes the class of all complements of graphs from K, then
$\bar K$ is again a Ramsey class of graphs.

If K and K' are hereditary Ramsey classes then the class $K \cup K'$ of all
structures which belong either to K or to K' is clearly a Ramsey class again.

We have seen that the class Compl is a Ramsey class. By a simple “product”
argument it follows that the class Eq of all equivalences (i.e., of disjoint unions of
complete graphs with a linearly ordered set of vertices) is a Ramsey class. By
the above remark the class $\overline{\mathrm{Eq}}$ is also a Ramsey class of graphs. The class $\overline{\mathrm{Eq}}$
consists of all complete multipartite graphs, i.e., Turán graphs.

A bit surprisingly, these remarks and the previously stated theorems exhaust all 
Ramsey classes of graphs. 


Theorem 5.8 (Characterization of Ramsey classes of graphs). For a hereditary
class K of graphs the following two statements are equivalent:

(1) The class of all ordered graphs from K is a Ramsey class of graphs;

(2) K is a union of classes $K_i$, $i \in I$, each of which is either a class Forb($K_k$) or
$\overline{\mathrm{Forb}(K_k)}$ (i.e., the class $\{\bar G; G \in \mathrm{Forb}(K_k)\}$), or the class Eq, or the class $\overline{\mathrm{Eq}}$.


This result was proved in Nešetřil (1989). The proof uses a deep result of
Lachlan and Woodrow (1980) on graph-amalgamations.

For types Δ of different form, i.e., Δ ≠ {2} (particularly for Δ = {3}) a
characterization of Ramsey classes of structures is not known. On the other hand
no infinite Ramsey class different from the Forb$_\Delta(\mathcal{F})$ above is known. Perhaps
even no essentially different Ramsey classes exist.

Various simplified proofs of the main result of this section are known, see, e.g.,
Nešetřil and Rödl (1981, 1982, 1987b). These results are mostly based on the
amalgamation technique, known as “partite construction”, which is due to
Nešetřil and Rödl (1981, 1982). We give a sample of this method in section 7.
Also, restricted Ramsey-type theorems were extended to all other Ramsey classes
of structures, such as parameter sets or spaces. This has been done in Frankl et al.
(1987), Nešetřil and Rödl (1987b) and Prömel and Voigt (1988).

In section 7 we are going to prove a structural Ramsey theorem (Theorem 5.3) 
and its forbidden version (Theorem 5.7). 


6. The Erdős–Ramsey problem


What is the structure of graphs G which arrow the triangle (i.e., which graphs G
satisfy $G \to (K_3)^2_t$ for a fixed t)?
Which classes of graphs have the edge-Ramsey property?
Which graphs have to be contained in every graph G which arrows the triangle?
For which sets $\mathcal{F}$ does the class Forb($\mathcal{F}$) have the edge-Ramsey property?
These problems and their analogues for other classes may be regarded as the
central questions of structural extensions of Ramsey’s theorem. Viewing the
difficulties which one encounters even when proving the basic Ramsey-type
theorems, these problems may be regarded as too ambitious. In fact, they were
never formulated (at least explicitly) on this level of generality since one could not
solve even particular cases (such as the above triangle case). However, special
cases have been asked by Erdős, Hajnal, Galvin in the late 60s. During the
development of the field, many particular questions were related to the above
problems. We propose to call the above problems Erdős–Ramsey problems as
Erdős’ persistence has been largely responsible for the development of this area.

In order to formulate and explain our approach it appears that we have to 
investigate Ramsey properties in the context of chromatic number. 


6.1. Ramsey properties via chromatic number 


In a similar way as one can view most extremal problems as questions concerned 
with the independence number of a (special) hypergraph one can relate Ramsey- 
type theorems to the chromatic number of a (special) hypergraph. This can be 
done in full generality for a class K endowed with subobjects (as introduced in the 
previous section) by means of the following construct. 

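Given three objects A, B, C of K denote by (A, B, C) the hypergraph $(X, \mathcal{M})$
defined as follows:

$$X = \binom{C}{A}, \qquad \mathcal{M} = \left\{ \binom{B'}{A} : B' \in \binom{C}{B} \right\}.$$

(A, B, C) is sometimes called the triangle-hypergraph since it arises by considering
the scheme

$$\begin{array}{ccc} B & \longrightarrow & C \\ & \nwarrow \ \nearrow & \\ & A & \end{array} \qquad (6.1)$$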


Recall that the chromatic number χ(H) of a hypergraph H is the minimum
number of colors which suffice for a coloring of vertices in such a way that no
monochromatic edge occurs, and the independence number α(H) is the maximum
number of vertices which do not contain any edge.

If we compare the corresponding definitions then we immediately obtain the 
following statement valid for every class K. 


Proposition 6.1. For every choice of objects A, B, C of K and every positive
integer t we have

$$C \to (B)^A_t \ \text{ iff } \ \chi((A, B, C)) > t .$$


This observation may serve as one of the possible approaches to Ramsey-type
statements.
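For very small objects the equivalence can be verified by exhaustive search. The following is a minimal sketch in the class of complete graphs, assuming A = K_a, B = K_b and C = K_n are encoded by their vertex sets; the function names are illustrative and the brute-force search is only feasible for tiny parameters.

```python
from itertools import combinations, product

def triangle_hypergraph(n, b, a=2):
    """Triangle hypergraph (A, B, C) with A = K_a, B = K_b, C = K_n:
    vertices are the a-subsets of {0, ..., n-1}; each b-subset B' contributes
    the edge consisting of all a-subsets of B'."""
    vertices = list(combinations(range(n), a))
    edges = [frozenset(combinations(Bp, a)) for Bp in combinations(range(n), b)]
    return vertices, edges

def is_t_colorable(vertices, edges, t):
    """Brute force: is there a t-coloring of the vertices with no monochromatic edge?"""
    index = {v: i for i, v in enumerate(vertices)}
    edge_idx = [[index[v] for v in e] for e in edges]
    for coloring in product(range(t), repeat=len(vertices)):
        if all(len({coloring[i] for i in e}) > 1 for e in edge_idx):
            return True
    return False
```

With a = 2 and b = 3 the hypergraph for n = 5 is 2-colorable while the one for n = 6 is not, which is the statement that K_6 arrows the triangle for two colors and K_5 does not.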


This formulation may be convenient in other contexts as well. Let us give some 
examples. 

(1) We say that an A-density theorem holds (in a class K) if for every ε > 0 and
every $B \in K$ there exists $C \in K$ with

$$\alpha((A, B, C)) < \varepsilon \left| \binom{C}{A} \right| .$$

The validity of an A-density theorem always implies the A-Ramsey property since

$$\chi((A, B, C)) \ge \frac{\left| \binom{C}{A} \right|}{\alpha((A, B, C))} .$$


(2) The above reformulations of Ramsey and extremal (Turán) problems
indicate that one could study these problems in a unified way. In fact, this has
been done and so-called Ramsey–Turán problems were studied in a series of
papers by Erdős, Sós and others, see, e.g., Erdős et al. (1993) and Erdős and Sós
(1982).

(3) One can study the quantitative form of Ramsey-type theorems: Given a
coloring $c: \binom{C}{A} \to [t]$ denote by $\binom{C}{B}_c$ the size of the subset of $\binom{C}{B}$ formed by all
c-homogeneous copies of B. Thus $C \to (B)^A_t$ iff $\binom{C}{B}_c > 0$ for every coloring c.

It appears that much more is true: all known Ramsey classes contain objects for
which

$$\binom{C}{B}_c \ge \varepsilon \left| \binom{C}{B} \right| \qquad (6.2)$$

where ε is a positive constant depending on A, B, t (and independent of C).
Moreover, we know that for all full Ramsey theorems, every sufficiently large
object has property (6.2).
Such results were obtained in a series of papers by Frankl et al. (1988, 1989).
(The basic idea can be illustrated on Ramsey’s theorem itself: Given n, p, t, set
r = r(p, t, n) and consider R > r. Then obviously

$$\binom{R}{n}_c \ge \binom{R}{r} \binom{R-n}{r-n}^{-1} = \binom{r}{n}^{-1} \binom{R}{n},$$

since every r-subset contains a homogeneous n-subset and every n-subset lies in
$\binom{R-n}{r-n}$ r-subsets.)

(4) As we have seen, Ramsey’s theorem corresponds to the chromatic number
of the triangle hypergraph (A, B, C) which, in turn, corresponds to diagram
(6.1). The dual Ramsey theorem then corresponds to the following diagram:

$$\begin{array}{ccc} B & \longleftarrow & C \\ & \searrow \ \swarrow & \\ & A & \end{array} \qquad (6.3)$$


Let us return to our main theme. There is no hope (at least at present) to use
Proposition 6.1 for a proof of a Ramsey-type statement which is not related to
partitions of singletons. The difficulty lies in the lack of techniques for “forcing” a
large chromatic number: The triangle hypergraphs (A, B, C) have many special
properties which are difficult (and, in full generality, probably impossible) to
control.

However, the triangle hypergraphs (A, B, C) may be used rather efficiently for 
an analysis of the necessary conditions which Ramsey objects have to satisfy. 

In fact this approach appears to be the unique tool for characterization of 
natural local obstacles for Ramsey objects. This is based on the following. 


Folklore lemma 6.2. Let H =(V,E) be a k-uniform hypergraph with an infinite 
chromatic number. Then H contains every finite k-tree. 


An example of a k-tree is depicted in fig. 6.1 and we may also define k-trees
recursively by means of amalgamations. A set S of t edges which pairwise
intersect in a single vertex x is called a t-star S; x is the center of S.


Proof. We start with the following. 


Claim. Let H be a hypergraph with chromatic number t + 1. Then H contains a
t-star.


Proof of Claim. Put H = (V, E). Let $V = V_1 \cup \cdots \cup V_{t+1}$ be a coloring such that each $V_i$
is a maximal independent set in $H - (V_1 \cup \cdots \cup V_{i-1})$. It follows that for every $v \in V_{t+1}$
every set $V_1 \cup \{v\}, V_2 \cup \{v\}, \ldots, V_t \cup \{v\}$ contains an edge of H. These edges form
a t-star centered at v. □
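The argument of the claim is constructive and can be followed step by step. The following is a minimal sketch, assuming a hypergraph is given by a vertex list and a list of frozensets, each with at least two elements; the greedy order and the function names are choices made here for illustration.

```python
from itertools import combinations

def greedy_classes(vertices, edges):
    """Partition the vertices into classes V_1, V_2, ...; each class is a maximal
    independent set (a set containing no edge) among the vertices not yet used.
    Assumes every edge has at least two vertices, so each class is nonempty."""
    remaining = list(vertices)
    classes = []
    while remaining:
        cls = set()
        for v in remaining:
            candidate = cls | {v}
            if not any(e <= candidate for e in edges):
                cls = candidate
        classes.append(cls)
        remaining = [v for v in remaining if v not in cls]
    return classes

def star_at(v, earlier_classes, edges):
    """For a vertex v outside the given classes, pick for each class V_i an edge
    inside V_i + {v}; by maximality of V_i such an edge exists and contains v."""
    return [next(e for e in edges if v in e and e <= cls | {v}) for cls in earlier_classes]

# Example: the complete 3-uniform hypergraph on 5 vertices.
V = list(range(5))
E = [frozenset(c) for c in combinations(V, 3)]
classes = greedy_classes(V, E)
star = star_at(4, classes[:2], E)
```

In this example the greedy procedure produces three classes and the vertex of the last class is the center of a 2-star, since the chosen edges lie in disjoint classes apart from the center.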


Using the claim, we prove the Folklore lemma quite easily. Fix a finite k-tree T
with n vertices. Set $t = n^2$. We prove that every finite hypergraph H = (V, E) with
χ(H) > t contains T. Define recursively subsets $V_1, V_2, \ldots, V_n$ of V as follows: Set
$V_1 = V$ and let $V_{i+1}$ be the set of all vertices of the hypergraph $H_i = H|V_i$ which
are centers of an n-star in $H_i$. It follows from the claim that $\chi(H_2) > t - n$ and,
generally, $\chi(H_{i+1}) > t - in$. Consequently, $V_n \neq \emptyset$. Fix $x_n \in V_n$ and let the edges
$e_1^{n-1}, \ldots, e_n^{n-1}$ of $H_{n-1}$ form an n-star at $x_n$. Each vertex x of $\bigcup_{i=1}^{n} e_i^{n-1}$ is the
center of an n-star in $H_{n-2}$, etc. Some of these edges may then be used for a
construction of a tree isomorphic to T.

Figure 6.1.


The importance of the Folklore lemma is that it characterizes all finite 
unavoidable subgraphs of highly chromatic graphs. 


Theorem 6.3. For a finite k-uniform hypergraph F, the following two statements 
are equivalent: 

(1) F is a subgraph of every k-uniform hypergraph H with infinite chromatic 
number; 

(2) F is a k-tree. 


(In (1) one can assume that F is a subgraph of every k-uniform hypergraph
with a sufficiently large finite chromatic number.)

Theorem 6.3 is a profound result. Its nontrivial part (1) ⇒ (2) asserts that for
every choice of positive integers t and l, there exists a k-uniform hypergraph H
with the following properties:

(i) χ(H) > t,

(ii) g(H) > l,
where g(H) denotes the girth of H (alternatively, this means that H does not
contain cycles of length ≤ l).

This is a combinatorial classic which started in the 40s with Tutte (1954) and
Zykov (1949) for the case k = 2, l = 3. The general case k = 2 was solved by
Erdős (1959) in his seminal paper by a striking application of the probabilistic
method. The same method has been modified in Erdős and Hajnal (1966) to yield
the general result.

For many reasons it is desirable to have a constructive proof of Theorem 6.3.
This appeared to be difficult and a construction in full generality was finally given
by Lovász (1968). A simplified construction has been found in the context of
Ramsey theory by Nešetřil and Rödl (1979a). The graphs and hypergraphs with
the above properties (i), (ii) are called highly chromatic (locally) sparse graphs,
for short.

Their existence could be regarded as one of the true paradoxes of finite set 
theory and it has always been felt that this result is one of the central results in 
combinatorics. 

Recently it has been realized that sparse and complex graphs may be used in 
theoretical computer science for the design of fast algorithms. However, what is 
needed there is not only a construction of these “paradoxical” structures but also 
their reasonable size. In one of the most striking recent developments, a program 
for constructing complex sparse graphs has been successfully carried out. Using 
several highly ingenious constructions which combine algebraic and topological 
methods it has been shown that there are complex sparse graphs, the size of which 
in several instances improves the size of random objects. See Margulis (1975), 
Alon (1986), Lubotzky et al. (1988) and chapter 32.

In particular, it follows from Lubotzky et al. (1988) that there are examples of
graphs with girth l, chromatic number t and the size at most ¢ (the Erdős
probabilistic bound being ¢’). Prior to this construction, not even a primitive
recursive upper bound was known. But not everything in this direction is 
presently known. Below we shall see that for Ramsey structures this is still an 
open problem. Also, a bit surprisingly, the following is still open. 


Problem. Find a primitive recursive construction of highly chromatic locally 
sparse p-uniform hypergraphs. Indeed, even triple systems (i.e., p = 3) present a
problem. 


Another implication of Theorem 6.3 is the following result (Nešetřil and Rödl
1966).


Corollary 6.4. Let A be a finite set of 2-connected graphs. Then the class Forb(A) 
has the vertex-partition property. 


As indicated above (see remarks before Theorem 5.8) if the class K has the
vertex-partition property then the class −K (formed by complements of graphs
from K) and the class +K (formed by disjoint unions of graphs from K) have the
vertex-partition property as well. These constructions may be iterated (to yield
classes + − + − + −K). Then Corollary 6.4 together with the union of the classes
+ − + − ⋯ + − (Compl.) are the only known classes of the form Forb(A) which have
the vertex-partition property. The problem is not yet completely solved, see Rödl
and Sauer (1992), and Rödl et al. (1995).


6.2. Ramsey graphs 


One can view Theorem 6.3 as the solution of the Ramsey problem for partitions 
of singletons (of set systems). For other Ramsey type questions, the situation is 
less transparent and progress has been slow. Since this is not the distant past in 
which theorems melt into definitions, history is vital and we should review it. Of 
course, the Folklore lemma 6.2 gives us a good candidate for the solution of the 
Erdős–Ramsey problem. But this was not the real trajectory of discovery. It took
some time before the pattern of the problem was recognized.

Let us illustrate the main theorem of this chapter again by the particular 
example of edge-Ramsey graphs for triangles (compare the introduction to section 
5). 

Thus, let us study those graphs G which have the edge-Ramsey property for the
triangle $K_3$; explicitly, we study G with $G \to (K_3)^2_t$. Since we are
interested in the structural properties of G, we have to assume that t is
sufficiently large (to avoid singular small examples). By the restricted Ramsey
theorem 5.5 we know that G may be chosen $K_4$-free and that the class of all
$K_4$-free graphs is in fact a Ramsey class (by Theorem 5.7). According to
Theorem 5.8 we know that on this level of generality we have no other results.
However, we are interested in the edge-colorings only and thus we can hope for
stronger results.

The first step was taken by Spencer (1975b). He showed that for every choice
of positive integers t and l, there exists a graph $G = (V, E)$ and a system
$\mathcal{T} \subseteq \binom{E}{3}$ such that:

(i) every $T \in \mathcal{T}$ corresponds to a triangle in G;

(ii) the hypergraph $(E, \mathcal{T})$ has no cycles of length $\le l$;

(iii) $\chi(E, \mathcal{T}) > t$ and thus $\mathcal{T}$ is a Ramsey family of triangles in G.

We call this result (and similar ones) a sparse Ramsey family (of copies of A in B).

One can prove the existence of sparse Ramsey families for every instance of the
full Ramsey theorem since we may use probabilistic means (in the spirit of Erdős'
original ideas of 1959; see Spencer 1975b for details).

However, this does not imply restricted theorems since one constructs
$\mathcal{T}$ for essentially a complete graph G. The situation is less clear
and more in the spirit of restricted theorems if we demand in the above result
that $\mathcal{T}$ is the triple system induced by all triangles in G and yet
the conditions (ii) and (iii) above are satisfied. This has been posed as a
problem in Spencer (1975b) and solved affirmatively in Nešetřil and Rödl (1984,
1989). Again one can prove the analogous statement for every class with a full
Ramsey theorem. However, until recently this presented a problem for structures
different from set systems, especially for parameter sets and spaces. Recently,
however, all of these problems were resolved. We wish to single out the
following result which has been obtained by Rödl (1986a) and Nešetřil and Rödl
(1987b).


Theorem 6.5 (Sparse Hales–Jewett theorem). Let t, l be positive integers, A a
non-empty set. Then there exists an N and a set $X \subseteq A^N$ with the
following properties:

(1) For every t-coloring of X there exists a monochromatic line in X; we denote
this by $X \to (1)^0_t$.

(2) The lines in X (i.e., the elements of $\binom{X}{1}$) do not form cycles of
length $< l$; explicitly, the hypergraph $(X, \binom{X}{1})$ has girth $\ge l$.


Actually, this is a key result from which one can derive other results of this type
quite easily, see Nešetřil and Rödl (1989, 1990b), and compare also section 7.

One is tempted to call these and the like results sparse Ramsey-type theorems. 
However, this is misleading. A more appropriate name would perhaps be 
Ramsey-type theorems for sparse copies. For nonsingleton partitions there is still a 
long way to go. 

Let us illustrate this again on our example of a Ramsey graph G for the triangle $K_3$.

A Ramsey theorem for sparse copies of $K_3$ claims that we can find G with
$G \to (K_3)^2_2$ and yet G does not contain pictures like those depicted in fig. 6.2.
These graphs reflect examples of cycles in edges.


Figure 6.2.


No information is available about subgraphs whose cycle structure is not
induced by edges such as those depicted in fig. 6.3.


Figure 6.3.


The situation is a bit confusing since every graph G with $G \to (K_3)^2_2$ has to
contain a $C_4$. However, there is no reason why it should contain a chordless $C_4$.
In this way it became apparent at the end of the 70s (see, e.g., Nešetřil and
Rödl 1979b) that the following peculiar problem plays a central role: Does there
exist for every bipartite graph G, a bipartite graph H with the following
properties:

(i) $H \to (G)^2_2$;
(ii) H has the same girth as G.                                        (6.4)


However, (6.4) appeared to be just as difficult as the Erdős–Ramsey problem
and, in fact, Erdős (1975), motivated by the strict finiteness of this problem,
conjectured that the answer is negative already for girth 5 (i.e., for rectangle-free
bipartite graphs).

This particular case (i.e., of girth 5) of (6.4) has been solved by Nešetřil and
Rödl (1987a). Their method does not generalize to girth >6. However, it is
related to the following result which is of independent interest.


Theorem 6.6 (Ramsey theorem for simple systems). Let $\mathcal{S}_k$ denote the
class of all k-uniform hypergraphs which are simple (i.e., every two edges meet
in at most one point). Then the class $\mathcal{S}_k$ has the edge-Ramsey property.

Explicitly: for every $B \in \mathcal{S}_k$, there exists $C \in \mathcal{S}_k$
such that $C \to (B)^e_2$, where e stands for the single-edge hypergraph.


By a theorem of Wilson (see chapter 14) this leads to the following.


Corollary 6.7 (Ramsey theorem for Steiner systems). Let $\mathcal{St}_k$ denote
the class of all Steiner k-systems. Then $\mathcal{St}_k$ has the edge-Ramsey property.


(One can easily see that Corollary 6.7 does not hold for block designs with
$\lambda \neq 1$.)

These results were essentially the first results which went beyond the classes
Forb(A). Only very recently was the above problem (6.4) solved affirmatively by
Nešetřil and Rödl. This result is not yet published. As indicated above, in
particular, this yields a solution of the Erdős–Ramsey problem for triangles.


7. Techniques 


Ramsey theory benefits from most combinatorial (and fortunately also
non-combinatorial) methods. Perhaps the very fact that many different tools were
applied in order to solve particular problems essentially contributed to the
popularity of the field. This diversity is well documented, e.g., by Erdős et al.
(1984a), Graham (1981), Graham et al. (1980), Grinstead and Roberts (1982)
and Nešetřil and Rödl (1979b), and most recently by a collection of papers in
Nešetřil and Rödl (1990a).

In this paper we followed the mainstream of Ramsey theory formed by efforts
to prove the strongest structural Ramsey-type results. Even in this limited scope
we had to omit several interesting and very active areas. Some of them were
mentioned earlier. Let us complement this by mentioning that we left out
structural extensions of the canonical lemma of Erdős and Rado (1950). These
extensions were studied in great detail by Prömel and Voigt; see, e.g., Prömel and
Voigt (1985). Also we had to omit various applications of Ramsey theory.

Let us finish this paper by giving a sample of the proof technique which proved
to be quite useful in this area. This is the amalgamation technique due to Nešetřil
and Rödl.

The method originated in 1976 from the analysis of methods of Nešetřil and
Rödl (1977b) (see, e.g., Nešetřil and Rödl 1981, 1982, 1987b for examples of its use).

We shall illustrate this by giving a short proof of the structural Ramsey theorem 
5.3. 

It is an important feature of the amalgamation technique that it is self-refining 
and consequently we can derive a number of corollaries such as Theorems 5.7 and 
6.5. 

The amalgamation technique consists of three basic steps: 

(i) Definition of partite systems and their amalgamation. 

(ii) The partite lemma. 

(iii) The partite construction. 

All versions of the amalgamation technique have this structure. The following
proof of the structural Ramsey theorem and Ramsey classes of structures follows
Nešetřil and Rödl (1989).


7.1. Partite systems (of type $\Delta$)

An a-partite system A is a pair $((X_i)_{i=1}^{a}, \mathcal{M})$ where

(a) $X = \bigcup_{i=1}^{a} X_i$ is an ordered set satisfying $X_1 < X_2 < \cdots < X_a$,
(b) $\mathcal{M} = (\mathcal{M}_\delta; \delta \in \Delta)$, $\mathcal{M}_\delta \subseteq \binom{X}{n_\delta}$,
(c) $|M \cap X_i| \le 1$ for every $M \in \mathcal{M}_\delta$, $i = 1, \ldots, a$, $\delta \in \Delta$.

The sets $X_i$ are called parts of A, and elements of $\mathcal{M}$ are called edges of A.


Property (c) implies that edges are transversals with respect to the family
$X_1 < \cdots < X_a$. Given a subset $Y \subseteq X$, we denote by tr(Y) the
trace of Y, i.e., the set $\{i : Y \cap X_i \neq \emptyset\}$. See fig. 7.1.


Figure 7.1.


A is called a transversal if $|X_i| = 1$ for every $i = 1, \ldots, a$. A is a
subsystem of $B = ((Y_i)_{i=1}^{b}, \mathcal{N})$ if there exists a monotone
injection $\iota: \{1, \ldots, a\} \to \{1, \ldots, b\}$ such that
$X_i \subseteq Y_{\iota(i)}$ for $i = 1, \ldots, a$ and
$\mathcal{M}_\delta = \mathcal{N}_\delta \cap \binom{X}{n_\delta}$ for
$\delta \in \Delta$. An isomorphism is defined as an order- and part-preserving
isomorphism. A subsystem of B isomorphic to A is called a copy of A in B; the
set of such copies is denoted (again) by $\binom{B}{A}$.

7.2. The partite lemma 


Lemma 7.1. Let t be a positive integer and let A and B be a-partite systems.
Moreover, let A be a transversal. Then there exists an a-partite system C such that

$$C \to (B)^A_t.$$

Here the arrow notation has the same meaning as the one above (with
$\binom{C}{B}$ being the set of all copies of B in C, i.e., as a-partite subsystems).


Proof. Set $A = ((X_i)_{i=1}^{a}, \mathcal{M})$, $B = ((Y_i)_{i=1}^{a}, \mathcal{N})$.
Since A is a transversal we may suppose without loss of generality that
$\bigcup_{\delta \in \Delta} \mathcal{M}_\delta$ is the set of all subsets of X
(this may be achieved by adding “dummy” edges to $\mathcal{M}$ and $\mathcal{N}$).
Without loss of generality we may also assume that every vertex $y \in Y$ is
contained in a copy of A; this is a general comment: if $B^*$ is the subsystem
of B induced by $\binom{B}{A}$ and $C^* \to (B^*)^A_t$ then we may easily
construct C such that $C \to (B)^A_t$ by enlarging every
$B^* \in \binom{C^*}{B^*}$ to a system B.

Now take N to be a sufficiently large (indeed, very large) number. Define an
a-partite system $C = ((Z_i)_{i=1}^{a}, \mathcal{O})$,
$\mathcal{O} = (\mathcal{O}_\delta; \delta \in \Delta)$ as follows:
$Z_i = Y_i \times \cdots \times Y_i$ (N times); i.e., each element of $Z_i$ has
the form $(x_j : x_j \in Y_i, j = 1, \ldots, N)$.

Set $Z = \bigcup_{i=1}^{a} Z_i$ and for $j = 1, \ldots, N$, define the projection
$\pi_j : Z \to Y$ by $\pi_j((x_k; k = 1, \ldots, N)) = x_j$.

Clearly $\pi_j$ maps $Z_i$ into $Y_i$.

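We define $\mathcal{O} = (\mathcal{O}_\delta; \delta \in \Delta)$ as follows:
First put $\mathcal{N}_\delta = \mathcal{N}_\delta^0 \cup \mathcal{N}_\delta^1$
where $\mathcal{N}_\delta^0$ is the set of all edges of $\mathcal{N}_\delta$
which belong to a copy of A in B, and
$\mathcal{N}_\delta^1 = \mathcal{N}_\delta \setminus \mathcal{N}_\delta^0$
(note that in general we cannot assume $\mathcal{N}_\delta^1 = \emptyset$). We put

$$\{(x_j^k : j = 1, \ldots, N) : k = 1, \ldots, n_\delta\} \in \mathcal{O}_\delta$$

if $\mathrm{tr}(\{x_j^k : k = 1, \ldots, n_\delta\}) = \mathrm{tr}(\{x_{j'}^k : k = 1, \ldots, n_\delta\})$
for all $j, j' \le N$, and one of the following possibilities occurs:

(1) $\{x_j^k : k = 1, \ldots, n_\delta\} \in \mathcal{N}_\delta^0$ for every $j = 1, \ldots, N$.
(2) There exists a non-empty set $\Lambda \subseteq \{1, \ldots, N\}$ such that
$\{x_j^k : k = 1, \ldots, n_\delta\} = \{x_{j'}^k : k = 1, \ldots, n_\delta\} \in \mathcal{N}_\delta^1$
for all $j, j' \in \Lambda$ and
$\{x_j^k : k = 1, \ldots, n_\delta\} \in \mathcal{N}_\eta^0$ for all $j \notin \Lambda$.


(Note that, in general, $\eta \neq \delta$; however, $\eta$ is uniquely determined by
$\mathrm{tr}(\{x_j^k : k = 1, \ldots, n_\delta\})$.)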

We shall prove $C \to (B)^A_t$ provided N is large enough. This easily follows from
the two facts stated below as Fact 1 and 2.


Fact 1. $A' \in \binom{C}{A}$ iff $\pi_j(A') \in \binom{B}{A}$ for every $j = 1, \ldots, N$.
Proof. Obvious by the definition of $\mathcal{O}$. □


Set $\binom{B}{A} = \{A_1, \ldots, A_r\}$. Set $R = \{1, \ldots, r\}$. Think of
the set $R^N$ endowed with Hales–Jewett (combinatorial) lines. A line is a set L
of the following form: Fix $\Lambda \subseteq \{1, \ldots, N\}$ (non-empty) and
$a^0 = (a_1^0, \ldots, a_N^0) \in R^N$ and set

$$L = \{(a_1, \ldots, a_N) : a_i = a_i^0 \text{ for } i \notin \Lambda,\ a_i = a_j \text{ for } i, j \in \Lambda\}.$$

Clearly $|L| = r$. Given $a = (a_1, \ldots, a_N) \in R^N$, denote by V(a) the set
of all vertices x of Z which satisfy $\pi_j(x) \in A_{a_j}$ for $j = 1, \ldots, N$.
Set $V(L) = \bigcup_{a \in L} V(a)$. By virtue of Fact 1, the set $\binom{C}{A}$
is in 1-1 correspondence with the set $R^N$.
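For small parameters the lines of $R^N$ can simply be listed, which may help in reading the definition. The following Python sketch is illustrative only: the name `combinatorial_lines` and the `'*'` template encoding (a star marks the coordinates in $\Lambda$) are our choices, not notation from the text.

```python
from itertools import product


def combinatorial_lines(r, N):
    """Yield every Hales-Jewett line of R^N, R = {1, ..., r}, as a tuple of
    r points.  A template fixes a letter on the coordinates outside Lambda
    and carries '*' on the (non-empty) set Lambda of moving coordinates."""
    letters = list(range(1, r + 1))
    for template in product(letters + ['*'], repeat=N):
        if '*' not in template:
            continue  # Lambda must be non-empty
        yield tuple(
            tuple(a if c == '*' else c for c in template)
            for a in letters
        )


lines = list(combinatorial_lines(3, 2))
print(len(lines))  # (r + 1)**N - r**N templates contain a star
for line in lines:
    print(line)
```

Every line has exactly r points, in agreement with $|L| = r$. For r = 3 and N = 2 the main diagonal of the 3 × 3 grid is a line, but the anti-diagonal is not, since its moving coordinates do not all carry the same letter.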


Fact 2. Let L be a line of $R^N$. Then V(L) induces a copy of B in C.
Proof. Check the definition of C. □


Now we invoke the Hales–Jewett theorem 2.2 and choose N sufficiently large
so that for every partition of $R^N$ into t classes, one of the classes contains a
line. This implies $C \to (B)^A_t$. Indeed, let
$\binom{C}{A} = \mathcal{A}_1 \cup \cdots \cup \mathcal{A}_t$ be a partition. By
Fact 1 this induces a partition $R^N = \mathcal{A}'_1 \cup \cdots \cup \mathcal{A}'_t$,
by defining $a \in \mathcal{A}'_i$ iff V(a) induces a copy belonging to
$\mathcal{A}_i$. By the Hales–Jewett theorem there exists a monochromatic line L
in $R^N$. This in turn, using Fact 2, yields $B' \in \binom{C}{B}$ such that
$\binom{B'}{A}$ is monochromatic. □


7.3. The partite construction 


In this part we prove the structural Ramsey theorem by means of the partite
construction. We follow closely the construction given in Nešetřil and Rödl (1981,
1982). Let k, A, B be fixed. Consider A as a transversal a-partite system and B as
a transversal b-partite system. Set explicitly

$$B = (\{y_1, \ldots, y_b\}, \mathcal{N}).$$

Set $p = r(a, t, b) = \min\{n : n \to (b)^a_t\}$ (the classical Ramsey number).
Set $q = \binom{p}{a}$ and $\binom{\{1, \ldots, p\}}{a} = \{M^1, \ldots, M^q\}$.
We shall construct “pictures” $P^0, \ldots, P^k, \ldots, P^q$ by induction on k.
Picture $P^q$ will be the desired system C.

Let $P^0 = ((X_i^0)_{i=1}^{p}, \mathcal{O}^0)$ be a p-partite system where for
each choice of b parts $X^0_{i_1}, \ldots, X^0_{i_b}$, the subsystem of $P^0$
induced by them contains a copy of B. Such a “picture” $P^0$ may be formed as a
disjoint union of copies of B.

Suppose picture $P^k = ((X_i^k)_{i=1}^{p}, \mathcal{O}^k)$, $k < q$, is given.
Consider $M^{k+1}$ and the a-partite system $D^{k+1}$ induced in $P^k$ by the
parts $X_i^k$ with $i \in M^{k+1}$. By the partite lemma there exists an
a-partite system $E^{k+1}$ such that

$$E^{k+1} \to (D^{k+1})^A_t.$$

Extend each copy of $D^{k+1}$ in $E^{k+1}$ to a copy of $P^k$ in such a way that
the distinct copies of $P^k$ intersect only in vertices of $E^{k+1}$. In this
amalgamation the parts of distinct copies of $P^k$ are preserved.

Denote the resulting amalgamation by $E^{k+1} * P^k$. (For a more explicit
definition, see Nešetřil and Rödl (1984).) Set $P^{k+1} = E^{k+1} * P^k$ (see fig. 7.2).


Figure 7.2.


Finally, put $C = P^q$. We claim that C has the desired properties.

Claim 1. Every irreducible subsystem in C is a subsystem of B.

Proof. Induction on k. This being trivial for k = 0, in the inductive step the
amalgamation does not create any new irreducible system. □


Claim 2. $C \to (B)^A_t$.

Proof. Backward induction on k = q, q − 1, ..., 1. In the inductive step
(k + 1) → k we apply the partite lemma and find a copy of $P^k$ such that all
copies of A with trace $M^{k+1}$ are monochromatic. This leaves us, for k = 0,
with a copy P' of $P^0$ in C where the color of a copy of A in P' depends only
on its trace. However, such a copy of $P^0$ contains a monochromatic copy of B
by the construction of $P^0$ and the assumption that $p \to (b)^a_t$. □


Proof of Theorem 6.6 (Ramsey theorem for simple hypergraphs). Note that
transversal edges intersect in at most one vertex and that lines form a simple
hypergraph. Consequently, for two different lines L and L', the sets V(L) and
V(L') intersect either in a single vertex or a single edge (see the above proof of
the partite lemma). □


Only recently has it been discovered that one can apply the amalgamation
technique to “algebraic” Ramsey-type theorems (such as the Van der Waerden
and vector space theorems). This has been done in Frankl et al. (1987), Nešetřil
and Rödl (1987b) and Prömel and Voigt (1988). In particular, Frankl et al. (1987)
contains an induced and restricted space theorem (i.e., the analogues of
Theorems 5.1 and 5.5 for spaces) and Nešetřil and Rödl (1987b) contains a
structural space theorem and a Ramsey classes of structures theorem (i.e.,
analogues of Theorems 5.4 and 5.7 for spaces). Let us finally remark that once a
proper amalgamation pattern has been realized one can proceed in complete
analogy with the methods described in this section. To see this one should
compare the papers Nešetřil and Rödl (1989, 1990b).


1. Introduction


The concept of uniformly distributed sequences and sets plays a fundamental role 
in many branches of mathematics (measure theory, ergodic theory, diophantine 
approximation, mathematical statistics, discrete geometry, numerical integration, 
etc.) This chapter explores the combinatorial background of many of these 
results. See also the survey article of Sós (1983b), and the monograph by Beck
and Chen (1987).

Measure theoretic discrepancy results are accumulated in two complementary 
chapters of number theory, called uniform distribution and irregularities of 
distribution. The object of these theories is to measure the uniformity (or 
non-uniformity) of sequences and point distributions. For instance: how uniformly
can N points in the unit cube be distributed relative to a given family of “nice”
sets (e.g., boxes with sides parallel to the coordinate axes, rotated boxes, balls, all
convex sets, etc.)? The theory was initiated by the following theorem of
Aardenne-Ehrenfest (Van der Corput’s conjecture): for every infinite sequence of
reals in [0, 1] and for every k > 0, there exists a beginning section
$(x_1, \ldots, x_n)$ of the sequence and a subinterval $(\alpha, \beta)$ such that
the number of elements of this beginning section in this subinterval differs from
$n(\beta - \alpha)$ (the number one expects) by at least k. The best possible
effective result on this problem is due to Schmidt; it is equivalent to the
following basic result in the theory of uniform distribution.


Theorem 1.1 (Schmidt 1972). Let P be an arbitrary set of N points in the unit
square $[0,1)^2$. Then there exists a rectangle $B \subseteq [0,1)^2$ with sides
parallel to the coordinate axes such that

$$\bigl|\, |P \cap B| - N \cdot \mathrm{area}(B) \,\bigr| > c \log N$$

(where c is an absolute constant).


The left-hand side of this inequality measures the ‘discrepancy’ (deviation 
from the uniform distribution) of P in B. As a fascinating fact, we mention that 
balls have much greater discrepancy than boxes with sides parallel to the axes. 
Now we have a good understanding of this phenomenon, as we shall see later. 

The object of combinatorial discrepancy theory is to color a set with two or
more colors so that each set in a given family is colored as uniformly as possible.
As a beautiful example, we mention Roth’s theorem on long arithmetic
progressions.


Theorem 1.2 (Roth 1964). For any partition of the integers 1, 2, ..., N into two
sets $S_1$ and $S_2$, there exists an arithmetic progression
$P = \{a, a+d, \ldots, a+kd\} \subseteq \{1, 2, \ldots, N\}$ such that

$$\bigl|\, |P \cap S_1| - |P \cap S_2| \,\bigr| > c_1 N^{1/4}$$

(where $c_1$ is a positive absolute constant).


It took more than a decade to realize the close relationship between these
areas. We can now say that they represent the continuous and the discrete aspects
of the very same coherent theory. A general form of these problems is the
following: given a measure space, approximate the measure on a subfamily of the
measurable sets by a measure where each point has measure 0 or 1. Nontrivial
“transference theorems” help to transform combinatorial and measure theoretic
results into each other.

Compare Roth's theorem also to the following fundamental results of Ramsey 
theory (see chapter 25). 


Theorem 1.3 (Van der Waerden 1927). For any integers k and r there exists a
W(k, r) such that if N > W(k, r) then for every r-coloring of {1, 2, ..., N} there
exists a monochromatic arithmetic progression of length k.
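For very small parameters the statement of Theorem 1.3 can be verified by exhaustive search. The sketch below is illustrative only (the helper names `has_mono_ap` and `every_coloring_has_ap` are ours); it checks the well-known fact that every 2-coloring of {1, ..., 9} contains a monochromatic 3-term arithmetic progression, while {1, ..., 8} admits a coloring without one. The search is exponential in N and is meant only as a sanity check.

```python
from itertools import product


def has_mono_ap(coloring, k):
    """coloring[i] is the color of the integer i + 1; report whether some
    k-term arithmetic progression is monochromatic (k >= 2 assumed)."""
    n = len(coloring)
    for a in range(n):
        # the last term a + (k - 1) * d must stay below n
        for d in range(1, (n - 1 - a) // (k - 1) + 1):
            if len({coloring[a + i * d] for i in range(k)}) == 1:
                return True
    return False


def every_coloring_has_ap(n, k, r):
    """Check all r**n colorings of {1, ..., n}."""
    return all(has_mono_ap(c, k) for c in product(range(r), repeat=n))


print(every_coloring_has_ap(8, 3, 2))  # False: e.g. 0, 0, 1, 1, 0, 0, 1, 1
print(every_coloring_has_ap(9, 3, 2))  # True
```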


Theorem 1.4 (Ramsey 1930). For any integers t and r there exists an R(t, r) such
that if n > R(t, r) and the edges of $K_n$ are r-colored, then there must be a
monochromatic $K_t$.


These theorems have the same structure as Roth’s: given an underlying set S 
and a family of subsets of this set, the claim is that the underlying set has no 
partition which splits each set contained in the given family “reasonably well” 
(only in this case any proper splitting is accepted). 

Discarding the special structure of the system we can formulate the basic
problem in combinatorial discrepancy theory. Let $S = \{x_1, \ldots, x_n\}$ be a
finite set and $\mathcal{H} = \{A_1, \ldots, A_m\}$ a family of subsets of S. Our
goal is to find a partition $S = S_1 \cup S_2$, $S_1 \cap S_2 = \emptyset$ that
splits each set in the family $\mathcal{H}$ as equally as possible. In other
words, we want to find the least integer D for which there exists a 2-coloring
of the underlying set such that in each $A_j$, the difference between the
numbers of red and blue elements is at most D.

Often we shall describe the partition by a function $f: S \to \{-1, 1\}$. Then the
discrepancy of $\mathcal{H}$ is defined by

$$\mathcal{D}(\mathcal{H}) = \min_{f} \max_{1 \le j \le m} \Bigl|\sum_{x \in A_j} f(x)\Bigr|,$$

where the minimum is taken over all functions $f: S \to \{-1, 1\}$.
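The definition can be evaluated directly on small examples by trying all $2^n$ colorings. The following Python sketch is an illustration under our own conventions (the ground set is taken to be {0, ..., n − 1}, the family is assumed non-empty, and the function name `discrepancy` is ours); it is exponential in n and not meant as a practical algorithm.

```python
from itertools import combinations, product


def discrepancy(n, family):
    """Exact discrepancy of a non-empty family of subsets of {0, ..., n-1}:
    minimum over all f: S -> {-1, 1} of the largest |sum of f over a set|."""
    best = None
    for f in product((-1, 1), repeat=n):
        worst = max(abs(sum(f[x] for x in A)) for A in family)
        if best is None or worst < best:
            best = worst
    return best


# All intervals of {0, ..., 4}: the alternating coloring is optimal.
intervals = [set(range(i, j)) for i in range(5) for j in range(i + 1, 6)]
print(discrepancy(5, intervals))

# All 3-element subsets of {0, ..., 4}: some triple is always monochromatic.
triples = [set(c) for c in combinations(range(5), 3)]
print(discrepancy(5, triples))
```

The first family has discrepancy 1 and the second has discrepancy 3, the size of its sets, because five points colored with two colors always contain three points of one color.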


Best and worst families. Although the systematic investigation of combinatorial
discrepancy started just a few years ago, there is a fundamental old result which
characterizes the “best” families, those for which $\mathcal{D}(\mathcal{H}) \le 1$
and for which this property is inherited by subhypergraphs. These are the
unimodular hypergraphs, whose theory was developed for its importance in integer
programming (see chapter 30).

A hypergraph $\mathcal{H}$ is unimodular if its incidence matrix A is totally
unimodular (i.e., every square submatrix of A has determinant 0, +1 or −1). See
chapters 7 and 30 for examples of such hypergraphs; here we mention hypergraphs
whose edges are the vertex sets of directed paths in an arborescence. For
$X \subseteq S$, the restriction $\mathcal{H}_X$ is defined as the family
$\{A \cap X \mid A \in \mathcal{H}\}$.


Theorem 1.5 (Ghouila-Houri 1962). $\mathcal{H}$ is unimodular iff
$\mathcal{D}(\mathcal{H}_X) \le 1$ for all restrictions $\mathcal{H}_X$ of $\mathcal{H}$.


Unimodular hypergraphs have the following stronger property. 


Theorem 1.6. If $\mathcal{H} = (V, E)$ is unimodular then for any $p \in [-1, 1]^V$
there exists $\varepsilon \in \{-1, 1\}^V$ such that for every $A \in E$,


$$\Bigl|\sum_{i \in A} (\varepsilon_i - p_i)\Bigr| < 2.$$

Informally, an arbitrary weight distribution on S can be very well approximated
with 0-1 weights.

Furthermore, we have the following. 


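Theorem 1.7. If $\mathcal{H}$ is unimodular, then for every r > 1 there exists an
r-equipartition $S = S_1 \cup \cdots \cup S_r$ so that for every
$A \in \mathcal{H}$ and $1 \le j \le r$,

$$\Bigl\lfloor \frac{|A|}{r} \Bigr\rfloor \le |A \cap S_j| \le \Bigl\lceil \frac{|A|}{r} \Bigr\rceil.$$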


The “worst” families from the point of view of discrepancy are the
“non-2-colorable families”, i.e., families with chromatic number
$\chi(\mathcal{H}) > 2$. (Recall from chapter 7 that a hypergraph is
non-2-colorable iff for any partition $S = S_1 \cup S_2$ there exists an
$A \in \mathcal{H}$ so that $A \subseteq S_1$ or $A \subseteq S_2$.) An r-uniform
hypergraph is 2-colorable if and only if its discrepancy is less than r. (Note
that this remark also shows that the computation of the discrepancy of a
hypergraph is NP-hard.)

One of the most extensively studied fields of combinatorics is Ramsey theory,
which can be viewed as the theory of non-2-colorable families (see chapter 7).
Many of the results and problems there are relevant to our subject.

Considering the results in Ramsey theory we must realize the white spots and
gaps in discrepancy theory. A large variety of Ramsey-type results are available 
not only for graphs and hypergraphs but for different structures like vector 
spaces, combinatorial lines, parameter-sets, groups, euclidean spaces, topological 
spaces, sets of solutions of linear systems, etc. However, an analogous dis- 
crepancy theory is missing for most of these structures. 

We can say that in the class of hypergraphs unimodular families are at one (the
“good”) end and non-2-colorable families at the other (“bad”) end. We
conclude this section with an example of A.J. Hoffman showing that the union of
two unimodular (so best!) families can be non-2-colorable (so worst!).


Example 1.8 (Hoffman 1987). Let T be an arbitrary arborescence rooted at r. Let
$\mathcal{H}_1$ consist of the arc-sets of directed paths in T from r to a leaf.
Let $\mathcal{H}_2$ consist of the sets B(x), where B(x) is the set of edges
with their tails at node x, for each non-leaf node x. Obviously $\mathcal{H}_1$
and $\mathcal{H}_2$ are unimodular, but $\mathcal{H}_1 \cup \mathcal{H}_2$ is
not even 2-colorable. (Note that we can choose the tree so that
$\mathcal{H}_1 \cup \mathcal{H}_2$ is k-uniform for a given k.)


A very simple unimodular hypergraph is the hypergraph of all intervals in a 
permutation (a totally ordered set). How large can the discrepancy of the
union of such hypergraphs be? For two permutations, the discrepancy is at most 2;
but the following problem, due to Beck, has been open for quite a while. 


Problem 1.9. Is it true that the hypergraph consisting of the intervals of three
permutations of a set X has discrepancy O(1), independent of |X|? 


Recently Bohus (1990) gave the upper bound O(log |X|) for this discrepancy, 
not only for three, but for any constant number of permutations. 


2. Bounds on $\mathcal{D}(\mathcal{H})$


Many of the results in this section have applications in different fields. In fact, 
many of the problems originated in different branches of mathematics. 
There is a trivial upper bound on the combinatorial discrepancy: 


$$\mathcal{D}(\mathcal{H}) \le \max_{A \in \mathcal{H}} |A|.$$


If $\mathcal{H}$ is k-uniform (i.e., |A| = k for all $A \in \mathcal{H}$) then
equality holds iff $\mathcal{H}$ is not 2-colorable.

To bound the discrepancy in terms of the number of edges $m = |\mathcal{H}|$,
observe that a pair of vertices contained in the same set of edges can be deleted
without decreasing the discrepancy. Repeating this we end up with a hypergraph
in which every edge has at most $2^m - 1$ elements and hence

$$\mathcal{D}(\mathcal{H}) \le 2^m - 1.$$


This upper bound can be easily improved. The first result in this direction was the
theorem of Olson and Spencer (1978) where they proved the upper bound

$$\mathcal{D}(\mathcal{H}) \le c\, m^{1/2} \log m.$$

The best possible result is the following.

Theorem 2.1 (Spencer 1985). For every $\mathcal{H}$ with $|\mathcal{H}| = m$,

$$\mathcal{D}(\mathcal{H}) \le 6 m^{1/2}.$$


For a proof, which is an involved application of the probabilistic method, see
chapter 33. This result is best possible (up to a constant): if an Hadamard matrix
of order m + 1 exists, then there exists a hypergraph $\mathcal{H}$ with
$|\mathcal{H}| = m$ such that $\mathcal{D}(\mathcal{H}) \ge \frac{1}{2} m^{1/2}$
(see Corollary 2.11).

Spencer’s theorem has interesting applications in Fourier analysis to
“Rudin–Shapiro sequences” (see Spencer 1985), and to Littlewood’s problem on “flat
polynomials” (see Beck 1991b).

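It is somewhat surprising that there is an upper bound on
$\mathcal{D}(\mathcal{H})$ depending only on the maximum degree
$\Delta(\mathcal{H}) = \max_{x \in S} |\{A \in \mathcal{H} : x \in A\}|$.


Theorem 2.2 (Beck and Fiala 1981). Let $\mathcal{H}$ be a finite hypergraph. Then

$$\mathcal{D}(\mathcal{H}) < 2\Delta(\mathcal{H}).$$


In fact, we have the following more general result.


Theorem 2.2'. Let us associate with every $i \in S$ a real number
$p_i \in [-1, +1]$. Then there exist $\varepsilon_i \in \{-1, +1\}$ ($i \in S$)
such that

$$\max_{A \in \mathcal{H}} \Bigl|\sum_{i \in A} (\varepsilon_i - p_i)\Bigr| < 2\Delta(\mathcal{H}).$$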


Proof. The key idea is to consider variables $\varepsilon_i$ ($i \in S$) lying
anywhere in [−1, +1]. Initially $\varepsilon_i = p_i$; all sets then have zero
“discrepancy”. At the end each $\varepsilon_i$ must be −1 or +1, providing the
coloration in the theorem. We describe the procedure that is to be iterated to
go from the initial trivial “coloration” to the final one.

Suppose we have some current assignment $\varepsilon_i$. Call i fixed if
$\varepsilon_i = \pm 1$ and floating otherwise. Let $A = [a_{ij}]$ denote the
incidence matrix of the family $\mathcal{H}$. Call row j ignored if
$\sum_i a_{ij} \le \Delta(\mathcal{H})$ (the sum over the floating i) and active
otherwise. As each column sum is at most $\Delta(\mathcal{H})$, there are fewer
active rows than floating columns. Find $y_i$, for each floating i, with
$\sum_i a_{ij} y_i = 0$ for each active row j. As this system is
underdetermined, there is a nonzero solution. Now replace $\varepsilon_i$ by
$\varepsilon_i + \lambda y_i$ where $\lambda$ is chosen so that all
$\varepsilon_i$ remain in [−1, +1] and some floating $\varepsilon_i$ becomes
±1 (i.e., fixed).

Iterate the above procedure until all $\varepsilon_i = \pm 1$. To see that the
values obtained satisfy the requirement of the theorem, observe that a given row
has zero “discrepancy” (i.e., $\sum_i a_{ij}(\varepsilon_i - p_i) = 0$) until it
becomes ignored. After that, each $\varepsilon_i$ still floating changes by less
than 2, and the row contains at most $\Delta(\mathcal{H})$ of them; hence the
sum $\sum_i a_{ij}(\varepsilon_i - p_i)$ changes by less than
$2\Delta(\mathcal{H})$. □
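The iteration in the proof above can be carried out mechanically. The following Python sketch is one possible rendering under our own assumptions: sets are given as Python sets over the ground set {0, ..., n − 1}, exact rational arithmetic (`fractions.Fraction`) replaces real numbers so that a variable reaches ±1 exactly, and the helper names `nullspace_vector` and `beck_fiala` are ours, not from the text.

```python
from fractions import Fraction


def nullspace_vector(rows, ncols):
    """Nonzero rational y with (row . y) = 0 for every row, found by
    Gauss-Jordan elimination; None if only y = 0 works."""
    mat = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if p is None:
            continue
        mat[r], mat[p] = mat[p], mat[r]
        mat[r] = [x / mat[r][c] for x in mat[r]]
        for i in range(len(mat)):
            if i != r and mat[i][c] != 0:
                factor = mat[i][c]
                mat[i] = [x - factor * z for x, z in zip(mat[i], mat[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(ncols) if c not in pivots]
    if not free:
        return None
    y = [Fraction(0)] * ncols
    y[free[0]] = Fraction(1)          # set one free variable to 1
    for row_index, c in enumerate(pivots):
        y[c] = -mat[row_index][free[0]]
    return y


def beck_fiala(n, family, p=None):
    """Round p in [-1, 1]^n (default: all zeros) to eps in {-1, 1}^n by the
    floating-variable procedure; returns (eps, maximum degree)."""
    delta = max(sum(1 for A in family if i in A) for i in range(n))
    eps = [Fraction(x) for x in (p if p is not None else [0] * n)]
    while True:
        floating = [i for i in range(n) if abs(eps[i]) != 1]
        if not floating:
            return [int(e) for e in eps], delta
        # active sets contain more than delta floating elements
        active = [A for A in family
                  if sum(1 for i in floating if i in A) > delta]
        rows = [[1 if i in A else 0 for i in floating] for A in active]
        y = nullspace_vector(rows, len(floating))  # fewer rows than columns
        # largest step that keeps every eps_i inside [-1, 1]
        lam = min(
            (1 - eps[i]) / y[k] if y[k] > 0 else (-1 - eps[i]) / y[k]
            for k, i in enumerate(floating) if y[k] != 0
        )
        for k, i in enumerate(floating):
            eps[i] += lam * y[k]


family = [set(range(8)), {0, 1, 2, 3}, {4, 5, 6, 7}]
eps, delta = beck_fiala(8, family)
print(eps, delta, [sum(eps[i] for i in A) for A in family])
```

Each pass fixes at least one floating variable, so the loop runs at most n times. The counting argument from the proof guarantees that the linear system always has fewer equations than unknowns, which is why `nullspace_vector` never returns `None` here.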


Theorem 2.2 was motivated by the following “integer making lemma” (and in
fact is a generalization of it).


Lemma 2.3 (Baranyai 1974). Let $A = (a_{ij})$ be a matrix of real elements. Then
there exists an integer matrix $A^* = (a^*_{ij})$ such that

$$|a_{ij} - a^*_{ij}| < 1 \quad \text{for all } i, j,$$

$$\Bigl|\sum_i a_{ij} - \sum_i a^*_{ij}\Bigr| < 1 \quad \text{for all } j,$$

$$\Bigl|\sum_j a_{ij} - \sum_j a^*_{ij}\Bigr| < 1 \quad \text{for all } i,$$

and

$$\Bigl|\sum_i \sum_j a_{ij} - \sum_i \sum_j a^*_{ij}\Bigr| < 1.$$


This lemma was the basic tool in Baranyai’s theorem on the factorization of the 
complete uniform hypergraph (see chapters 7 and 14). The lemma can also be 
proved using the integrality theorem of flow theory (see chapter 2). 

We suspect that Theorem 2.2 can be essentially improved. The following 
conjecture would also generalize Spencer’s theorem 2.1. 


Conjecture 2.4 (Beck–Fiala).
\[ \mathcal{D}(\mathcal{H}) < c\,(\Delta(\mathcal{H}))^{1/2}. \]


If true then it is best possible apart from the constant factor $c$. Corollary 2.6
below justifies the weaker conjecture $\mathcal{D}(\mathcal{H}) < (\Delta(\mathcal{H}))^{1/2+\varepsilon}$ when both $|S|$ and $|\mathcal{H}|$
are "subexponential" functions of the maximum degree. For later application, we
state first a more general result. 


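Theorem 2.5 (Beck 1981b). Let $\mathcal{H}$ be a finite hypergraph with $\bigcup \mathcal{H} = S$. Let $M$
and $K$ be natural numbers such that
\[ \Delta(\{A \in \mathcal{H} : |A| \ge M\}) \le K. \]
Then
\[ \mathcal{D}(\mathcal{H}) < c\,(M + K \log K)^{1/2} \cdot (\log|\mathcal{H}|)^{1/2} \cdot \log|S|. \]

Choosing $M = 1$ and $K = \Delta(\mathcal{H})$, we obtain the following.

Corollary 2.6. For any finite hypergraph with $\Delta = \Delta(\mathcal{H})$, we have
\[ \mathcal{D}(\mathcal{H}) < c \cdot \Delta^{1/2} \cdot \log|\mathcal{H}| \cdot \log|S|. \]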


The following somewhat technical theorem, which is useful in applications, is a 
generalization of Corollary 2.6. 

Theorem 2.7 (Beck 1988). Let $\mathcal{H}$ be a finite hypergraph with $\bigcup \mathcal{H} = S$. Suppose
that there is a second family $\mathcal{G}$ of subsets of $S$ such that

(i) $\Delta(\mathcal{G}) \le D$; and

(ii) every $A \in \mathcal{H}$ can be represented as the disjoint union of at most $K$ elements
of $\mathcal{G}$. Then
\[ \mathcal{D}(\mathcal{H}) < c \cdot (K \cdot D \cdot \log D \cdot \log|\mathcal{H}|)^{1/2} \cdot \log|S|. \]


Note that if $\mathcal{G} = \mathcal{H}$ then we obtain Corollary 2.6.

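We have to remark that there are very few general lower bounds on $\mathcal{D}(\mathcal{H})$. The
following one is based on linear algebra. To state it in its natural generality,
define the $\ell_2$-discrepancy of a hypergraph $\mathcal{H}$ by
\[ \mathcal{D}_2(\mathcal{H}) = \min_{\varepsilon \in \{-1,1\}^S} \Bigl( \sum_{A \in \mathcal{H}} \Bigl( \sum_{i \in A} \varepsilon_i \Bigr)^2 \Bigr)^{1/2}. \]
Clearly $\mathcal{D}_2(\mathcal{H})\, m^{-1/2} \le \mathcal{D}(\mathcal{H}) \le \mathcal{D}_2(\mathcal{H})$. We denote by $\lambda_{\min}(M)$ the least eigenvalue
of the matrix $M$. We recall: $|\mathcal{H}| = m$ and $|S| = n$.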


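Theorem 2.8 (Lovász–T. Sós). Let $M$ be the incidence matrix of $\mathcal{H}$. Then

(i) $\mathcal{D}_2(\mathcal{H}) \ge (n\,\lambda_{\min}(M^T M))^{1/2}$,

(ii) if for some diagonal matrix $D$, the matrix $M^T M - D$ is positive semidefinite,
then $\mathcal{D}_2(\mathcal{H}) \ge (\mathrm{Tr}\, D)^{1/2}$. Note that $T$ stands for transpose.


Proof. Let $f \in \{-1, 1\}^S$ attain the minimum in the definition of $\mathcal{D}_2(\mathcal{H})$. Then
\[ \mathcal{D}_2(\mathcal{H})^2 = \sum_{A \in \mathcal{H}} \Bigl( \sum_{i \in A} f_i \Bigr)^2 = (Mf)^T (Mf) = f^T M^T M f \ge f^T f \, \lambda_{\min}(M^T M) = n\,\lambda_{\min}(M^T M). \]
This proves (i); the proof of (ii) is similar. □
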
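Corollary 2.9. If $\mathcal{H}$ has constant pair-degree, i.e.,
\[ |\{A : i, j \in A \in \mathcal{H}\}| = \lambda \]
for every $i, j \in S$, $i \ne j$, and $d_i$ denotes the degree of $i \in S$, then
\[ \mathcal{D}(\mathcal{H}) \ge \Bigl( \frac{1}{m} \sum_{i=1}^{n} (d_i - \lambda) \Bigr)^{1/2}. \]


Corollary 2.10. Let $\mathcal{H}$ be formed by the set of lines in a finite projective plane of
order $p$. Then
\[ \mathcal{D}(\mathcal{H}) \ge \sqrt{p}. \]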
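
As a small sanity check of the eigenvalue bound, consider the projective plane of order 2 (the Fano plane). There every point has degree 3 and every pair of points lies on one line, so $M^T M = 2I + J$, $\lambda_{\min}(M^T M) = 2$, and the bound gives $\mathcal{D}_2(\mathcal{H})^2 \ge 14$ and $\mathcal{D}(\mathcal{H}) \ge \sqrt{2}$. The following brute-force Python sketch is illustrative only; the line labelling `FANO_LINES` and the function name `discrepancies` are choices made for this example.

```python
from itertools import product

# One standard labelling of the seven lines of the Fano plane on points 1..7.
FANO_LINES = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
              (2, 5, 7), (3, 4, 7), (3, 5, 6)]


def discrepancies(lines, n):
    """Return (min over colorings of max |line sum|, min of sum of squared line sums)."""
    best_sup, best_sq = None, None
    for signs in product((1, -1), repeat=n):
        sums = [sum(signs[i - 1] for i in line) for line in lines]
        sup = max(abs(s) for s in sums)
        sq = sum(s * s for s in sums)
        best_sup = sup if best_sup is None else min(best_sup, sup)
        best_sq = sq if best_sq is None else min(best_sq, sq)
    return best_sup, best_sq


disc, disc2_squared = discrepancies(FANO_LINES, 7)
```

Here `disc` is the sup-norm discrepancy and `disc2_squared` is the square of the $\ell_2$-discrepancy; both can be compared directly with the lower bounds $\sqrt{2}$ and 14.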


Corollary 2.11. Let $H$ be an $n \times n$ Hadamard matrix, i.e., a $\pm 1$ matrix whose
column vectors are mutually orthogonal, with all 1s in the first row. Let $\mathcal{H}$ be the
hypergraph whose incidence matrix is obtained from $H$ by replacing the $-1$s by 0s.
Then
\[ \mathcal{D}(\mathcal{H}) \ge \tfrac{1}{2}\sqrt{n}. \]


This corollary proves that Theorem 2.1 is best possible apart from the constant 
factor. 


The most important application of Theorem 2.8 is Roth’s theorem (Theorem 
1.2, see section 5). 


3. Various concepts of discrepancy 


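Suppose we want to split the sets in $\mathcal{H}$ in ratio $\alpha$, $1 - \alpha$. In other words, we want
to find a system of representatives of $\mathcal{H}$ so that the number of representatives in
every set $A \in \mathcal{H}$ is as close to $\alpha|A|$ as possible. Then, setting $\lambda = 2\alpha - 1$,
\[ \mathcal{D}(\mathcal{H}; \lambda) = \min_{\varepsilon \in \{-1,1\}^S} \max_{A \in \mathcal{H}} \Bigl| \sum_{i \in A} (\varepsilon_i - \lambda) \Bigr| \]
measures the corresponding discrepancy. Obviously
\[ \mathcal{D}(\mathcal{H}; 0) = \mathcal{D}(\mathcal{H}). \]

More generally, we may consider a weight-function $p: S \to [-1, 1]$ and the
corresponding discrepancy
\[ \mathcal{D}(\mathcal{H}; p) = \min_{\varepsilon \in \{-1,1\}^S} \max_{A \in \mathcal{H}} \Bigl| \sum_{i \in A} (\varepsilon_i - p_i) \Bigr| \]
(this value has come up in Theorem 2.2'). The inhomogeneous discrepancy of $\mathcal{H}$
is defined by
\[ \mathcal{D}_i(\mathcal{H}) = \max_p \mathcal{D}(\mathcal{H}; p) \]
and measures how well an arbitrary weight distribution on $S$ can be approximated
with 0-1 measures regarding the family $\mathcal{H}$. Considering the particular cases
$p_1 = \cdots = p_n = \lambda$ we define the diagonal discrepancy by
\[ \mathcal{D}_d(\mathcal{H}) = \max_\lambda \mathcal{D}(\mathcal{H}; \lambda). \]
The hereditary discrepancy of $\mathcal{H}$ is defined by
\[ \mathcal{D}_h(\mathcal{H}) = \sup_{X \subseteq S} \mathcal{D}(\mathcal{H}_X), \]
where $\mathcal{H}_X = \{A \cap X : A \in \mathcal{H}\}$ denotes the restriction of $\mathcal{H}$ to $X$.
Ghouila-Houri's theorem 1.5 asserts that a hypergraph is totally unimodular iff its
hereditary discrepancy is at most 1.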
Observe that by adding new elements to some of the sets in $\mathcal{H}$ appropriately we
can achieve that this enlarged hypergraph will have discrepancy 0. This means
that $\mathcal{D}(\mathcal{H})$ can be small by accident, while $\mathcal{D}_i(\mathcal{H})$ and $\mathcal{D}_h(\mathcal{H})$ depend on more
intrinsic properties of $\mathcal{H}$. In fact, $\mathcal{D}(\mathcal{H})$ can be much smaller than $\mathcal{D}_i(\mathcal{H})$ or
$\mathcal{D}_h(\mathcal{H})$. A simple example is the following. Let $S = \{1, \dots, 4n\}$ and
\[ \mathcal{H} = \{A \mid A \subseteq S,\ |A \cap \{1, \dots, 2n\}| = |A|/2\}. \]
Then $\mathcal{D}(\mathcal{H}) = 0$ but $\mathcal{D}_i(\mathcal{H}) = n$ and $\mathcal{D}_h(\mathcal{H}) = n$.
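
The two sides of this example can be checked by brute force for a small case. The Python sketch below is illustrative only; it takes $n = 2$ (so $|S| = 8$), and the names `discrepancy`, `family` and `traces` are hypothetical. It computes the discrepancy of the whole family and of its restriction to $X = \{1, \dots, 2n\}$, which consists of all subsets of $X$.

```python
from itertools import combinations, product


def discrepancy(sets, points):
    """Brute-force min over +-1 colorings of max_A |sum_{i in A} eps_i|."""
    best = None
    for signs in product((1, -1), repeat=len(points)):
        color = dict(zip(points, signs))
        worst = max(abs(sum(color[i] for i in A)) for A in sets)
        best = worst if best is None else min(best, worst)
    return best


n = 2
first = list(range(1, 2 * n + 1))            # {1, ..., 2n}
second = list(range(2 * n + 1, 4 * n + 1))   # {2n+1, ..., 4n}

# All sets with exactly half of their elements in the first half of S.
family = [frozenset(a) | frozenset(b)
          for k in range(2 * n + 1)
          for a in combinations(first, k)
          for b in combinations(second, k)]

# Restriction of the family to X = first half of S.
traces = {A & frozenset(first) for A in family}

whole = discrepancy(family, first + second)
restricted = discrepancy(traces, first)
```

Coloring the first half $+1$ and the second half $-1$ balances every set of the family, while the restriction already forces discrepancy $n$, so the hereditary discrepancy is at least $n$.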


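We mention the trivial inequalities
\[ \mathcal{D}(\mathcal{H}) \le \min\{\mathcal{D}_i(\mathcal{H}), \mathcal{D}_h(\mathcal{H})\} \]
and
\[ \mathcal{D}(\mathcal{H}) \le \mathcal{D}_d(\mathcal{H}) \le \mathcal{D}_i(\mathcal{H}). \]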


The following nontrivial inequality was first explicitly formulated in Lovasz et 
al. (1986). The proof is identical with that of Lemma 3 in Beck and Spencer 
(1984b). 


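Theorem 3.1. For every hypergraph $\mathcal{H}$,
\[ \mathcal{D}_i(\mathcal{H}) \le 2\,\mathcal{D}_h(\mathcal{H}). \]


Proof. Let, for each $i \in S$, a weight $-1 \le p_i \le 1$ be given. Let $\alpha_i = (1 + p_i)/2 \in
[0, 1]$. Assume first that all the $\alpha_i$ have finite binary expansion, i.e., there is a
natural number $n$ so that $2^n \alpha_i \in \mathbb{Z}$ for all $i \in S$. Let $n$ be minimal with this
property. Let $X \subseteq S$ be the set of points $i \in S$ such that $\alpha_i$ has 1 for its $n$th binary
digit. As $\mathcal{D}(\mathcal{H}_X) \le \mathcal{D}_h(\mathcal{H})$, where $\mathcal{H}_X = \{A \cap X : A \in \mathcal{H}\}$, there exist $\varepsilon_i = \pm 1$ for all $i \in X$ such that
\[ \Bigl| \sum_{i \in A \cap X} \varepsilon_i \Bigr| \le \mathcal{D}_h(\mathcal{H}) \]
for all $A \in \mathcal{H}$. Define approximations $\alpha_i^{(1)}$ ($i \in S$) by
\[ \alpha_i^{(1)} = \begin{cases} \alpha_i + \varepsilon_i \cdot 2^{-n} & \text{if } i \in X, \\ \alpha_i & \text{if } i \in S \setminus X. \end{cases} \]
For any $A \in \mathcal{H}$,
\[ \Bigl| \sum_{i \in A} (\alpha_i^{(1)} - \alpha_i) \Bigr| = \Bigl| \sum_{i \in A \cap X} 2^{-n} \varepsilon_i \Bigr| \le 2^{-n} \cdot \mathcal{D}_h(\mathcal{H}). \]
The values $\alpha_i^{(1)}$ have binary expansions of length at most $(n - 1)$. We repeat
this procedure (note that $X$ will be a different set), getting $\alpha_i^{(2)}$ with
\[ \Bigl| \sum_{i \in A} (\alpha_i^{(2)} - \alpha_i^{(1)}) \Bigr| \le 2^{-n+1} \cdot \mathcal{D}_h(\mathcal{H}) \]
for all $A \in \mathcal{H}$.

We apply this procedure $n$ times, finally reaching $\alpha_i^{(n)}$ with binary expansions of
length zero, i.e., $\alpha_i^{(n)} = 0$ or 1. Let $\varepsilon_i = 2\alpha_i^{(n)} - 1 \in \{-1, +1\}$. Then for all $A \in \mathcal{H}$, writing $\alpha_i^{(0)} = \alpha_i$,
\[ \Bigl| \sum_{i \in A} (\varepsilon_i - p_i) \Bigr| = 2 \Bigl| \sum_{i \in A} (\alpha_i^{(n)} - \alpha_i) \Bigr| \le 2 \sum_{j=0}^{n-1} \Bigl| \sum_{i \in A} (\alpha_i^{(j+1)} - \alpha_i^{(j)}) \Bigr| \le 2 \sum_{j=0}^{n-1} 2^{-n+j}\, \mathcal{D}_h(\mathcal{H}) < 2\,\mathcal{D}_h(\mathcal{H}), \]
as required. Finally, a compactness argument implies the truth of Theorem 3.1 for
arbitrary $p_1, \dots, p_n \in [-1, +1]$. □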
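
The digit-by-digit rounding in this proof can be sketched as follows. This Python sketch is illustrative only: it assumes the weights are dyadic rationals, finds the low-discrepancy coloring of each restriction by brute force (so it is usable only for very small ground sets), and the names `best_coloring`, `round_weights`, `hereditary_discrepancy` and the sample data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

LINES = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
         (2, 5, 7), (3, 4, 7), (3, 5, 6)]      # sample hypergraph (Fano plane)


def best_coloring(sets, X):
    """Brute force: signs on X minimising max_A |sum over A ∩ X of eps_i|."""
    best, best_eps = None, {}
    for signs in product((1, -1), repeat=len(X)):
        eps = dict(zip(X, signs))
        worst = max(abs(sum(eps[i] for i in A if i in eps)) for A in sets)
        if best is None or worst < best:
            best, best_eps = worst, eps
    return best, best_eps


def round_weights(sets, p, n_bits):
    """Round weights p_i in [-1, 1]; (1 + p_i)/2 must be a multiple of 2**-n_bits."""
    alpha = {i: (1 + Fraction(w)) / 2 for i, w in p.items()}
    for level in range(n_bits, 0, -1):
        unit = Fraction(1, 2 ** level)
        # X = points whose current last binary digit (at this level) is 1.
        X = [i for i in alpha if (alpha[i] / unit) % 2 == 1]
        _, eps = best_coloring(sets, X)
        for i in X:
            alpha[i] += eps[i] * unit          # removes the last digit
    return {i: int(2 * a - 1) for i, a in alpha.items()}


def hereditary_discrepancy(sets, points):
    return max(best_coloring(sets, list(X))[0]
               for k in range(len(points) + 1)
               for X in combinations(points, k))


points = list(range(1, 8))
p = {i: Fraction(w, 8) for i, w in zip(points, (1, -6, 4, 0, 5, -1, 2))}
eps = round_weights(LINES, p, n_bits=4)
worst = max(abs(sum(eps[i] - p[i] for i in A)) for A in LINES)
herdisc = hereditary_discrepancy(LINES, points)
```

The level-$\ell$ step changes each set sum of the $\alpha_i$ by at most $2^{-\ell}$ times the hereditary discrepancy, so `worst` stays below `2 * herdisc`.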


Observe that all the upper bounds in Theorems 2.1, 2.2, 2.6, 2.7 are valid in 
fact for the hereditary discrepancy. 


The discrepancy of a matrix. The concept of discrepancy can be expressed in
terms of the incidence matrix $M$ of the hypergraph $\mathcal{H}$:
\[ \mathcal{D}(\mathcal{H}) = \min_{\varepsilon \in \{-1,+1\}^S} \|M\varepsilon\|_\infty \]
and
\[ \mathcal{D}_i(\mathcal{H}) = \max_{p \in [-1,+1]^S} \ \min_{\varepsilon \in \{-1,+1\}^S} \|M(\varepsilon - p)\|_\infty. \]
Note that these definitions are meaningful for any matrix $M$. Therefore,
following Lovász et al. (1986), we can use the notation $\mathcal{D}(M)$ and $\mathcal{D}_i(M)$ for an
arbitrary matrix $M$. We can also generalize the hereditary version by letting
$\mathcal{D}_h(M)$ be the maximum of $\mathcal{D}(M')$ over all submatrices $M'$ of $M$.
Almost all of the previous results, most notably Theorems 2.2 and 2.8, extend
to matrices in a natural way. The following slight generalization of Theorem 2.2
also follows by the same argument.


Theorem 3.2. Assume that every square submatrix of a matrix $M$ has a row with
$\ell_1$-norm at most 1. Then $\mathcal{D}(M) \le 2$.


The above generalized versions of the notion of discrepancy may become easier
to grasp from the following nice geometric interpretation. Consider the set
\[ U_A = \{x \in \mathbb{R}^S : \|Ax\|_\infty \le 1\}, \]
i.e., the "unit ball" of the norm $\|Ax\|_\infty$. So $U_A$ is a convex polyhedron centrally
symmetric with respect to the origin. For $t > 0$, consider the convex set $t \cdot U_A$ and
let $U_1(t), U_2(t), \dots$ be the copies of $t \cdot U_A$ obtained by translating its center by all
$\pm 1$-vectors. Then

• $\mathcal{D}(A)$ is the least number $t$ for which some $U_i(t)$ contains the origin;

• $\mathcal{D}_i(A)$ is the least number $t$ for which the sets $U_i(t)$ cover the cube $[-1, 1]^S$;

• $\mathcal{D}_h(A)$ is the least number $t$ for which the center of each face $F$ of the cube
$[-1, 1]^S$ is contained in at least one of the sets $U_i(t)$ centered at the vertices of $F$.

Theorem 1.5 raises the question whether in general the discrepancy of a
hypergraph (or of a matrix) is related to the determinants of the submatrices of
the incidence matrix. In this direction there is a lower bound theorem from
Lovász et al. (1986).


Theorem 3.3. For any matrix $A$,
\[ \mathcal{D}_h(A) \ge \max_k \max_B |\det B|^{1/k}, \]
where $B$ ranges over all $k \times k$ submatrices of $A$.


Let us think of the rows of matrix $A$ as ordered by importance so that we may
wish to make the discrepancy in early rows extremely small, perhaps at the
expense of the later rows. The following result states that there is an approximation
which is extremely good with respect to the early rows and is reasonably good
with respect to all.


Theorem 3.4 (Beck and Spencer 1984b, Spencer 1985). Let $M = (m_{ij}) \in \mathbb{R}^{m \times n}$ be
a matrix with $|m_{ij}| \le 1$. Let $p_1, \dots, p_n \in [-1, +1]$.
(i) There exist $\varepsilon_1, \dots, \varepsilon_n \in \{-1, +1\}$ so that
\[ \Bigl| \sum_{j=1}^{n} m_{ij}(p_j - \varepsilon_j) \Bigr| < c\, i^{1/2} \quad \text{for all } i. \]
(ii) If the upper bound is relaxed to $2i$ then such $\varepsilon_j$ are polynomial time
computable.


Note that (i) of Theorem 3.4 is best possible apart from constant factor (this
again follows by considering Hadamard matrices). Part (ii) follows by applying
Theorem 3.2 (whose proof, just like the proof of Theorem 2.2, can be followed by
a polynomial time algorithm) to the matrix $(m_{ij}/i)$: every $k \times k$ submatrix of this
matrix contains a row of index $i \ge k$, whose $\ell_1$-norm is at most $k/i \le 1$.

In the particular case $m_{ij} \in \{0, 1\}$ and $p_j = 0$ we obtain the following.


Corollary 3.5. Let $Y_1, Y_2, Y_3, \dots, Y_M$ be a sequence of subsets of a finite set $X$.
(i) There exists a 2-coloring $f: X \to \{-1, +1\}$ so that
\[ \Bigl| \sum_{x \in Y_i} f(x) \Bigr| < c \cdot i^{1/2}, \quad 1 \le i \le M. \]
(ii) One can find in polynomial time a 2-coloring $f: X \to \{-1, +1\}$ so that
\[ \Bigl| \sum_{x \in Y_i} f(x) \Bigr| < 2i, \quad 1 \le i \le M. \]


Theorem 3.4 has some nice applications in a matrix balancing problem (see
Beck and Spencer 1983, 1989). Let an arbitrary matrix $A = (a_{ij})$, $1 \le i, j \le n$, be
given with all $a_{ij} \in \{-1, +1\}$. By a row shift we mean the act of replacing, for a
particular $i$, all coefficients $a_{ij}$ in the $i$th row by their negatives $(-a_{ij})$. The column
shift is defined similarly. A line shift means either a row or a column shift.
Consider the following solitaire game. The player applies a succession of line
shifts to the matrix $A$. His object is to make the absolute value of the sum of all
the coefficients of $A$ as small as possible. Let $\|A\|$ denote this minimum value.
Komlós and Sulyok (1970), resolving a conjecture of L. Moser, showed that if $n$ is
sufficiently large then $\|A\| \le 2$ may be achieved ($\|A\| \le 1$ if $n$ is odd). As an
illustration, we shall derive this result from (ii) of Theorem 3.4 in the case of even
$n$.


Theorem 3.6. Let $n \ge 2$ be an even integer. Given any $n \times n$ matrix $A = (a_{ij})$ with
all $a_{ij} \in \{-1, +1\}$, there exist $\delta_1, \dots, \delta_n, \varepsilon_1, \dots, \varepsilon_n \in \{-1, +1\}$ so that
\[ \Bigl| \sum_{i=1}^{n} \sum_{j=1}^{n} \delta_i \varepsilon_j a_{ij} \Bigr| \le 2. \]


Proof. It follows from (ii) of Theorem 3.4 that there exist column shifts $\varepsilon_j$ so that
the new row sums $r_i$ satisfy $|r_i| < 2i$, $1 \le i \le n$. For simplicity of notation let us
then apply row shifts so that all row sums are nonnegative. Since all $r_i$ are even
integers we have $r_1 = 0$, and, in general, $0 \le r_i \le 2i - 2$.

We now describe a simple technique that will give the final row shifts. Let
$s_1, \dots, s_n$ be nonnegative integers and let $T$ be a positive integer such that $s_1 \le T$
and for $1 \le i \le n - 1$,
\[ s_{i+1} \le s_1 + \cdots + s_i + T. \]
Then there exist $\delta_1, \dots, \delta_n = \pm 1$, so that
\[ |\delta_1 s_1 + \cdots + \delta_n s_n| \le T. \]

We can find such $\delta$ by reverse induction. Set $\delta_n = +1$. Having found $\delta_n,
\delta_{n-1}, \dots, \delta_{i+1}$, we choose $\delta_i = \pm 1$ so as to minimize the absolute value of the
partial sum $\delta_n s_n + \cdots + \delta_{i+1} s_{i+1} + \delta_i s_i$. We shall call this method the greedy
technique for the remainder of the proof.
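
The greedy technique is short enough to state as code. The Python sketch below is illustrative only; the function name `greedy_signs` and the sample data are hypothetical. It checks the hypotheses $s_1 \le T$ and $s_{i+1} \le s_1 + \cdots + s_i + T$ and then chooses the signs from the last index down to the first, each time pushing the running sum towards zero.

```python
def greedy_signs(s, T):
    """Choose delta_i = +-1, from the last index down, with |sum delta_i s_i| <= T."""
    n = len(s)
    if s[0] > T or any(s[i + 1] > sum(s[:i + 1]) + T for i in range(n - 1)):
        raise ValueError("hypotheses of the greedy technique are violated")
    delta = [0] * n
    partial = 0
    for i in range(n - 1, -1, -1):
        # Opposite sign to the running sum minimises |partial + delta_i * s_i|;
        # at the first step partial == 0, which gives delta_n = +1.
        delta[i] = 1 if partial <= 0 else -1
        partial += delta[i] * s[i]
    return delta, partial


row_sums = [2, 0, 2, 4, 6, 8]
delta, total = greedy_signs(row_sums, T=2)
```

The invariant behind the bound is that after index $i$ has been treated the running sum is at most $s_1 + \cdots + s_{i-1} + T$ in absolute value, which becomes $T$ when $i = 1$.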

We may not immediately apply the greedy technique because we may have too
many $r_i = 0$ and thereby $T$ large. Reorder the rows in increasing order of row
sums. We then still have $0 \le r_i \le 2i - 2$. Suppose the first $u$ rows have sum zero
and the next $v$ rows have sum two. If $u = 1$ we may simply apply the greedy
technique so we shall assume $u > 1$. Let $r'_i$ be the new absolute value of the $i$th row
sum after a single column is shifted. For the first $u$ rows $r'_i = 2$ regardless of which
column is shifted. For the next $v$ rows $r'_i = 0$ for $(n/2) + 1$ of the possible column
shifts, these being the cases when an entry $+1$ is switched to $-1$, and $r'_i = 4$ for the
remaining $(n/2) - 1$ column shifts. Thus the average value of $r'_i$, taken over all $n$
possible column shifts, is $2 - (4/n)$. Now we conclude that the average value of
$r'_{u+1} + \cdots + r'_{u+v}$ is $v(2 - (4/n))$. If $v \ge n/2$ then the greedy technique trivially
works and hence we may assume $v < n/2$. Thus $v(2 - (4/n)) > 2v - 2$. Since this is
the average, there must be one specific column change so that

(1) $r'_{u+1} + \cdots + r'_{u+v} \ge 2v$. We also have

(2) $r'_1 = \cdots = r'_u = 2$, and

(3) $0 < r'_i \le 2i$ for $i > u + v$.

We observe that $r'_1 + \cdots + r'_{u+v} \ge 2(u + v)$ and $r'_i \ge 2$ for $i > u + v$ since $r_i \ge 4$.
Hence
\[ r'_1 + \cdots + r'_i + 2 \ge 2i + 2 \ge r'_{i+1} \]
for all $i \ge u + v$.
Trivially
\[ r'_1 + \cdots + r'_i + 2 \ge 4 \ge r'_{i+1} \]
when $1 \le i < u + v$. Thus we may apply the greedy technique to the row sums
$r'_1, \dots, r'_n$, completing the proof. □


Applying the stronger relation (i) of Theorem 3.4, one can prove the following
general result (see Beck and Spencer 1989).


Theorem 3.7. There exists a constant $c > 0$ such that for every $m \times n$ matrix
$A = (a_{ij})$ with all $|a_{ij}| \le 1$ there exist $\delta \in \{-1, 1\}^m$ and $\varepsilon \in \{-1, 1\}^n$ such that
\[ \Bigl| \sum_i \sum_j \delta_i \varepsilon_j a_{ij} \Bigr| \le c. \]


4. Vector-sums 


We have seen a geometric interpretation of discrepancy problems in the row space 
of the corresponding matrix. Now we consider the space of the column vectors, 
which leads to several new and interesting questions. In fact the investigation of 
value-distributions of vector-sums developed earlier and independently of hy- 
pergraph coloring problems or of discrepancy theory. 

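Let $M = (v_1, \dots, v_n)$, $v_i \in \mathbb{R}^m$ for $1 \le i \le n$. Let further $\|\cdot\|$ and $\|\cdot\|'$ denote
two arbitrary norms in $\mathbb{R}^m$.

We define the discrepancy (relative to the two norms) by
\[ \mathcal{D}(M; \|\cdot\|, \|\cdot\|') = \frac{\min_{\varepsilon \in \{-1,1\}^n} \bigl\| \sum_{j=1}^{n} \varepsilon_j v_j \bigr\|}{\max_{1 \le i \le n} \|v_i\|'} \]
and
\[ \mathcal{D}(\|\cdot\|, \|\cdot\|') = \max_M \mathcal{D}(M; \|\cdot\|, \|\cdot\|'). \]
Note that for any matrix $M$ and norm $\|\cdot\|'$,
\[ \mathcal{D}(M) = \mathcal{D}(M; \ell_\infty, \|\cdot\|') \cdot \max_i \|v_i\|'. \]

The case when $\|\cdot\|$ was the $\ell_2$ norm also came up briefly in Theorem 2.8.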

Already in 1963 it was asked by Dworetzky what $\mathcal{D}(\|\cdot\|, \|\cdot\|)$ equals for a given
norm. The more general question (where the two norms are not necessarily the
same) was formulated first in Bárány and Grinberg (1981), who gave the
following general upper bound for Dworetzky's problem.


Theorem 4.1 (Bárány and Grinberg 1981). For an arbitrary norm $\|\cdot\|$ in $\mathbb{R}^m$,
\[ \mathcal{D}(\|\cdot\|, \|\cdot\|) \le m. \]


This is sharp when $\|\cdot\|$ is the $\ell_1$ norm.

Now let us consider the special cases when $\|\cdot\|$ and $\|\cdot\|'$ are one of the three
most important norms: the $\ell_1$ norm, the $\ell_2$ norm or the $\ell_\infty$ norm. Theorem 2.1 has
the following generalization in this setting.


Theorem 4.2 (Spencer 1985). $\mathcal{D}(\ell_\infty, \ell_\infty) < 6\sqrt{m}$.


(Observe that the upper bounds in Theorems 4.1 and 4.2 depend only on the
dimension!) Theorem 2.2 is also valid in this more general form.


Theorem 4.3 (Beck and Fiala 1981). $\mathcal{D}(\ell_\infty, \ell_1) < 2$.


Grinberg observed that for any $M$ in $\mathbb{R}^m$,
\[ \mathcal{D}(\ell_2, \ell_2) \le \sqrt{m}. \]


This is sharp. Indeed, consider $m$ pairwise orthogonal unit vectors $e_1, \dots, e_m$
in $\mathbb{R}^m$. Then $\bigl\| \sum_i \varepsilon_i e_i \bigr\|_2 = m^{1/2}$ for any choice of $\varepsilon \in \{-1, 1\}^m$.

All but one of the remaining cases are trivial or easy consequences of the above
ones. The only nontrivial case is when $\|\cdot\| = \ell_\infty$ and $\|\cdot\|' = \ell_2$. In that case nothing
nontrivial is known. The conjecture of Komlós refers to this case.


Conjecture 4.4 (Komlós). There exists an absolute constant $c$ such that
\[ \mathcal{D}(\ell_\infty, \ell_2) \le c. \]
The Komlós conjecture implies the Beck–Fiala conjecture 2.4 for set-systems.

Partial sums. The problems we considered in the preceding paragraphs are of
static character. The dynamic version is when we color the points one by one and
we would like to have a "good" coloring at each stage. This formulation also
allows us to study problems in which the underlying set $S$ is infinite. The following
theorems are of this "dynamic" character.


Theorem 4.5 (Bárány and Grinberg 1981). Let $v_1, v_2, \dots, v_n$ be $n$ vectors in $\mathbb{R}^m$
with $\|v_i\| \le 1$, where $\|\cdot\|$ is any norm in $\mathbb{R}^m$. Then there exists a sequence
$\varepsilon_1, \dots, \varepsilon_n$, $\varepsilon_i \in \{-1, +1\}$, so that
\[ \Bigl\| \sum_{i=1}^{t} \varepsilon_i v_i \Bigr\| \le 2m, \quad \text{for } t = 1, 2, \dots, n. \]


It is conjectured that if $\|\cdot\| = \ell_2$ or $\ell_\infty$, then in this theorem, $2m$ can be replaced
by $K\sqrt{m}$. For the $\ell_\infty$ norm and if $m = n$, Spencer proved this conjecture.


Theorem 4.6 (Spencer 1986). For any sequence $v_1, \dots, v_m$ of vectors in $\mathbb{R}^m$ with
$\|v_i\|_\infty \le 1$, there exists a sequence $\varepsilon_1, \dots, \varepsilon_m$, $\varepsilon_i \in \{+1, -1\}$, so that
\[ \Bigl\| \sum_{i=1}^{t} \varepsilon_i v_i \Bigr\|_\infty \le K\sqrt{m} \quad \text{for } t = 1, \dots, m. \]


An infinite-dimensional version of Theorem 4.5 is the following.


Theorem 4.7 (Beck 1990). Let $v_1, v_2, v_3, \dots$ be infinite-dimensional vectors
satisfying $\|v_i\|_\infty \le 1$. Then there exist $\varepsilon_1, \varepsilon_2, \varepsilon_3, \dots$; $\varepsilon_i \in \{-1, +1\}$ so that
\[ \Bigl| \Bigl( \sum_{i=1}^{t} \varepsilon_i v_i \Bigr)_j \Bigr| < j^{4+o(1)} \]
for all $j$ and $t$. Here $(v)_j$ stands for the $j$th coordinate of the vector $v$.


Permutation of vectors. Instead of flipping the sign of vectors, we may achieve
that all partial sums be small just by rearranging them. In fact, the two kinds of
problems are strongly related as the following "transference lemma" of Chobanyan
shows.


Theorem 4.8. Let $v_1, \dots, v_n \in \mathbb{R}^m$ with $v_1 + v_2 + \cdots + v_n = 0$, and let $\|\cdot\|$ be an
arbitrary norm in $\mathbb{R}^m$. Suppose that for every permutation $\pi = (i_1, i_2, \dots, i_n)$ of
$\{1, 2, \dots, n\}$ there exist $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n \in \{-1, +1\}$ (depending on $\pi$) such that
\[ \max_{1 \le t \le n} \Bigl\| \sum_{j=1}^{t} \varepsilon_j v_{i_j} \Bigr\| \le A. \]
Then there is a permutation $\pi^* = (l_1, l_2, \dots, l_n)$ of $\{1, 2, \dots, n\}$ such that
\[ \max_{1 \le t \le n} \Bigl\| \sum_{j=1}^{t} v_{l_j} \Bigr\| \le A. \]
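
Proof. Let
\[ B = \min_{(i_1, \dots, i_n)} \ \max_{1 \le t \le n} \Bigl\| \sum_{j=1}^{t} v_{i_j} \Bigr\|. \]
We have to show that $B \le A$. Let $\pi^* = (l_1, l_2, \dots, l_n)$ denote the permutation
where the minimum is attained. By the hypothesis of the theorem, there exist
$\varepsilon^*_1, \dots, \varepsilon^*_n \in \{-1, +1\}$ such that
\[ \Bigl\| \sum_{j=1}^{t} \varepsilon^*_j v_{l_j} \Bigr\| \le A \quad \text{for all } 1 \le t \le n. \]
Let
\[ M^+ = \{1 \le j \le n : \varepsilon^*_j = +1\}, \qquad M^- = \{1 \le j \le n : \varepsilon^*_j = -1\}. \]
We have
\[ \sum_{j=1}^{t} v_{l_j} + \sum_{j=1}^{t} \varepsilon^*_j v_{l_j} = 2 \sum_{\substack{j \in M^+ \\ 1 \le j \le t}} v_{l_j} \]
and
\[ \sum_{j=1}^{t} v_{l_j} - \sum_{j=1}^{t} \varepsilon^*_j v_{l_j} = 2 \sum_{\substack{j \in M^- \\ 1 \le j \le t}} v_{l_j}. \]
Hence
\[ \Bigl\| \sum_{\substack{j \in M^+ \\ 1 \le j \le t}} v_{l_j} \Bigr\| \le \frac{A + B}{2} \]
and
\[ \Bigl\| \sum_{\substack{j \in M^- \\ 1 \le j \le t}} v_{l_j} \Bigr\| \le \frac{A + B}{2}. \]

Setting $M^+ = \{p_1 < p_2 < \cdots < p_r\}$ and $M^- = \{q_1 < q_2 < \cdots < q_s\}$, we define
the permutation
\[ \pi^{**} = (l_{p_1}, l_{p_2}, \dots, l_{p_r}, l_{q_s}, l_{q_{s-1}}, \dots, l_{q_1}), \]
which we also denote by $(h_1, \dots, h_n)$. It follows from the assumption $v_1 + v_2 + \cdots
+ v_n = 0$ that
\[ \max_{1 \le t \le n} \Bigl\| \sum_{j=1}^{t} v_{h_j} \Bigr\| \le \frac{A + B}{2}. \]
Since $B$ was the minimum, we must have $B \le (A + B)/2$, and the desired
inequality $B \le A$ follows. □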


Combining Theorems 4.5 and 4.7 with Chobanyan's transference lemma, we
get the following result.


Corollary 4.9 (Bárány and Grinberg 1981). Let $v_1, \dots, v_n$ be $n$ vectors in $\mathbb{R}^m$ with
$\|v_i\| \le 1$ where $\|\cdot\|$ is any norm in $\mathbb{R}^m$. Assume that $v_1 + v_2 + \cdots + v_n = 0$. Then
there exists a permutation $v_{l_1}, v_{l_2}, \dots, v_{l_n}$ of the vectors $v_i$ such that
\[ \max_{1 \le t \le n} \Bigl\| \sum_{k=1}^{t} v_{l_k} \Bigr\| \le 2m. \]

Corollary 4.10. Let $v_1, \dots, v_n$ be infinite-dimensional vectors satisfying $\|v_i\|_\infty \le 1$
($1 \le i \le n$) and $v_1 + v_2 + \cdots + v_n = 0$. Then there exists a permutation $v_{l_1},
v_{l_2}, \dots, v_{l_n}$ of the vectors $v_i$ such that
\[ \max_{1 \le t \le n} \Bigl| \Bigl( \sum_{k=1}^{t} v_{l_k} \Bigr)_j \Bigr| < j^{4+o(1)} \]
for all $j \ge 1$.


5. Arithmetic progressions 


A structure whose discrepancy properties have been extensively investigated is 
the family of arithmetic progressions. We have seen Roth’s theorem 1.2 and Van 
der Waerden’s theorem 1.3, showing the two sides of (qualitatively) the same 
phenomenon: If we focus on the short arithmetic progressions, we get a 
monochromatic one; if we focus on longer arithmetic progressions, a weaker 
preponderance phenomenon (large discrepancy) can be asserted. 

Van der Waerden’s theorem and related results on arithmetic progressions are 
discussed in chapter 25. Here we treat the ramifications of Roth’s theorem 1.2. 
Let us reformulate it in the language introduced above. Let $\mathcal{H}_n$ denote the
hypergraph formed by the arithmetic progressions in $\{1, \dots, n\}$. Then


Theorem 5.1. $\mathcal{D}(\mathcal{H}_n) > c\,n^{1/4}$.


Proof. Let $k = \lfloor \sqrt{n/6} \rfloor$. We show that the arithmetic progression can be chosen of
length $k$ and of difference at most $6k$. Let us allow, however, also "wrapped"
arithmetic progressions, i.e., subsets of $\{1, \dots, n\}$ that arise from an arithmetic
progression of length $k$ and difference at most $6k$ by reduction modulo $n$. By the
choice of $k$, every "wrapped" progression is the union of two "proper" arithmetic
progressions, and hence it suffices to prove that if $\mathcal{H}$ is the hypergraph formed by
"wrapped" arithmetic progressions, then $\mathcal{H}$ has discrepancy at least $c\,n^{1/4}$. Note
that $m = |\mathcal{H}| = 6kn$.

Let $M$ be the incidence matrix of $\mathcal{H}$. By Theorem 2.8(i), it suffices to estimate
$\lambda_{\min}(M^T M)$ from below.
Now the matrix $M^T M$ is a circulant (this is where wrapping is needed!), and
hence we know that its eigenvectors are $(1, \varepsilon, \varepsilon^2, \dots, \varepsilon^{n-1})^T$, where $\varepsilon$ is an $n$th
root of unity. The corresponding eigenvalues are
\[ \lambda(\varepsilon) = \frac{1}{n} \sum_{A \in \mathcal{H}} \Bigl| \sum_{j \in A} \varepsilon^j \Bigr|^2. \]
Note that for each arithmetic progression $A$, there are $n - 1$ others (its translates)
that give the same contribution. So we may just select arithmetic progressions
starting at 0:
\[ \lambda(\varepsilon) = \sum_{d=1}^{6k} \Bigl| \sum_{t=0}^{k-1} \varepsilon^{td} \Bigr|^2. \]
By the pigeon hole principle we can find a $d_0$, $1 \le d_0 \le 6k$, such that
\[ -\pi/(3k) < \arg(\varepsilon^{d_0}) < \pi/(3k). \]
Then $\mathrm{Re}\, \varepsilon^{t d_0} \ge \tfrac{1}{2}$ for $1 \le t \le k - 1$, and hence
\[ \lambda(\varepsilon) \ge \Bigl| \sum_{t=0}^{k-1} \varepsilon^{t d_0} \Bigr|^2 \ge \Bigl( \mathrm{Re} \sum_{t=0}^{k-1} \varepsilon^{t d_0} \Bigr)^2 \ge \frac{k^2}{4}. \]
Thus
\[ \mathcal{D}(\mathcal{H}) \ge \frac{1}{\sqrt{m}}\, \mathcal{D}_2(\mathcal{H}) \ge \Bigl( \frac{n}{m}\, \lambda_{\min}(M^T M) \Bigr)^{1/2} \ge \Bigl( \frac{k}{24} \Bigr)^{1/2}, \]
which is of order $n^{1/4}$. □
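
The eigenvalue estimate in this proof can be checked numerically. The Python sketch below is illustrative only and rests on an assumption about normalization: it takes the eigenvalue of $M^T M$ at an $n$th root of unity $\varepsilon$ to be $\lambda(\varepsilon) = \sum_{d=1}^{6k} |\sum_{t=0}^{k-1} \varepsilon^{td}|^2$ with $k = \lfloor\sqrt{n/6}\rfloor$, and compares the smallest value with $k^2/4$. The function name `min_eigenvalue` is hypothetical.

```python
import cmath
from math import isqrt


def min_eigenvalue(n):
    """Return (k, min over n-th roots of unity of sum_d |sum_t eps^(t*d)|^2)."""
    k = isqrt(n // 6)                       # floor(sqrt(n / 6))
    smallest = None
    for r in range(n):                      # eps = exp(2*pi*i*r/n)
        lam = 0.0
        for d in range(1, 6 * k + 1):
            z = cmath.exp(2j * cmath.pi * r * d / n)
            geometric = sum(z ** t for t in range(k))
            lam += abs(geometric) ** 2
        smallest = lam if smallest is None else min(smallest, lam)
    return k, smallest


k, lam_min = min_eigenvalue(600)
```

For $n = 600$ this gives $k = 10$, and the pigeonhole argument guarantees `lam_min >= k * k / 4`, up to floating-point rounding.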


Note that we have actually proved a stronger, $\ell_2$ norm version. This gives the
following information about the difference $d$ of the arithmetic progressions of
large discrepancy.


Corollary 5.2 (Roth). Given any 2-coloring $f: \mathbb{N} \to \{-1, +1\}$ of the natural
numbers, for infinitely many values of $d$, there is an arithmetic progression
$P = P(d)$ of difference $d$ such that
\[ \Bigl| \sum_{k \in P(d)} f(k) \Bigr| > c\sqrt{d}. \]


Roth conjectured that the exponent $\tfrac{1}{4}$ of $n$ in Theorem 1.2 can be improved to
$\tfrac{1}{2}$ (which corresponds to the random 2-coloring). This was disproved by Sárközy
(1973). Beck (1981b) proved that Roth's lower bound is nearly sharp, by a
combinatorial argument based on Theorem 2.5.

Theorem 5.3.* $\mathcal{D}(\mathcal{H}_n) < c \cdot n^{1/4} \cdot (\log n)^{5/2}$.


Proof. For integers satisfying $i \le j$, let
\[ AP(a, d, i, j) = \{a + k \cdot d : i \le k \le j\}, \]
i.e., $AP(a, d, i, j)$ denotes the arithmetic progression with difference $d$, starting
from $(a + i \cdot d)$ and terminating at $(a + j \cdot d)$. We shall say that an arithmetic
progression is special if it is of the type
\[ AP(b, d, i \cdot 2^s, (i + 1) \cdot 2^s - 1), \]
where $d \ge 1$, $1 \le b \le d$, $i \ge 0$ and $s \ge 0$. Let $\mathcal{H}^*_n$ denote the family of special
arithmetic progressions contained in $\{1, 2, \dots, n\}$. By definition,
\[ \Delta(\{A \in \mathcal{H}^*_n : |A| \ge M\}) = \max_{1 \le k \le n} |\{A \in \mathcal{H}^*_n : |A| \ge M \text{ and } k \in A\}| \le \max_{1 \le k \le n} \sum_{1 \le d \le \frac{n-1}{M-1}} \ \sum_{\substack{1 \le b \le d \\ b \equiv k \ (\mathrm{mod}\ d)}} \ \sum_{s:\ M \le 2^s \le n/d} 1. \]
Simple calculation shows that the innermost sum is at most $c \cdot \log(n/(d \cdot M))$. It
follows that
\[ \Delta(\{A \in \mathcal{H}^*_n : |A| \ge M\}) \le c \cdot \max_{1 \le k \le n} \sum_{1 \le d \le \frac{n-1}{M-1}} \ \sum_{\substack{1 \le b \le d \\ b \equiv k \ (\mathrm{mod}\ d)}} \log \frac{n}{d \cdot M}. \]

Now we apply Theorem 2.5 to $\mathcal{H}^*_n$ with $M = K = \lceil (c_1 n)^{1/2} \rceil$. Then we obtain
\[ \mathcal{D}(\mathcal{H}^*_n) < c_2 \cdot n^{1/4} \cdot (\log n)^{3/2}. \]
We claim that
\[ \mathcal{D}(\mathcal{H}_n) \le (2 \log_2 n) \cdot \mathcal{D}(\mathcal{H}^*_n). \]

To see this, first observe that any arithmetic progression $a, a + d, \dots, a + l \cdot d$ in
$[1, n]$ is representable in the form
\[ AP(b, d, 0, p_1) \setminus AP(b, d, 0, p_2), \]
where $a = b + (p_2 + 1)d$, $1 \le b \le d$ and $p_1 = p_2 + 1 + l$. Moreover, both
$AP(b, d, 0, p_\delta)$ ($\delta = 1, 2$) are disjoint unions of not more than $\log_2 n$ special
arithmetic progressions, i.e., elements of $\mathcal{H}^*_n$. Hence the "best" 2-coloring of $\mathcal{H}^*_n$
gives a 2-coloring of $\mathcal{H}_n$ with discrepancy at most $(2 \log_2 n)$ times as large. □


* Very recently Matoušek and Spencer (1994) cancelled the factor $(\log n)^{5/2}$. The new idea is a clever
application of a lemma of Haussler. See also Matoušek (1994).
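
The decomposition of an initial segment $AP(b, d, 0, p)$ into special progressions follows the binary expansion of its length $p + 1$. The Python sketch below is illustrative only; the function names `special_blocks` and `special_progressions` are hypothetical.

```python
def special_blocks(p):
    """Split the index range 0..p into blocks [i*2**s, (i+1)*2**s - 1].

    Returns the pairs (i, s), one per binary digit 1 of p + 1.
    """
    blocks, start, length = [], 0, p + 1
    for s in range(length.bit_length() - 1, -1, -1):
        if length & (1 << s):
            # `start` is a sum of higher powers of two, hence a multiple of 2**s.
            blocks.append((start >> s, s))
            start += 1 << s
    return blocks


def special_progressions(b, d, p):
    """The special progressions AP(b, d, i*2**s, (i+1)*2**s - 1) covering AP(b, d, 0, p)."""
    return [[b + t * d for t in range(i << s, (i + 1) << s)]
            for i, s in special_blocks(p)]


pieces = special_progressions(b=3, d=5, p=12)
flat = [x for piece in pieces for x in piece]
```

The pieces are pairwise disjoint, their union is the whole initial segment, and their number is the number of binary digits 1 of $p + 1$, hence at most $\lfloor \log_2 (p + 1) \rfloor + 1$.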


The following result is a sort of converse of Corollary 5.2.


Theorem 5.4 (Beck and Spencer 1984a). Let $n$ be a positive integer. Then there
exists a 2-coloring $f: \mathbb{N} \to \{-1, +1\}$ of the natural numbers such that for any
arithmetic progression $P = P(d) = \{a, a + d, a + 2d, \dots\}$ of difference $d \le n$ and of
arbitrary length,
\[ \Bigl| \sum_{k \in P(d)} f(k) \Bigr| < c \cdot \sqrt{d} \cdot (\log n)^{3.5} \quad (1 \le d \le n). \]


Unfortunately, we cannot prove that Theorem 5.4 is true with the right-hand
side replaced by $d^{(1/2)+o(1)}$. As an upper bound depending only on the difference
$d$ of the progression, the weaker estimate $d^{8+o(1)}$ immediately follows from
Theorem 4.7.

There is still no answer to the following old conjecture of P. Erdős (worth
$500).


Conjecture 5.5. For any $f: \mathbb{N} \to \{-1, +1\}$ and for every constant $C$ there are a $d$
and $n$ so that
\[ \Bigl| \sum_{k=1}^{n} f(k \cdot d) \Bigr| > C. \]


In other words, the family of arithmetic progressions with first term 0 has
unbounded discrepancy.


6. Measure theoretic discrepancy 


We find the roots of discrepancy theory in number theory, in the theory of 
uniformly distributed sequences, and we give a brief introduction to this theory. 
(For the general theory of uniformly distributed sequences see the book of 
Kuipers and Niederreiter 1974.) 

The field originated with the celebrated paper of Weyl (1916), which was
intended to furnish a deeper understanding of the results in diophantine
approximation and to generalize some basic results in this field. At the beginning
of this century, due to the work of Ostrowski, Hecke, Hardy, Littlewood, and
others, it became clear that the approximability properties of an irrational $\alpha$ by
rationals depend on the partial quotients (the "digits" $a_k$) in its continued
fraction expansion.

It also became clear that the approximability property of $\alpha$ is closely related to
the distribution of the sequence $(\{n\alpha\})$ in $[0, 1)$. ($\{x\}$ stands for the fractional
part of the real number $x$.)

For every irrational $\alpha$, the sequence $(\{n\alpha\})$ is everywhere dense in $[0, 1)$. The
fact that it is uniformly distributed expresses a stronger property. Let us give the
definition for arbitrary dimension.

Let $\omega = (u_n)$, $n \in \mathbb{N}$ be a sequence in the $k$-dimensional unit cube $[0, 1)^k$. Let
$B(a, b) = \prod_{i=1}^{k} [a_i, b_i)$ be an aligned box in $[0, 1)^k$, and $\mathcal{B}_k$ the family of all such
boxes. $Z(B; N)$ will denote the number of $n$ with $u_n \in B$, $1 \le n \le N$. Let $R([0, 1]^k)$
denote the set of Riemann-integrable functions on $[0, 1]^k$.


Definition 6.1. The sequence $\omega = (u_n)$, $n \in \mathbb{N}$ is said to be uniformly distributed in
$[0, 1)^k$ if for every aligned box $B \subseteq [0, 1)^k$
\[ \lim_{N \to \infty} \frac{1}{N} Z(B; N) = \mu(B) \]
(here $\mu$ stands for the usual $k$-dimensional Lebesgue measure). Note that it
would suffice to consider only boxes $B(b) = B(0, b)$, since the characteristic
function of every other box can be obtained by adding and subtracting the
characteristic functions of at most $2^k$ of these special boxes.


Equivalent definitions are given by the following.


Theorem 6.2. For a sequence $(u_n)$ in $[0, 1)^k$, the following are equivalent:
(i) $(u_n)$ is uniformly distributed in $[0, 1)^k$.
(ii) For every $f \in R([0, 1]^k)$,
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(u_n) = \int_{[0,1]^k} f(x)\, dx. \]
(iii) (Weyl's criterion) For every integer point $z \in \mathbb{Z}^k \setminus \{0\}$,
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i\, z \cdot u_n} = 0. \]


Condition (ii) indicates why uniformly distributed sequences are important in
the theory of numerical integration. Observe that we obtain an equivalent
condition if we assume that (ii) holds for a dense subset of $R([0, 1]^k)$, and Weyl's
criterion is obtained by postulating (ii) for the functions $e^{2\pi i\, z \cdot x}$ (and consequently
for all linear combinations of these). (This also suggests how the concept of
uniformly distributed sequences can be generalized to topological groups.)

As an illustration, we derive from Weyl's criterion the following improvement
on the classical Kronecker theorem.


Corollary 6.3 (Weyl 1916). Suppose that $\alpha_1, \dots, \alpha_k$ are real numbers such that 1,
$\alpha_1, \dots, \alpha_k$ are linearly independent over the rationals. Then the sequence
\[ u_n = (\{n\alpha_1\}, \dots, \{n\alpha_k\}), \quad n \in \mathbb{N} \]
is uniformly distributed in $[0, 1)^k$.


Proof. Let $m = (m_1, \dots, m_k) \in \mathbb{Z}^k \setminus \{0\}$. Then
\[ \sum_{n=1}^{N} e^{2\pi i\, m \cdot u_n} = \sum_{n=1}^{N} e^{2\pi i n \gamma}, \]
where $\gamma = m_1\alpha_1 + m_2\alpha_2 + \cdots + m_k\alpha_k$. Observe that $\gamma$ is irrational by the
hypothesis, and hence $e^{2\pi i \gamma} \ne 1$. Therefore
\[ \Bigl| \sum_{n=1}^{N} e^{2\pi i n \gamma} \Bigr| = \Bigl| \frac{e^{2\pi i (N+1)\gamma} - e^{2\pi i \gamma}}{e^{2\pi i \gamma} - 1} \Bigr| \le \frac{2}{|1 - e^{2\pi i \gamma}|} = O(1). \]
Thus Weyl's criterion is satisfied. □
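
The boundedness of the exponential sums in this proof is easy to observe numerically. The Python sketch below is illustrative only; it takes $\gamma = \sqrt{2}$ as a sample irrational, compares the largest partial sum with the geometric-series bound $2/|1 - e^{2\pi i\gamma}|$, and uses the hypothetical function name `max_partial_sum`.

```python
import cmath
import math


def max_partial_sum(gamma, N):
    """Largest |sum_{n=1}^{t} exp(2*pi*i*n*gamma)| over t = 1, ..., N."""
    total, worst = 0j, 0.0
    for n in range(1, N + 1):
        total += cmath.exp(2j * math.pi * n * gamma)
        worst = max(worst, abs(total))
    return worst


gamma = math.sqrt(2)
bound = 2 / abs(1 - cmath.exp(2j * math.pi * gamma))
worst = max_partial_sum(gamma, 5000)
```

Since `worst` stays below `bound` for every $N$, the averages $\frac{1}{N}\sum_{n \le N} e^{2\pi i n\gamma}$ tend to 0, which is what Weyl's criterion requires.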


It is easy to see that the sequence $\omega = (u_n)$, $n \in \mathbb{N}$ is uniformly distributed in
$[0, 1)^k$ iff
\[ \sup_{\substack{B \subseteq [0,1)^k \\ \text{aligned box}}} |Z(B; N) - N \cdot \mu(B)| = o(N). \]

But how small can $o(N)$ be? To handle this question, put
\[ \mathcal{D}_N(B) = Z(B; N) - N|B|, \]
\[ \mathcal{D}_N^\infty = \sup_{\substack{B \subseteq [0,1]^k \\ \text{aligned box}}} |\mathcal{D}_N(B)|, \]
and
\[ \mathcal{D}_N^p = \Bigl( \int_{[0,1]^k} |\mathcal{D}_N(B(0, x))|^p \, dx \Bigr)^{1/p}. \]
(Warning: this is not the $p$th power of $\mathcal{D}_N$.)


$\mathcal{D}_N^\infty$ and $\mathcal{D}_N^p$ measure (in different norms) the discrepancy of the sequence
$u_1, \dots, u_N$, and their behavior for $N \to \infty$ measures the irregularity of the
distribution of the infinite sequence $(u_n)$. In the quantitative theory of uniform
distribution, a central problem is the investigation of the order of magnitude of
the discrepancy functions $\mathcal{D}_N^\infty$ and $\mathcal{D}_N^p$.

The quantitative theory started with the conjecture of Van der Corput (1935a),
asserting that for an arbitrary sequence in $[0, 1)$, $\sup_N \mathcal{D}_N^\infty = \infty$. This was proved by
Van Aardenne-Ehrenfest (1945) who showed that for an arbitrary sequence $(u_n)$
for infinitely many $N$,
\[ \mathcal{D}_N^\infty > c\,(\log\log N)(\log\log\log N)^{-1}. \]
Roth (1954) strengthened this result.


Theorem 6.4. (i) For an arbitrary infinite sequence $(u_n)$ in $[0, 1)^k$ and for every
$N > N_0$,
\[ \max_{1 \le n \le N} \mathcal{D}_n^2 > c_k (\log N)^{k/2}. \]

(ii) For $N$ arbitrary points $u_1, \dots, u_N$ in $[0, 1]^k$,
\[ \mathcal{D}_N^2 > c'_k (\log N)^{(k-1)/2}. \]

(Here $c_k$, $c'_k$ are positive constants depending only on $k$.)

For $k = 2$ (Davenport 1956) and for $k = 3$ (Roth 1979, 1980) it is proved that
(apart from a multiplicative constant) these results are sharp.

The problem of finding bounds for the discrepancy in the supremum norm is
more difficult. Since $\mathcal{D}_N^\infty \ge \mathcal{D}_N^2$, the preceding results give some lower bounds on
$\mathcal{D}_N^\infty$. For infinite sequences sharp results are known only for $k = 1$, for finite
sequences for $k = 2$; the latter is a reformulation of Theorem 1.1.


Theorem 6.5 (Schmidt 1972). (i) For an arbitrary infinite sequence $(u_n)$ in $[0, 1)$
and for every $N \ge 2$,
\[ \max_{1 \le n \le N} \mathcal{D}_n^\infty > c \log N. \]
(ii) For arbitrary $N$ points $U = \{u_1, \dots, u_N\} \subset [0, 1)^2$,
\[ \mathcal{D}_N^\infty > c' \log N. \]

(Here $c$, $c'$ are positive absolute constants.)

This result is best possible apart from the multiplicative constant. If $u_n = \{n\alpha\}$
where $\alpha$ is an irrational number of bounded partial quotients ($a_k \le K$, $k =
1, 2, \dots$), then for every $N$, $\mathcal{D}_N^\infty < c_1 \log N$. Similarly, for the $N$ points $u_n =
(\{n\alpha\}, n/N)$ ($1 \le n \le N$) in $[0, 1]^2$, $\mathcal{D}_N^\infty < c_2 \log N$.
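
The slow growth of the discrepancy of $(\{n\alpha\})$ can be observed directly. The Python sketch below is illustrative only and makes two simplifying assumptions: it takes $\alpha = (\sqrt{5} - 1)/2$, whose partial quotients are all 1, and it measures only anchored intervals $[0, x)$, which differs from the supremum over all intervals by at most a factor of 2. The function name `anchored_discrepancy` is hypothetical.

```python
import math


def anchored_discrepancy(points):
    """sup over x in [0, 1] of |#{n : u_n < x} - N * x| for points in [0, 1)."""
    xs = sorted(points)
    N = len(xs)
    # Just after the (i+1)-th smallest point the count is i + 1;
    # at that point itself the count of strictly smaller points is i.
    return max(max(i + 1 - N * x, N * x - i) for i, x in enumerate(xs))


alpha = (math.sqrt(5) - 1) / 2
for N in (10, 100, 1000, 10000):
    pts = [(n * alpha) % 1.0 for n in range(1, N + 1)]
    print(N, round(anchored_discrepancy(pts), 3), round(math.log(N), 3))
```

The printed values can be compared with $\log N$ in the last column; floating-point rounding of $n\alpha$ is negligible at these sizes.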

There is a "transference principle" between sequences in $[0, 1)^k$ and sets in
$[0, 1)^{k+1}$ (showing that parts (i) and (ii) in both Theorems 6.4 and 6.5 are
equivalent). This is given by the following construction.

(1) For a finite sequence $u_1, \dots, u_N$ in $[0, 1)^k$, take the set
\[ \Bigl\{ \Bigl( u_n, \frac{n-1}{N} \Bigr) \in [0, 1)^{k+1} : 1 \le n \le N \Bigr\}. \]

(2) Let $v_n \in [0, 1)^{k+1}$, $1 \le n \le N$ be $N$ points. Write $v_n = (u_n, y_n)$ where $u_n \in
[0, 1)^k$ and $y_n \in [0, 1)$. Arrange the last coordinates $y_n$, $1 \le n \le N$, in increasing
order $y_{n_1} \le y_{n_2} \le \cdots \le y_{n_N}$. Take the sequence $u_{n_1}, \dots, u_{n_N}$ in $[0, 1)^k$.

In both cases the discrepancies are the same up to a universal constant factor.
All known proofs of the fundamental Theorem 6.5 are rather hard. We sketch
here a proof due to Halász (1981).


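Proof of Theorem 6.5. We prove (ii). Given any $x = (x_1, x_2) \in [0, 1]^2$, let
\[ Z(x) = |U \cap B(0, x)| \]
and
\[ D(x) = Z(x) - N x_1 x_2. \]

We shall construct an auxiliary function $F(x)$ such that
\[ \Bigl| \int_{[0,1]^2} F(x) D(x)\, dx \Bigr| > c_3 \log N \tag{6.6} \]
and
\[ \int_{[0,1]^2} |F(x)|\, dx \le 2. \tag{6.7} \]
These yield
\[ \mathcal{D}_N^\infty \ge \max_x |D(x)| > \tfrac{1}{2} c_3 \log N, \]
and Theorem 6.5 follows.
Any $x \in [0, 1)$ can be written uniquely in the binary form
\[ x = \sum_{j=0}^{\infty} \beta_j(x)\, 2^{-j-1}, \]
where $\beta_j(x) = 0$ or 1 and the sequence $\beta_j(x)$ does not end with $1, 1, 1, \dots$. For
$m = 0, 1, 2, \dots$ let
\[ R_m(x) = (-1)^{\beta_m(x)} \]
(Rademacher function). Let $\mathbf{m} = (m_1, m_2)$ be a pair of nonnegative integers. Let
$\|\mathbf{m}\| = m_1 + m_2$ and writing $x = (x_1, x_2)$, let
\[ R_{\mathbf{m}}(x) = R_{m_1}(x_1)\, R_{m_2}(x_2). \]
By an $\mathbf{m}$-box we mean a set of the form
\[ [n_1 2^{-m_1}, (n_1 + 1) 2^{-m_1}) \times [n_2 2^{-m_2}, (n_2 + 1) 2^{-m_2}). \]
For any $\mathbf{m}$-box $A$ and $x \in A$ let
\[ f_{\mathbf{m}}(x) = \begin{cases} R_{\mathbf{m}}(x) & \text{if } A \cap U = \emptyset, \\ 0 & \text{otherwise}. \end{cases} \]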
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Let 2N <2" <4N, n integer. Let « =2 °°, and write 
F@)= I] (tef,Q))-1. 
m: \ln|l=a a 
Using the orthogonality of the modified Rademacher functions f,,(x), we have 


i lestar< | ( I] G+ ef,() + 1) dx 


[bn |} =a 
-| LT] (4 af,(@)) dx +1=14+1=2. 
10.1)? [mfj=n 
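The orthogonality step can be checked numerically on a small instance. The following is a minimal sketch, not part of the original argument: the function names, the three sample points and the choice $\alpha = 1/64$ are assumptions made for illustration. Since every $f_{\mathbf{m}}$ with $\|\mathbf{m}\| = n$ is constant on dyadic cells of side $2^{-n-1}$, averaging over cell midpoints with exact rational arithmetic gives the integrals exactly.

```python
from fractions import Fraction
from itertools import product


def beta(j, x):
    """j-th binary digit of x in [0, 1), where x = sum_j beta_j(x) * 2**(-j-1)."""
    return int(x * 2 ** (j + 1)) % 2


def rademacher(m, x):
    """Rademacher function R_m(x) = (-1)**beta_m(x)."""
    return -1 if beta(m, x) else 1


def f_m(m, x, points):
    """Modified Rademacher function: R_m on empty m-boxes, 0 on occupied ones."""
    m1, m2 = m
    box = (int(x[0] * 2 ** m1), int(x[1] * 2 ** m2))
    for u in points:
        if (int(u[0] * 2 ** m1), int(u[1] * 2 ** m2)) == box:
            return 0
    return rademacher(m1, x[0]) * rademacher(m2, x[1])


def riesz_product_stats(points, n, alpha=Fraction(1, 64)):
    """Return (integral of F + 1, integral of |F|) over the unit square."""
    res = 2 ** (n + 1)  # all f_m with ||m|| = n are constant on cells of side 1/res
    total = Fraction(0)
    total_abs = Fraction(0)
    for i, j in product(range(res), repeat=2):
        x = (Fraction(2 * i + 1, 2 * res), Fraction(2 * j + 1, 2 * res))
        value = Fraction(1)
        for m1 in range(n + 1):
            value *= 1 + alpha * f_m((m1, n - m1), x, points)
        total += value
        total_abs += abs(value - 1)
    cells = res * res
    return total / cells, total_abs / cells


# Hypothetical sample: N = 3 points, so n = 3 satisfies 2N <= 2**n < 4N.
SAMPLE_POINTS = [(Fraction(1, 10), Fraction(1, 5)),
                 (Fraction(11, 20), Fraction(7, 10)),
                 (Fraction(4, 5), Fraction(7, 20))]
mean_product, mean_abs_F = riesz_product_stats(SAMPLE_POINTS, 3)
print(mean_product, float(mean_abs_F))
```

The first returned value is the integral of the product $\prod (1 + \alpha f_{\mathbf{m}})$, which orthogonality forces to equal 1; the second is the integral of $|F|$, which is then at most 2.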


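Note that

$$F(x) = \alpha F_1(x) + \sum_{j=2}^{n+1} \alpha^j F_j(x),$$

where

$$F_1(x) = \sum_{\|\mathbf{m}\| = n} f_{\mathbf{m}}(x),$$

and for $j = 2, \ldots, n+1$

$$F_j(x) = \sum_{\substack{\{\mathbf{m}_1, \ldots, \mathbf{m}_j\} \\ \|\mathbf{m}_i\| = n}} f_{\mathbf{m}_1}(x) \cdots f_{\mathbf{m}_j}(x),$$

the sum being taken over all sets of $j$ distinct vectors.
It is not hard to prove that for every $\mathbf{m}$ satisfying $\|\mathbf{m}\| = n$, we have

$$\int_{[0,1]^2} f_{\mathbf{m}}(x) Z(x)\,dx = 0, \tag{6.8}$$

$$\int_{[0,1]^2} f_{\mathbf{m}}(x)\, x_1 x_2\,dx_1\,dx_2 \ge (2^n - N)\, 2^{-2n-4}, \tag{6.9}$$

and

$$\left| \int_{[0,1]^2} F_j(x) D(x)\,dx \right| \le \sum_{k=0}^{n-j+1} \sum_{l=j-1}^{n-k} N\, 2^{-n-l-4} \binom{l-1}{j-2}. \tag{6.10}$$

The proof of relations (6.8)-(6.10) is straightforward calculation.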
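Now we are able to complete the proof. By (6.10), we have

$$\sum_{j=2}^{n+1} \alpha^j \left| \int_{[0,1]^2} F_j(x) D(x)\,dx \right|
\le \sum_{j=2}^{n+1} \alpha^j \sum_{k=0}^{n-j+1} \sum_{l=j-1}^{n-k} N\, 2^{-n-l-4} \binom{l-1}{j-2}$$

$$= N \sum_{k=0}^{n-1} \sum_{l=1}^{n-k} 2^{-n-l-4} \sum_{j=2}^{l+1} \binom{l-1}{j-2} \alpha^j
= N \alpha^2 \sum_{k=0}^{n-1} \sum_{l=1}^{n-k} 2^{-n-l-4} (1 + \alpha)^{l-1}$$

$$\le N \cdot n \cdot \alpha^2 \cdot 2^{-n-5} \sum_{l=1}^{\infty} \Big( \frac{1 + \alpha}{2} \Big)^{l-1}
< N \cdot n \cdot \alpha^2 \cdot 2^{-n-3}.$$

Combining this with (6.8) and (6.9), we obtain

$$\left| \int_{[0,1]^2} F(x) D(x)\,dx \right|
\ge \left| \int_{[0,1]^2} \alpha F_1(x) D(x)\,dx \right| - \sum_{j=2}^{n+1} \alpha^j \left| \int_{[0,1]^2} F_j(x) D(x)\,dx \right|$$

$$\ge \alpha (n+1) N (2^n - N)\, 2^{-2n-4} - N \cdot n \cdot \alpha^2 \cdot 2^{-n-3}
> 2^{-14} \cdot n \ge c \log N,$$

as required. □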


As to the discrepancy in supremum norm, the following is a very difficult old
problem.


Conjecture 6.11. For all $k \ge 2$ and for $N$ arbitrary points in $[0,1]^k$,

$$\mathcal{D}_N > c(k) (\log N)^{k-1}.$$


This would mean that the exponent $(k-1)/2$ implied by Theorem 6.4 (using
$\|\cdot\|_\infty \ge \|\cdot\|_2$) is only half the truth. Note that the case $k = 2$ is settled by Theorem
6.5. If true, Conjecture 6.11 is best possible by the Van der Corput-Halton-
Hammersley sequence, see, e.g., Beck and Chen (1987). Recently Beck (1989a)
improved on the old result of Roth by proving a 2-dimensional version of the
van Aardenne-Ehrenfest theorem, but Conjecture 6.11 still appears very difficult.


Approximation of measures. One interpretation of Theorem 6.5 is that it is
impossible to approximate the Lebesgue measure on the system of rectangles "too
well" with a measure of finite support. There is a more general phenomenon in
the background, as proved by Chen: the same is true for arbitrary measures.


Theorem 6.12 (Chen 1984). Let $g$ be a Lebesgue-integrable function in $E^2$, and
assume that $g(x) \ne 0$ on a subset $S \subset E^2$ with $\mu(S) > 0$. Then there exists a constant
$c(g) > 0$ such that for every set $U$ of $N$ points in $E^2$ and for every function
$\lambda: U \to \mathbb{R}$,

$$\sup_{x \in E^2} \Big| \sum_{u \in B(x) \cap U} \lambda(u) - N \int_{B(x)} g(y)\,dy \Big| > c(g) \log N.$$


Rectangles in the N × N lattice. It is easy to see that Schmidt's theorem 6.5 has
the following corollary. Let the hypergraph $\mathcal{H}_N$ be defined on the underlying set
$S = \{0, 1, \ldots, N\}^2$ by

$$\mathcal{H}_N = \{ S \cap B(a, b) : 0 \le a \le N,\ 0 \le b \le N \}.$$

Obviously $\mathcal{D}(\mathcal{H}_N) = 1$. What can be said about $\mathcal{D}^h(\mathcal{H}_N)$ and $\mathcal{D}^l(\mathcal{H}_N)$? It
follows easily from Theorem 6.5 that with a positive constant $c > 0$

$$\mathcal{D}^l(\mathcal{H}_N) > c \log N.$$

Hence by Theorem 3.1,

$$\mathcal{D}^h(\mathcal{H}_N) \ge \tfrac{1}{2} \mathcal{D}^l(\mathcal{H}_N) > \tfrac{1}{2} c \log N.$$


A related problem concerning balanced 2-colorings of finite sets in the plane
was formulated by G. Tusnády. Let $P$ be an $N$-element point set in the plane.
Let $T = T(P)$ be the least integer $t$ such that one can assign $\pm 1$s to the points of $P$
so that the sum of these values in any rectangle with sides parallel to the
coordinate axes has absolute value at most $t$. Now Tusnády's problem is to
determine

$$T_N = \max_{|P| = N} T(P).$$

The following theorem gives the best known bounds; the lower bound is due to
Beck (1981a), the upper is a recent result of Bohus (1990), improving a result of
Beck.


Theorem 6.13. For $N \ge 2$,

$$c_1 \log N < T_N < c_2 (\log N)^3.$$


Proof. We give the proof of Beck's upper bound of $(\log N)^4$ as an application of
Theorem 2.2. It suffices to prove the following. Let $A = (a_{ij})$, where $a_{ij} = 0$ or 1,
be a matrix of size $N \times N$. Then there exist "signs" $\varepsilon_{ij} \in \{-1, +1\}$ such that

$$\Big| \sum_{i=1}^{n} \sum_{j=1}^{m} \varepsilon_{ij} a_{ij} \Big| < c (\log N)^4 \tag{6.14}$$

for all $1 \le n, m \le N$.

We now prove (6.14). Adding a few 0-rows and 0-columns if necessary, we may
assume that $N = 2^l$ where $l$ is an integer. For every pair $(p, q)$ of integers
satisfying $0 \le p, q \le l$, we partition $A$ into $2^{p+q}$ submatrices, splitting the
horizontal side of the matrix into $2^p$ equal pieces and the vertical side of the matrix
into $2^q$ equal pieces. There are $(l+1)^2 \approx (\log N)^2$ partitions. Let us call a submatrix
of $A$ special if it occurs in one of these partitions. Let $S = \{(i, j) : a_{ij} = 1\}$, and let us
associate with every submatrix $B$ the subset

$$Y_B = \{(i, j) : a_{ij} \text{ belongs to } B,\ a_{ij} = 1\}.$$

Let

$$\mathcal{H} = \{ Y_B : B \text{ is a special submatrix of } A \}.$$

Since the maximum degree $\Delta(\mathcal{H}) \le (l+1)^2$, by Theorem 2.2 there exists an
assignment of $\pm 1$s so that the absolute value of the sum of the signed entries in
each of the special submatrices is less than $2\Delta(\mathcal{H}) \le 2(l+1)^2$. Note, however, that
any submatrix of $A$ containing the lower left corner of $A$ is the union of at most $l^2$
disjoint special submatrices. Thus (6.14) follows.
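The last step of the argument, cutting a corner submatrix into few special submatrices, can be made concrete. The sketch below is only an illustration under assumed conventions (rows and columns are numbered from 1, and the function names are hypothetical): it splits a prefix $\{1, \ldots, n\}$ into aligned dyadic blocks following the binary expansion of $n$, and takes products of row blocks and column blocks.

```python
def dyadic_pieces(n, l):
    """Split the prefix {1, ..., n} (with n <= 2**l) into disjoint dyadic blocks.

    Each block is (start, length), where length is a power of two and
    start - 1 is divisible by length, so the block is one of the pieces of
    an equal splitting of {1, ..., 2**l}.
    """
    pieces = []
    start = 1
    for p in range(l, -1, -1):
        length = 2 ** p
        if n - start + 1 >= length:  # the remaining prefix still holds a block of this size
            pieces.append((start, length))
            start += length
    return pieces


def corner_decomposition(n, m, l):
    """Special submatrices tiling rows 1..n and columns 1..m of a 2**l x 2**l matrix."""
    return [(rows, cols) for rows in dyadic_pieces(n, l) for cols in dyadic_pieces(m, l)]


# Example: the 11 x 6 corner of a 16 x 16 matrix.
for rows, cols in corner_decomposition(11, 6, 4):
    print("rows start/len", rows, "cols start/len", cols)
```

Each prefix needs at most $l$ blocks (one per binary digit of $n$), so a corner submatrix is covered by at most $l^2$ special submatrices, each contributing less than $2(l+1)^2$ to the signed sum.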

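The proof of the lower bound depends on Theorems 3.1 and 6.5. We may
clearly assume that $N = n^2$, $n$ integer. We need the following reformulation of
Theorem 6.5:

Let $P$ be an arbitrary finite set in the square $[0, y)^2$, $y > 1$. There exists an
aligned rectangle $A \subset [0, y)^2$ such that

$$\big| |P \cap A| - \mu(A) \big| > c \cdot \log y.$$

We shall use that for any convex set $A \subset [0, n)^2$, we have

$$|A \cap \mathbb{Z}^2| = \mu(A) + O(n).$$

Let $S = [0, n)^2 \cap \mathbb{Z}^2$, $\mathcal{H} = \{ S \cap A : A \subset [0, n)^2 \text{ aligned rectangle} \}$ and
$q = 1 - 2 \cdot n^{-1}$. Let $\varepsilon(s) \in \{-1, +1\}$ ($s \in S$) be fixed such that

$$\mathcal{D}(\mathcal{H}; q) = \max_{A} \Big| \sum_{s \in S \cap A} (\varepsilon(s) - q) \Big|,$$

and let $S^- = \{ s \in S : \varepsilon(s) = -1 \}$. Then we have

$$\mathcal{D}(\mathcal{H}; q) = 2 \max_{A} \Big| |S^- \cap A| - \frac{1}{n} |S \cap A| \Big|
\ge 2 \max_{A} \Big| |S^- \cap A| - \frac{1}{n} \mu(A) \Big| + O(1).$$

Apply a contraction of linear ratio $n^{-1/2} : 1$, and apply the reformulation of
Schmidt's theorem given above to the resulting set. We obtain that

$$\mathcal{D}(\mathcal{H}; q) \ge 2c \cdot \log(n^{1/2}) - O(1) \ge c' \log N.$$

Thus by Theorem 3.1,

$$\mathcal{D}^h(\mathcal{H}) \ge \tfrac{1}{2} \mathcal{D}^l(\mathcal{H}) \ge \tfrac{1}{2} \mathcal{D}(\mathcal{H}; q) \ge c'' \log N.$$

In other words,

$$\mathcal{D}(\mathcal{H}_Z) \ge c'' \log N$$

for some $Z \subseteq S$. Since $|Z| \le |S| = n^2 = N$, we have

$$T_N \ge T(Z) \ge \mathcal{D}(\mathcal{H}_Z) \ge c'' \log N. \qquad \Box$$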


Let $X = \{(i, j) : a_{ij} = 1\}$, $\mathcal{H}$ be the family of submatrices of $A$ containing the
lower left corner of $A$, $\mathcal{G}$ be the family of special submatrices, $N = 2^l$, $D = (l+1)^2$,
$K = l^2$. Applying Theorem 2.7, we obtain the following modest improvement
on (6.14):

$$\Big| \sum_{i=1}^{n} \sum_{j=1}^{m} \varepsilon_{ij} a_{ij} \Big| < c(\varepsilon) \cdot (\log N)^{7/2 + \varepsilon}$$

for all $1 \le n, m \le N$.

In higher dimensions, however, the improvement given by Theorem 2.7
becomes significant. Repeating the proof of Theorem 6.13 in higher dimensions, we get
the following result:

Let $A = (a_{\mathbf{n}})$, $\mathbf{n} \in \{1, \ldots, N\} \times \cdots \times \{1, \ldots, N\}$, be a $k$-dimensional matrix
with entries $a_{\mathbf{n}} = 0$ or 1. Then there exist "signs" $\varepsilon_{\mathbf{n}} \in \{-1, +1\}$ such that

$$\Big| \sum_{\mathbf{n} \le \mathbf{m}} \varepsilon_{\mathbf{n}} a_{\mathbf{n}} \Big| < c(k) \cdot (\log N)^{2k}$$

for all $\mathbf{m} = (m_1, m_2, \ldots, m_k)$ satisfying $1 \le m_1, \ldots, m_k \le N$. Applying Theorem
2.7 we obtain, however, the following better bound:

$$\Big| \sum_{\mathbf{n} \le \mathbf{m}} \varepsilon_{\mathbf{n}} a_{\mathbf{n}} \Big| < c(k, \varepsilon) \cdot (\log N)^{k + 3/2 + \varepsilon}$$

for all $\mathbf{m}$ as above. We have strong indications that the true order of magnitude is
probably about $(\log N)^{k-1}$.

In contrast to the case of Theorem 2.2, when the proof gives a polynomially
computable algorithm to construct the desired signs $\varepsilon_{\mathbf{n}} = \pm 1$, the proofs of
Theorems 2.5 and 2.7 imply only the existence of balanced 2-colorings.


7. Geometric structures 


In this section we discuss a variety of questions where the underlying set $S$ is
either the $k$-dimensional unit cube $[0,1]^k$ or (in the discrete version) the
$N \times N \times \cdots \times N$ lattice.

We study generalizations of the classical problem considered in Theorem 1.1.
We no longer restrict ourselves to the boxes: we allow rotation, and we also study
more general shapes. Many problems of this type originated with the paper of
Erdős (1964).

Let $\mathcal{A}$ be a family of simple geometric objects, such as aligned or tilted rectangles,
triangles, balls, etc., in $\mathbb{R}^k$. Let $P \subset [0,1]^k$ be a set of $N$ points. Set

$$\mathcal{D}_N(\mathcal{A}) = \inf_{|P| = N} \sup_{A \in \mathcal{A}} \big| |A \cap P| - N\mu(A \cap [0,1]^k) \big|.$$


We consider first the case of aligned right triangles.

Theorem 7.1 (Schmidt 1969). Let $P_1, \ldots, P_N$ be $N$ points in the unit square
$[0,1]^2$. Then there exists a right triangle $T \subset [0,1]^2$ with two sides parallel to the
coordinate axes, and with

$$\big| |P \cap T| - N \cdot \mu(T) \big| > N^{1/4 - \varepsilon}.$$


Beck (1984a, 1987a) slightly improved the lower bound and also proved that
the lower bound is nearly sharp.


Theorem 7.1'. Let $\mathcal{A}$ be the family of right triangles in the plane with two sides
parallel to coordinate axes. Then

$$c N^{1/4} < \mathcal{D}_N(\mathcal{A}) < N^{1/4} \sqrt{\log N}.$$


This theorem exhibits a rather paradoxical phenomenon. Let $T'$ be a right
triangle. There is a unique right triangle $T''$ such that $T' \cup T''$ is an aligned
rectangle $A$. We know that there exist $N$-element sets $P$ with

$$\big| |P \cap A| - N \cdot \mu(A) \big| < c \cdot \log N$$

for all aligned rectangles $A \subset [0,1)^2$. This set contains almost the "right" number
of points in $T' \cup T''$ but, by Theorem 7.1, must be quite irregularly distributed
in the two halves $T'$ and $T''$.

Essentially the same proof gives the following 2-coloring result. Let $f$ be a
2-coloring of the $N \times N$ square lattice. Then there exists an aligned right triangle
$T$ such that the difference between the number of red points and the number of
blue points in $T$ is at least $c \cdot N^{1/2}$. In other words, the corresponding hypergraph
has discrepancy at least $c \cdot N^{1/2}$.

(Note that the analogous question for aligned rectangles is trivial. The chessboard
type 2-coloring of $N \times N$ has deviation at most 1 for any aligned rectangle.)
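The remark about the chessboard coloring is easy to confirm by brute force for small lattices. The following sketch (the function name and the prefix-sum layout are illustrative choices, not from the text) colors the lattice point $(i, j)$ by $(-1)^{i+j}$ and scans every aligned rectangle.

```python
def chessboard_max_deviation(N):
    """Largest |#red - #blue| over all aligned rectangles of the N x N lattice."""
    # prefix[i][j] = sum of colors (+1 or -1) over lattice points (a, b) with a < i, b < j.
    prefix = [[0] * (N + 1) for _ in range(N + 1)]
    for i in range(N):
        for j in range(N):
            color = 1 if (i + j) % 2 == 0 else -1
            prefix[i + 1][j + 1] = (color + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
    worst = 0
    for i1 in range(N):
        for i2 in range(i1 + 1, N + 1):
            for j1 in range(N):
                for j2 in range(j1 + 1, N + 1):
                    # signed sum over rows i1..i2-1 and columns j1..j2-1
                    s = (prefix[i2][j2] - prefix[i1][j2]
                         - prefix[i2][j1] + prefix[i1][j1])
                    worst = max(worst, abs(s))
    return worst


print([chessboard_max_deviation(N) for N in range(1, 7)])
```

The signed sum over a rectangle factors as a product of two alternating sums, each of which lies in $\{-1, 0, 1\}$, which is why the deviation never exceeds 1.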

Consider next the family of balls. Again we have a "large discrepancy" result
(for the pioneering result, see Schmidt 1969).


Theorem 7.2 (Beck 1987a). Let $\mathcal{A}$ be the family of balls contained in $[0,1]^k$. Then

$$\mathcal{D}_N(\mathcal{A}) > N^{1/2 - 1/(2k) - \varepsilon}.$$

The following result states, roughly speaking, that for rotation invariant
families the discrepancy is always "large".


Theorem 7.3 (Beck 1987a). Let $A \subset [0,1]^k$ be a $k$-dimensional convex body, and
let $\mathcal{A}$ be the family of convex sets obtained from $A$ by a similarity transformation
(rotation, translation, and homothetic transformation). Then

$$\mathcal{D}_N(\mathcal{A}) > c(A, \varepsilon)\, N^{1/2 - 1/(2k) - \varepsilon}.$$


We remark that Theorems 7.2-7.3 are nearly best possible (see Beck 1984a).
The situation is more complicated if rotation is forbidden (as the difference
between aligned right triangles and rectangles indicates). The discrepancy of the
family of homothetic copies of a given convex shape $A$ depends mainly on the
smoothness of the boundary $\partial A$. We have a fairly good understanding of this
phenomenon (for more details, see Beck 1988 and Beck and Chen 1987).

For the discrepancy of congruent sets, see Beck (1987b). For the discrepancy of
half-planes, see Beck (1983), Alexander (1990) and Matoušek (1994).

It is worthwhile to mention here that all of these theorems are essentially 
independent of the shape of the underlying set — instead of the unit cube one can 
consider the unit ball, the regular simplex, any “reasonable” convex body, the 
surface of the unit sphere, etc. 


An application in discrete geometry. For which set of $N$ points on the unit sphere
is the sum of all $\binom{N}{2}$ Euclidean distances between these points maximal, and what
is the maximum? Let $S^k$ denote the surface of the unit sphere in $\mathbb{R}^{k+1}$. Let $P$ be a
set of $N$ points on $S^k$. Let $|x|$ denote the usual Euclidean length. We define

$$L(N, k, P) = \sum_{\{p, q\} \subseteq P} |p - q|$$

and

$$L(N, k) = \max_{P} L(N, k, P),$$

where the maximum is taken over all $P \subset S^k$, $|P| = N$. The determination of
$L(N, k)$ is a long-standing open problem in discrete geometry. For $k = 1$, the
solution is given by the regular $N$-gon. It is also known that for $N = k + 2$, the
regular simplex is optimal. For $N > k + 2$ and $k \ge 2$, the exact value of $L(N, k)$ is
unknown. The reason for this is that if $N$ is sufficiently large compared to $k$, then
there are no "regular" configurations on the sphere, so the extremal point
system(s) is (are), as expected, quite complicated and "ad hoc".

Since the determination of $L(N, k)$ seems to be hopeless, it is natural to
compare the discrete sum $L(N, k, P)$ with the following integral (the solution of
the "continuous relaxation" of the distance problem)

$$\frac{N^2}{2} \cdot \frac{1}{\sigma(S^k)} \int_{S^k} |p - p_0|\,d\sigma(p) = c_0(k) \cdot N^2,$$

where $\sigma$ denotes the surface area, $d\sigma(p)$ represents an element of the surface
area on $S^k$, $p_0 = (1, 0, 0, \ldots, 0) \in \mathbb{R}^{k+1}$. The constants $c_0(k)$ can be calculated
explicitly; e.g., $c_0(1) = 2/\pi$, $c_0(2) = 2/3$. Stolarsky (1973) has discovered a beautiful
identity saying, roughly speaking, that the discrete sum $L(N, k, P)$, plus a
measure of how far the set $P$ deviates from uniform distribution, is constant. Thus
the sum of distances is maximized by a well-distributed set of points. Combining
Stolarsky's identity with a result in "irregularities of distribution", one can obtain
some nontrivial information on the order of magnitude of $L(N, k)$ (see Beck
1984b).

Theorem 7.4. $L(N, k) = c_0(k) \cdot N^2 + O(N^{1 - 1/k})$.


Finally, we mention the famous Heilbronn's triangle problem which is, in a
broader sense, related to our topic (see Roth 1976 and Komlós et al. 1982).
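As a sanity check on the comparison between the discrete distance sum and its continuous relaxation, the case $k = 1$ can be computed directly, since there the regular $N$-gon is extremal. The sketch below is illustrative only (the function names are hypothetical); it uses the average chord length $4/\pi$ of the unit circle, which gives $c_0(1) = 2/\pi$.

```python
import math


def polygon_distance_sum(N):
    """Sum of the N*(N-1)/2 Euclidean distances between the vertices of a
    regular N-gon inscribed in the unit circle."""
    pts = [(math.cos(2 * math.pi * j / N), math.sin(2 * math.pi * j / N))
           for j in range(N)]
    return sum(math.dist(pts[a], pts[b])
               for a in range(N) for b in range(a + 1, N))


def continuous_value(N):
    """Value c_0(1) * N**2 of the continuous relaxation, with c_0(1) = 2/pi."""
    return 2 / math.pi * N ** 2


for N in (3, 10, 100):
    print(N, continuous_value(N) - polygon_distance_sum(N))
```

The difference stays bounded as $N$ grows, in agreement with an error term of order $N^{1 - 1/k}$ for $k = 1$.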


8. Uniform distribution and ergodic theory 


The most important class of uniformly distributed sequences in $[0,1)$ is the class
of sequences $(\{n\alpha\})$ for $\alpha$ irrational. These are the basic sequences in the theory
of diophantine approximation. Further, these are the best "test-sequences": very
often theorems which were found first for sequences $(\{n\alpha\})$ turned out to be true
for more general ones. Finally we mention the relation of sequences $(\{n\alpha\})$ to
topological transformations.

The discrepancy of $(\{n\alpha\})$ depends on the partial quotients $a_k$, $k = 1, 2, \ldots$ of
$\alpha$. For every $N$ and $x \in [0,1)$ there is an "explicit" formula for the discrepancy
$\mathcal{D}_N([0, x))$ defined in section 6 (Sós 1974). This leads to the following bounds.


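Theorem 8.1. Let $p_k/q_k$ be the $k$th convergent of $\alpha$: $p_k/q_k = [a_1, \ldots, a_k]$. If
$q_k \le N < q_{k+1}$ then

$$c_1 \sum_{i=1}^{k} a_i < \max_{1 \le n \le N} \mathcal{D}_n < c_2 \sum_{i=1}^{k+1} a_i.$$

Consequently, if $a_i \le K$, $i = 1, 2, \ldots$, then

$$\mathcal{D}_N < c \cdot K \cdot \log N.$$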
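The logarithmic growth for bounded partial quotients can be observed numerically. The sketch below is only an illustration: the function names are hypothetical, and the quantity computed is the supremum over $x$ of $|\#\{n \le N : u_n < x\} - Nx|$, taken here as the working definition of the discrepancy of the initial segment. The golden ratio is used because all its partial quotients equal 1.

```python
import math


def interval_discrepancy(points):
    """sup over x in [0, 1] of | #{u in points : u < x} - N * x |."""
    xs = sorted(points)
    N = len(xs)
    best = 0.0
    for i, x in enumerate(xs, start=1):
        # Just above x there are i points below; at x itself only i - 1.
        best = max(best, i - N * x, N * x - (i - 1))
    return best


def kronecker_sequence(alpha, N):
    """The points {n * alpha} for n = 1, ..., N (floating point)."""
    return [(n * alpha) % 1.0 for n in range(1, N + 1)]


GOLDEN = (math.sqrt(5) - 1) / 2
for N in (10, 100, 1000):
    d = interval_discrepancy(kronecker_sequence(GOLDEN, N))
    print(N, round(d, 3), round(d / math.log(N), 3))
```

Floating-point rounding is harmless at these sizes, but for very large $N$ or for $\alpha$ with huge partial quotients an exact-arithmetic variant would be safer.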
Much is known about the finer properties of the distribution. Though

$$\max_{1 \le n \le N} \sup_{I} \mathcal{D}_n(I) > c \log N,$$

there are intervals $I$ in which the distribution is very good.


Theorem 8.2 (Hecke-Kesten). For the sequence $(\{n\alpha\})$ and for a fixed interval $I$,
the discrepancy $\mathcal{D}_N(I)$ remains bounded if and only if $\mu(I) = \{k\alpha\}$ for some integer
$k$.


The "if" part was proved by Hecke (1922) and the much deeper "only if" part
by Kesten (1966). Very elegant proofs and generalizations of this theorem in the
framework of ergodic theory are due to Furstenberg et al. (1973), Halász (1976),
Petersen (1973).

On the other hand it is remarkable that this theorem and further properties of
$\mathcal{D}_N$ are relevant in ergodic theory (see, e.g., Herman 1976, Deligne 1975).

Schmidt investigated the analogous question for arbitrary sequences in $[0,1)$.

Theorem 8.3 (Schmidt 1974). For an arbitrary sequence $(u_n)$ in $[0,1)$ the lengths
of all intervals $I$ for which $\mathcal{D}_N(I)$ remains bounded form a countable set.


The ergodic theoretical generalization shows the essence of Kesten's theorem.

Let $(\Omega, \mathcal{A}, \mu)$ be a probability space, $T: \Omega \to \Omega$ an ergodic transformation (a
measure preserving transformation such that every $T$-invariant measurable set has
measure 0 or 1). For $A \in \mathcal{A}$, $x \in \Omega$ let $Z_N(A; x)$ denote the number of points
$T^n x \in A$, $1 \le n \le N$. Set

$$\mathcal{D}_N(A; x) = |Z_N(A; x) - N\mu(A)|.$$

By Birkhoff's ergodic theorem, for every fixed $A \in \mathcal{A}$, for almost all $x \in \Omega$,

$$\frac{1}{N} \mathcal{D}_N(A; x) \to 0 \quad \text{as } N \to \infty.$$

The uniformity or irregularity of the distribution of the orbit is measured by the
sequence $\mathcal{D}_N(A; x)$. Furstenberg et al. (1973), Petersen (1973), Halász (1976)
proved the following very nice generalization of Kesten's theorem.


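Theorem 8.4. If for some $A \in \mathcal{A}$, $\mathcal{D}_N(A; x)$ is bounded on a set $X \subset \Omega$ of positive
measure, then $e^{2\pi i \mu(A)}$ is an eigenvalue of $T$; that is, there exists a function $g \ne 0$
such that

$$g(Tx) = e^{2\pi i \mu(A)} g(x) \quad \text{for } x \in \Omega.$$

Conversely, for every eigenvalue $e^{2\pi i \rho}$ there exists an $A \in \mathcal{A}$ such that $\mu(A) = \rho$
and $\mathcal{D}_N(A; x)$ remains bounded as $N \to \infty$ for almost all $x \in \Omega$.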


Remark. Kesten's theorem follows from Theorem 8.4. To see this, let $\Omega = \mathbb{R}/\mathbb{Z}$.
Let $R_\alpha: x \to x + \alpha \pmod 1$ be the rotation by $2\pi\alpha$. The eigenvalues of $R_\alpha$ are the
numbers $e^{2\pi i \{k\alpha\}}$; hence Kesten's theorem follows.


We give another example of the relationship between uniform distribution and 
ergodic theory, illustrating how results on distribution of the sequences ({na}) 
imply general results on homeomorphisms of the circle. 

Denjoy (1932) proved that for every homeomorphism $T: \mathbb{R}/\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ having
no periodic point there exists an irrational $\alpha(T) \in (0, 1)$ such that $T$ is conjugate
to the rotation $R_\alpha$. By this result, the distribution of $T^n x$, $n = 1, 2, \ldots$, is
determined by the distribution of the sequence $(\{n\alpha\})$. In particular,

(a) By Birkhoff's ergodic theorem the discrepancy $\mathcal{D}_N(I; x) = o(N)$. By Denjoy's
theorem we know much more: $\mathcal{D}_N(I; x)$ is the same as the corresponding
discrepancy of the sequence $(\{n\alpha(T)\})$.

(b) The order of points $\{n\alpha\}$, $1 \le n \le N$, is very restricted: if $\pi$ is the
permutation determined by $\{\pi(1)\alpha\} < \cdots < \{\pi(N)\alpha\}$, then, for example, for
every fixed $\alpha$ and $N$ the difference $\pi(i) - \pi(i-1)$ takes at most three different
values. Now, by Denjoy's theorem the same holds for an arbitrary homeomorphism
$T$ having no periodic point and every point $x$, if we define the permutation $\pi$
by $T^{\pi(1)}(x) < T^{\pi(2)}(x) < \cdots < T^{\pi(N)}(x)$. (See Sós 1957, Świerczkowski 1958.)
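The restriction on the order of the points $\{n\alpha\}$ described in (b) can be observed directly. The following sketch is illustrative (the function name and the two sample values of $\alpha$ are assumptions): it sorts the indices $1, \ldots, N$ by the fractional parts $\{n\alpha\}$ and collects the differences of consecutive indices in this order.

```python
import math


def successor_differences(alpha, N):
    """Set of values pi(i) - pi(i-1), where pi lists 1..N sorted by {n * alpha}."""
    order = sorted(range(1, N + 1), key=lambda n: (n * alpha) % 1.0)
    return {order[i] - order[i - 1] for i in range(1, N)}


for alpha in (math.sqrt(2), (math.sqrt(5) - 1) / 2):
    for N in (10, 50, 200):
        print(round(alpha, 4), N, sorted(successor_differences(alpha, N)))
```

Floating-point fractional parts are adequate here because, for these two badly approximable numbers and moderate $N$, neighbouring points are separated by far more than the rounding error.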

One of the most fascinating and deepest relationships between combinatorics
and ergodic theory is given by Furstenberg. Since there is a recent expository
paper by Furstenberg et al. (1982), and the book of Furstenberg (1981), we do
not go into the discussion of this. We mention only the fascinating recent result of
Furstenberg and Katznelson (1989) on the density version of the Hales-Jewett
theorem (see chapter 25).


9. More versions of discrepancy 


Strong irregularity. In $[0,1)$ the following "strong irregularity" phenomenon
holds.


Theorem 9.1. (i) For every $\varepsilon > 0$ there exists a $\delta > 0$ (depending only on $\varepsilon$) such
that for an arbitrary sequence $(u_n)$ in $[0,1)$ and for every $N > 0$, $\mathcal{D}_n > \delta \log N$ for
all but at most $N^\varepsilon$ values of $n \le N$.

(ii) For every $K > 0$ there exists an $M > 0$ (depending only on $K$) such that for
an arbitrary sequence $(u_n)$ in $[0,1)$ and for every $N > 0$, $\mathcal{D}_n > K$ for all but at most
$(\log N)^M$ values of $n \le N$.

(iii) For an arbitrary sequence $(u_n)$ in $[0,1)$ the set of values of $x$ for which
$\mathcal{D}_N([0, x)) = o(\log N)$ holds has Hausdorff dimension 0.


This theorem was proved first only for $(\{n\alpha\})$ sequences (Sós 1979, 1983a),
then for arbitrary sequences and in a more general form by Halász (1981) and
Tijdeman and Wagner (1980).


One-sided irregularities. Measuring the irregularities with $\mathcal{D}_N$ or $\Delta_N$, we do not
have any information on the sign of the discrepancy. Therefore we define

$$\mathcal{D}_N^+([0, \beta)) = \max\{ \mathcal{D}_N([0, \beta)), 0 \}$$

and

$$\mathcal{D}_N^+ = \sup_{\beta} \mathcal{D}_N^+([0, \beta)).$$

$\mathcal{D}_N^-$ is defined analogously.

One-sided discrepancies show some new phenomena. Again, the first results on
one-sided irregularities were found for $(\{n\alpha\})$ sequences. For example, there is
no one-sided strong irregularity phenomenon. We mention just the simplest
illustration of this. It is easy to see that

$$\sup_N \mathcal{D}_N^+ = \infty, \qquad \sup_N \mathcal{D}_N^- = \infty.$$


But no explicit lower bound can be given: for an arbitrary sequence $M_N \to \infty$ there
exists an $\alpha$ such that $\mathcal{D}_N^+ < M_N$ and also there exists an $\alpha$ such that $\mathcal{D}_N^- < M_N$, if
$N$ is large enough.

Similarly, it is easy to see that the sequence of indices $N$ with $\mathcal{D}_N^+ < K$ has
density 0. However, for an arbitrary sequence $M_N = o(N)$, there exist an $\alpha$ and a
$K$ such that $\mathcal{D}_n^+ < K$ holds for at most $M_N$ values of $n \le N$, if $N$ is large enough
(Sós 1983a).

Concerning intervals of small discrepancy, first we remark that $\mathcal{D}_N^+([0, \beta))$ may
be bounded even in the case when $\beta \ne \{k\alpha\}$, i.e. when $\mathcal{D}_N([0, \beta))$ is not.

In Dupain and Sós (1978) those intervals $[0, \beta)$ are investigated for which
$\mathcal{D}_N^+([0, \beta))$ is bounded. Here we mention just one of the new phenomena: there
exists an $\alpha$ for which the set $\{\beta : \sup_N \mathcal{D}_N^+([0, \beta)) < \infty\}$ has the cardinality
continuum.

As an example in the opposite direction, the assertion in Theorem 8.2 remains
true if instead of the boundedness of $\mathcal{D}_N(A)$ we assume only one-sided boundedness.
Halász (1976) proved that if

$$\sup_N \mathcal{D}_N^+(A; x) < \infty$$

holds on a set $X \subset \Omega$ of positive measure, then $e^{2\pi i \mu(A)}$ must be an eigenvalue of
$T$.

In contrast to aligned boxes, for balls even the simplest results $\sup_N \mathcal{D}_N^+ = \infty$,
$\sup_N \mathcal{D}_N^- = \infty$ are nontrivial. The proof of these, that is, a one-sided version of
Theorem 7.2, can be found in Beck (1989b).

The following problem of Erdős, which was recently solved, is essentially a
one-sided discrepancy problem.

Let $\xi_1, \xi_2, \xi_3, \ldots$ be an arbitrary infinite sequence of complex numbers on the
unit circle $|z| = 1$. For every $n \in \mathbb{N}$ and complex $z$, let

$$P_n(z) = \prod_{j=1}^{n} (z - \xi_j).$$

Further, let

$$A_n = A(\xi_1, \xi_2, \ldots, \xi_n) = \max_{|z| = 1} |P_n(z)|.$$

Erdős conjectured that for every fixed sequence $(\xi_n)$, $\limsup A_n = \infty$, and
asked about the correct order of magnitude of

$$\max_{1 \le n \le N} A_n \quad \text{as } N \to \infty.$$

Observe that if the points $\xi_1, \ldots, \xi_n$ are just the $n$th roots of unity, then
$P_n(z) = z^n - 1$, and so $A_n = 2$. This shows that the relation $\limsup A_n = \infty$ must
be a consequence of the impossibility of getting every segment $\xi_1, \ldots, \xi_n$ close to
uniform distribution. There seems, therefore, to be an intimate connection with
the Van der Corput problem (see section 6).
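The observation about roots of unity can be reproduced numerically. The sketch below is an illustration only: the function names are hypothetical, and the maximum over the circle is approximated by sampling, so the returned value is a lower estimate of $A_n$ in general.

```python
import cmath
import math


def max_modulus_on_circle(points, samples=4096):
    """Approximate A_n = max over |z| = 1 of |prod_j (z - xi_j)| by sampling."""
    best = 0.0
    for s in range(samples):
        z = cmath.exp(2j * math.pi * s / samples)
        value = 1.0
        for xi in points:
            value *= abs(z - xi)
        best = max(best, value)
    return best


def roots_of_unity(n):
    """The n-th roots of unity, i.e. a perfectly uniform n-point set on the circle."""
    return [cmath.exp(2j * math.pi * j / n) for j in range(n)]


for n in (1, 2, 4, 8, 16):
    print(n, max_modulus_on_circle(roots_of_unity(n)))
```

For these values of $n$ the sampling grid contains a point with $z^n = -1$, where $|z^n - 1|$ attains its maximum 2, so the printed values agree with $A_n = 2$ up to rounding.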
By realizing this heuristic, Wagner (1980) proved the conjecture $\limsup A_n = \infty$.
He developed a variation of Schmidt's original proof of Theorem 6.5, and
actually proved the estimate

$$\max_{1 \le n \le N} A_n > (\log N)^c.$$

Recently, Beck (1991a) managed to prove the best possible result

$$\max_{1 \le n \le N} A_n > N^c,$$

by developing a version of Halász's proof of Theorem 6.5.


10. Epilogue 


As we mentioned already in the introduction, discrepancy theory has its roots, as 
well as its applications, in many different areas. Here we mention just a few 
recent applications of discrepancy and uniform distribution. 


Squaring the circle. Tarski raised the following question, which is sometimes 
called “the problem of squaring the circle’’ (misusing the name of an ancient 
problem): is a disc equidecomposable to a square? In other words, can a disc be 
decomposed into finitely many parts, which can be arranged to obtain a partition 
of a square? The answer is in the negative under various restrictions, e.g., if the 
pieces are restricted to be Jordan domains. 

Recently Laczkovich (1990) gave a striking and ingenious construction which 
answers Tarski’s question in the affirmative. The proof is based on a sufficient 
condition for the equidecomposability of two bounded measurable sets in terms of 
the discrepancy of certain special sequences. 


Computing the volume. Uniformly distributed sequences are used generally in 
applications of Monte Carlo methods. A recent success in this area is the 
computation of the volume of an n-dimensional convex body in polynomial time 
by Dyer et al. (1989). The basic tool is that a uniformly distributed point in the 
body can be generated efficiently (using random walk on a grid). It is a surprising 
fact that in this problem deterministic uniformly distributed sequences cannot give 
a good approximation in polynomial time (see Elekes 1986, Bárány and Füredi
1987).


Drawing segments on screen. Luby (1986) studied the question of drawing 
segments on a screen as paths in a grid. He showed that if certain natural 
assumptions are made, every scheme to assign a ‘connecting segment” to every 
pair of points will necessarily use “bent” segments. The amount of deviation from 
the straight line is determined by Schmidt’s theorem 1.1. 


"Gray" areas in photography. Rödl and Winkler (1990) studied the question of
representing a gray area as a combination of black and white dots. Modelling the
"smoothness" of the resulting color as a discrepancy problem, they showed that the
measure of this "smoothness" can be estimated by the theorem of Beck and its
improvement by Bohus (Theorem 6.13).


Note. While this chapter contains a substantial amount of material on infinite
graphs, its focus is on finite graphs. Therefore all graphs will be finite, unless oth- 
erwise stated. Exceptions are sections 3.6, 3.7, and 3.11, where graphs are generally 
infinite, and sections 3.9, 3.10, 5.3, where a main theme is the interplay between 
finite and infinite. 


Surveys. A portion of the material discussed in this chapter is covered in two
survey articles on automorphism groups of graphs: Cameron (1983) and Babai
and Goodman (1993). Chapter 12 of Lovász (1979a) is a nice introduction to the
subject. A beautiful treatment of the basics of higher symmetry is Biggs (1974).
Brouwer et al. (1989) is a monumental yet enjoyable work on distance-transitivity
and related subjects. Much of our current knowledge on graph isomorphism testing
is summarized in Babai and Luks (1983). The general concept of reconstruction
(invertibility of various constructions) is illustrated in chapter 15 of
Lovász (1979a). Recent surveys on the Kelly-Ulam graph reconstruction conjecture
include Bondy (1991), Ellingham (1988) (see also Bondy and Hemminger
1977).


0. Introduction 


0.1. Graphs and groups


A study of graphs as geometric objects necessarily involves the study of their 
symmetries, described by the group of automorphisms. Indeed, there has been 
significant interaction between abstract group theory and the theory of graph au- 
tomorphisms, leading to the construction of graphs with remarkable properties 
as well as to a better understanding and occasionally a construction or proof of 
nonexistence of certain finite simple groups. On the other hand, in contrast to clas- 
sical geometries, most finite graphs have no automorphisms other than the identity 
(asymmetric graphs), a fact that is largely and somewhat paradoxically responsible 
for its seeming opposite: every (finite) group is isomorphic to the automorphism 
group of a (finite) graph. 

The study of graphs via their symmetries is rooted in the classical paradigm, 
stated in Felix Klein’s “Erlanger Programm”, that geometries are to be viewed as 
domains of a group action. Although graphs, as incidence structures, may seem 
to be degenerate geometries, we note that any incidence structure (such as a pro- 
jective plane) can be represented by a graph. (The Levi graph L of an incidence 
structure S is a bipartite graph; its vertices correspond to the points and lines 
of S; and adjacency of vertices of L corresponds to point—line incidence in S.) 
Such representations preserve symmetry and allow fruitful generalizations (such 
as “generalized polygons”, chapter 13, section 7). 
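The Levi graph construction described above is simple to carry out explicitly. As a small illustration (the labelling of the Fano plane and the function name are assumptions made here, not taken from the text), the sketch below builds the Levi graph of the projective plane of order 2.

```python
# One standard labelling of the seven lines of the Fano plane on points 0..6.
FANO_LINES = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
              (1, 4, 6), (2, 3, 6), (2, 4, 5)]


def levi_graph(num_points, lines):
    """Bipartite incidence graph of a point-line structure.

    Vertices are ('p', i) for points and ('l', j) for lines; a point vertex is
    adjacent to a line vertex exactly when the point lies on the line.
    """
    vertices = ([("p", i) for i in range(num_points)]
                + [("l", j) for j in range(len(lines))])
    edges = {(("p", p), ("l", j)) for j, line in enumerate(lines) for p in line}
    return vertices, edges


vertices, edges = levi_graph(7, FANO_LINES)
degree = {v: 0 for v in vertices}
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
print(len(vertices), len(edges), set(degree.values()))
```

Every edge joins a point vertex to a line vertex, so the graph is bipartite by construction, and a symmetry of the plane that permutes points and lines compatibly acts on this graph as an automorphism.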

In this chapter we try to illustrate the variety of ways in which groups and graphs 
interact. The effect of powerful results of group theory (such as the Feit-Thompson 
theorem on the solvability of groups of odd order) will be evident already in the 
introductory section 1.1. Consequences of the classification of finite simple groups
(CFSG) are required for some of the results in section 4.3 and for the analysis of
some of the algorithms in sections 6.6 and 6.7. Many of the results surveyed in 
section 5 critically depend on the CFSG. On the other hand, some results of graph 
theoretic nature have played a role in the classification theory itself, as illustrated 
in sections 3.5 and 5.1. 

In spite of these connections, the treatment of the subject will mostly be kept on 
an elementary level, requiring little more than basic group theory. The main theme 
of section 3 is the surprisingly strong effect of modest symmetry assumptions on 
the combinatorial parameters of a graph. 

We try also to illustrate some of the links of the subject to areas not imme- 
diately seen to relate to groups. Sections 1.6 and 7.2 illustrate this point within 
combinatorics. Several connections to topology are explored in section 3 (see esp. 
sections 3.6 and 3.7). Random walks feature in section 3.8; linear algebra is visited 
briefly in sections 1.5, 3.8, 3.12 and 7.2. 

Strong links have been forged to model theory (section 5.3) and to the theory 
of algorithms (Sections 6.6 and 6.7). Some of the remote sources of motivation 
include algebraic topology (section 3.11), differential geometry (sections 3.6 and 
3.7), and even quantum mechanics (section 1.5). 


0.2. Isomorphisms, categories, reconstruction


Isomorphisms of graphs are bijections of the vertex sets preserving adjacency as
well as non-adjacency. In the case of directed graphs, orientation must be pre- 
served; in the case of graphs with colored edges and/or vertices, we agree that 
colors, too, must be preserved. Similar definitions apply to hypergraphs. In the 
case of incidence structures consisting of “points” and “lines”, linked by incidence 
relations, we think of an isomorphism as a pair of bijections (one between the 
points, another between the lines), so that the pair preserves incidence. This view 
should be applied to graphs as well if multiple edges are allowed. 

Automorphisms of the graph $X = (V, E)$ are $X \to X$ isomorphisms; they form
the subgroup $\mathrm{Aut}(X)$ of the symmetric group $\mathrm{Sym}(V)$. Automorphisms of directed
graphs, etc., are defined analogously.

The questions of reconstruction are, broadly speaking, questions of invertibility
of certain isomorphism preserving operations on structures. A category in which
all morphisms are isomorphisms is called a Brandt groupoid. Let $\mathcal{C}$, $\mathcal{D}$ be two
Brandt groupoids and $F: \mathcal{C} \to \mathcal{D}$ a functor. Hence $X \cong Y$ implies $F(X) \cong F(Y)$.
We call $F$ weakly reconstructible if the converse also holds: $F(X) \cong F(Y)$ implies
$X \cong Y$. We say that $F$ is strongly reconstructible if for every pair $X, Y$ of objects
of $\mathcal{C}$, $F$ induces a bijection between the sets $\mathrm{Iso}(X, Y)$ and $\mathrm{Iso}(F(X), F(Y))$ of
isomorphisms. In this case, $\mathrm{Aut}(X) \cong \mathrm{Aut}(F(X))$ for every object $X$. We also say
that, within the class $\mathcal{C}$, the object $X$ is (weakly, strongly) reconstructible from
$F(X)$.

A classical example is the reconstructibility of a multiset of direct irreducible
finite groups from their direct product (unique direct factorization, R. Remak and
O. Yu. Schmidt, cf. Baer 1947). The category $\mathcal{C}$ consists of the multisets of direct
irreducible finite groups with the natural notion of isomorphism. Let $F$ associate
the direct product of the members of such a multiset $X$ with $X$. This functor
is weakly but not strongly reconstructible. (To see the latter, consider the pair
$\{Z_p, Z_p\}$.)

Homomorphisms of graphs are defined as adjacency preserving maps, i.e., a
map $f: V_1 \to V_2$ is a homomorphism of the graph $X_1 = (V_1, E_1)$ to the graph
$X_2 = (V_2, E_2)$ if $(f(x), f(y)) \in E_2$ whenever $(x, y) \in E_1$. It is not required that
non-adjacency be preserved; therefore a bijective homomorphism is not necessarily an
isomorphism. It is easy to see that the chromatic number of the graph $X$ is the
smallest (cardinal) number $m$ such that the set $\mathrm{Hom}(X, K_m)$ of $X \to K_m$ homomorphisms
is nonempty. The set $\mathrm{End}(X) = \mathrm{Hom}(X, X)$ forms a monoid (semigroup
with identity) under composition: the endomorphism monoid of $X$. $\mathrm{Aut}(X)$ consists
of the invertible elements of $\mathrm{End}(X)$. The class of graphs together with the
homomorphisms forms a category. These concepts extend naturally to directed graphs
(orientation of edges must be preserved), graphs with colored vertices and/or
edges (homomorphisms preserve color by definition); and to general relational
structures involving relations of arbitrary arities.
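The characterization of the chromatic number through homomorphisms into complete graphs translates directly into a brute-force procedure for small finite graphs. The sketch below is illustrative only; the function names and the edge-list representation are assumptions, and the search is exponential in the number of vertices.

```python
from itertools import product


def is_homomorphism(f, edges_x, edges_y):
    """True if the vertex map f sends every edge of X to an edge of Y."""
    return all((f[u], f[v]) in edges_y for (u, v) in edges_x)


def complete_graph_edges(m):
    """Edge set of K_m on vertices 0..m-1, stored in both orientations."""
    return {(a, b) for a in range(m) for b in range(m) if a != b}


def chromatic_number(vertices, edges):
    """Smallest m such that Hom(X, K_m) is nonempty, found by exhaustive search."""
    sym_edges = set(edges) | {(v, u) for (u, v) in edges}
    for m in range(1, len(vertices) + 1):
        target = complete_graph_edges(m)
        for colors in product(range(m), repeat=len(vertices)):
            f = dict(zip(vertices, colors))
            if is_homomorphism(f, sym_edges, target):
                return m
    return 0  # only reached for the graph with no vertices


five_cycle = [(i, (i + 1) % 5) for i in range(5)]
print(chromatic_number(list(range(5)), five_cycle))
```

A homomorphism into $K_m$ is the same thing as a proper coloring with $m$ colors, since adjacent vertices must be sent to distinct vertices of $K_m$.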

The interconnections of these areas are manifold. The algorithmic problem of 
deciding whether or not two given graphs are isomorphic is equivalent to de- 
termining the automorphism group, and specific automorphism information for 
certain classes of graphs made it possible to use group theory to surprising depth 
in the analysis of graph isomorphism algorithms. Isomorphism rejection tools
include graph invariants, i.e., functions F such that X ≅ Y implies F(X) = F(Y).
The construction of combinatorial, algebraic, and topological structures with pre- 
scribed automorphism groups and endomorphism monoids usually amounts to con- 
structing strongly reconstructible functors. Reconstruction itself is an isomorphism 
problem, and automorphism groups have played a role in its study. Finally, es- 
tablishing reconstructibility of certain functors is a useful tool in determining the 
automorphism groups of certain derived structures. 
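
As a small illustration of isomorphism rejection by invariants, the sketch below computes two elementary invariants, the sorted degree sequence and the number of triangles. Both function names are illustrative choices; neither invariant is complete, so equal values never prove isomorphism.

```python
def degree_sequence(n, edges):
    """Sorted degree sequence of a graph on the vertices 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return tuple(sorted(deg))

def triangle_count(n, edges):
    """Number of 3-element vertex sets spanning a triangle."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        1
        for a in range(n) for b in range(a + 1, n) for c in range(b + 1, n)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
    )

hexagon = [(i, (i + 1) % 6) for i in range(6)]
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]

# Same degree sequence, so this invariant does not separate the two graphs ...
print(degree_sequence(6, hexagon) == degree_sequence(6, two_triangles))
# ... but the triangle count does, hence the graphs are not isomorphic.
print(triangle_count(6, hexagon), triangle_count(6, two_triangles))
```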


1. Definitions, examples 


In this section, we collect some illustrative facts about automorphism groups of 
graphs and their interplay with reconstruction type problems. 

We start with the simplest examples. A graph and its complement have the same
automorphisms. The automorphism group of the complete graph K_n and the empty
graph K̄_n is the symmetric group S_n, and these are the only graphs with doubly
transitive automorphism groups. The automorphism group of the cycle of length
n is the dihedral group D_n (of order 2n); that of the directed cycle of length n
is the cyclic group Z_n (of order n). A path of length ≥ 1 has two automorphisms.
The automorphism group of a graph is determined by the automorphism groups
and the isomorphisms of its connected components: if X_1, ..., X_k are pairwise
nonisomorphic connected graphs, and X is the disjoint union of m_i copies of X_i,
i = 1, ..., k, then


Aut(X) ≅ (Aut(X_1) ≀ S_{m_1}) × ··· × (Aut(X_k) ≀ S_{m_k}). (1)


The wreath products occurring here realize their imprimitive action (cf. chapter 
12). 
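
The orders predicted by these examples and by formula (1) can be checked by exhaustive search on small graphs. The sketch below is a naive counter that runs through all n! permutations, so it is only an illustration; the helper name automorphism_count is an assumption of this example.

```python
from itertools import permutations
from math import factorial

def automorphism_count(n, edges):
    """Number of permutations of 0..n-1 that map every edge to an edge."""
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for p in permutations(range(n)):
        # A bijection sending edges to edges permutes the (finite) edge set.
        if all(frozenset((p[u], p[v])) in edge_set for u, v in edges):
            count += 1
    return count

cycle5 = [(i, (i + 1) % 5) for i in range(5)]
path4 = [(0, 1), (1, 2), (2, 3)]
# Two copies of the triangle K_3 and one copy of K_2 (8 vertices).
union = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (6, 7)]

print(automorphism_count(5, cycle5))   # order of the dihedral group D_5
print(automorphism_count(4, path4))    # a path has two automorphisms
# Order predicted by formula (1): |S_3 wr S_2| * |S_2 wr S_1|.
predicted = (6 ** 2 * factorial(2)) * (2 ** 1 * factorial(1))
print(automorphism_count(8, union), predicted)
```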


1.1. Measures of symmetry 


A graph is vertex-transitive if its automorphism group acts transitively on the set
of vertices. Such a graph is necessarily regular; the union of a 3-cycle and a 4-cycle
shows that the converse does not hold. If the group acts transitively on edges, the
graph is edge-transitive. A vertex-transitive graph need not be edge-transitive.
(Example: triangular prism.) If X is an edge-transitive graph without isolated vertices,
and X is not vertex-transitive, then it must be bipartite, with the group acting
transitively on each color class. The complete bipartite graphs K_{m,n} with m ≠ n show
that this can indeed happen. Regular graphs with edge- but not vertex-transitive
automorphism groups are not so easy to construct (cf. Folkman 1967, Bouwer
1969, 1972, Titov 1975, Klin 1981).
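
The two examples mentioned above (the triangular prism and K_{2,3}) are small enough to be checked by listing all automorphisms. This is a brute-force sketch with illustrative function names; it assumes the vertices are labeled 0..n-1.

```python
from itertools import permutations

def automorphisms(n, edges):
    """All permutations of 0..n-1 mapping every edge to an edge."""
    edge_set = {frozenset(e) for e in edges}
    return [p for p in permutations(range(n))
            if all(frozenset((p[u], p[v])) in edge_set for u, v in edges)]

def is_vertex_transitive(n, edges):
    """The orbit of vertex 0 under the full group is the whole vertex set."""
    return {p[0] for p in automorphisms(n, edges)} == set(range(n))

def is_edge_transitive(n, edges):
    """The orbit of the first edge under the full group is the whole edge set."""
    u, v = edges[0]
    orbit = {frozenset((p[u], p[v])) for p in automorphisms(n, edges)}
    return orbit == {frozenset(e) for e in edges}

# Triangular prism: two triangles 0-1-2 and 3-4-5 joined by a perfect matching.
prism = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (0, 3), (1, 4), (2, 5)]
# Complete bipartite graph K_{2,3} with color classes {0, 1} and {2, 3, 4}.
k23 = [(a, b) for a in (0, 1) for b in (2, 3, 4)]

print(is_vertex_transitive(6, prism), is_edge_transitive(6, prism))
print(is_vertex_transitive(5, k23), is_edge_transitive(5, k23))
```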

A flag in a graph X is an ordered pair (v,e) where v is a vertex and e is an 
edge incident with v. If Aut(X) is transitive on flags then X is flag-transitive. 
This means transitivity on the set of ordered pairs of adjacent vertices. For graphs 
without isolated vertices, flag-transitivity implies both vertex- and edge-transitivity. 
Again, the converse is false (cf. Holt 1981, Cameron 1983). If, however, X has odd 
degree, then vertex- and edge-transitivity imply flag-transitivity. 

A graph X is vertex-primitive if Aut(X) is a primitive group. Vertex-primitivity by
definition implies vertex-transitivity, but it does not imply edge-transitivity. (Take
a cycle of prime length p ≥ 7 and add all chords of length 2.) For a graph X, let
X_t denote the graph obtained by joining a pair of vertices of X if their distance
in X is t. If X is vertex-primitive and not empty then X_t is connected for every
t ≤ diam(X). In particular, nonempty bipartite graphs X of order ≥ 3 are never
vertex-primitive (since X_2 is disconnected).

A graph is distance-transitive if Aut(X) is transitive on the set of ordered pairs 
of vertices at distance t for every t ≤ diam(X). Nice examples are the Platonic
solids (fig. 1 in chapter 1, section 1), Heawood’s, Petersen’s, and Coxeter’s graphs 
(figs. 4, 8, 9 in chapter 1, sections 1 and 4). 

Vertex-primitivity is a very severe restriction on the automorphism group, as seen
by the following deep result previously known as the “Sims conjecture (1967)”. 


Theorem 1.1 (Cameron, Praeger, Saxl and Seitz 1983). There exists a function f
such that if a vertex-primitive digraph has out-degree k then the vertex-stabilizer
in the automorphism group has order ≤ f(k).


This result immediately implies that there is only a finite number of
vertex-primitive distance-transitive graphs of any fixed degree. However, this second
statement remains valid even without the vertex-primitivity condition (see section 5.2).




The automorphism group of a finite tournament T has odd order, since otherwise 
it would contain an involution (an element of order two), which would then illegally 
reverse at least one edge. This harmless looking observation implies, by the Feit— 
Thompson theorem, that Aut(T) is solvable, a fact with far reaching consequences, 
including algorithmic ones (cf. the end of section 6.6). Here we state an immediate 
corollary (cf. chapter 12 for the definitions). 


Proposition 1.2. Let T be a tournament with n vertices.
(a) If T is vertex-transitive then n is odd. 
(b) If T is vertex-primitive then n is an odd prime power. 


Proof. Part (a) is straightforward: the in- and out-degrees must be equal. As for 
part (b), let N be a minimal normal subgroup of Aut(T). Then N is transitive 
(since Aut(T) is primitive); it is abelian (since Aut(T) is solvable); and it is char- 
acteristically simple (i.e. the direct product of isomorphic simple groups) (since it 
is minimal). Therefore N ≅ Z_p^k (k ≥ 1, p prime). A transitive abelian group being
regular, we conclude that n = |N| = p^k. (We note that T is a Cayley digraph of
N.) □


While there is no hope to classify all flag-transitive graphs, a simple description 
of all edge-transitive tournaments exists. (For directed graphs, edge- and flag- 
transitivity mean the same.) Let q = p^k be an odd prime power, q ≡ −1 (mod 4).
The Paley tournament P(q) has the field GF(q) for its vertex set; an edge goes from
x to y (x ≠ y) if x − y is a square. The group of affine transformations x ↦ ax + b
(a, b ∈ GF(q), a ≠ 0 a square) acts transitively on the edges of P(q).
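
For a prime q the field GF(q) is just the integers modulo q, so the definition can be tested directly. The sketch below is restricted to that case (an assumption of this example; prime powers would need genuine finite field arithmetic, and primality of q is not checked). It verifies that P(q) is a tournament and that the affine maps with square multiplier act transitively on its edges.

```python
def paley_tournament(q):
    """Squares and arc set of P(q) for a prime q with q % 4 == 3.

    There is an arc x -> y iff x - y is a nonzero square modulo q.
    """
    if q % 4 != 3:
        raise ValueError("q must be congruent to 3 modulo 4")
    squares = {(a * a) % q for a in range(1, q)}
    arcs = {(x, y) for x in range(q) for y in range(q)
            if x != y and (x - y) % q in squares}
    return squares, arcs

def is_tournament(q, arcs):
    """Exactly one of (x, y), (y, x) is an arc for every pair x != y."""
    return all(((x, y) in arcs) != ((y, x) in arcs)
               for x in range(q) for y in range(x + 1, q))

def affine_images(q, squares, arc):
    """Images of an arc under all maps x -> a*x + b with a a nonzero square."""
    x, y = arc
    return {((a * x + b) % q, (a * y + b) % q) for a in squares for b in range(q)}

squares, arcs = paley_tournament(7)
print(is_tournament(7, arcs))
# Every such affine map preserves the arc set ...
print(all(affine_images(7, squares, arc) <= arcs for arc in arcs))
# ... and the orbit of the single arc (1, 0) is the whole arc set.
print(affine_images(7, squares, (1, 0)) == arcs)
```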


Theorem 1.3 (Kantor 1969).
(a) Every edge-transitive tournament with n > 2 vertices is Paley.
(b) Aut(P(q)) consists of the affine semilinear transformations x ↦ ax^α + b where
a, b ∈ GF(q), a ≠ 0 is a square, and α: x ↦ x^{p^j} (0 ≤ j ≤ k − 1) is an automorphism
of GF(q).


Proof. Let T be edge-transitive. Since n > 2, T must be vertex-primitive and
therefore n = p^k, p an odd prime. The stabilizer of a vertex x acts transitively on the
tournament induced by the (n − 1)/2 out-neighbors of x; hence n ≡ −1 (mod 4).
Let N be a minimal normal subgroup of G = Aut(T); then, as before, N can be
identified with the vertex set of T. Let τ: x ↦ x^{−1} (x ∈ N); then τ is an
antiautomorphism of T (reverses every edge). Therefore G has index 2 in the doubly
transitive group H = ⟨G, τ⟩. All solvable doubly transitive groups have been
determined by Huppert (cf. Huppert 1957); apart from a finite number of exceptions
of degrees 3^2, 5^2, 7^2, 11^2, 23^2, 3^4, they all are subgroups of the group ΓA_1(p^k) of
semiaffine (affine semilinear) transformations of GF(p^k). The exceptional cases
are ruled out because k must be odd (since n ≡ −1 (mod 4)). This, in particular,
proves part (b). Aut(P(p^k)) is the unique subgroup of index 2 in ΓA_1(p^k), hence
G ≤ Aut(P(p^k)). Since both G and Aut(P(p^k)) have rank 3 (cf. chapter 12), either
T or its converse agrees with P(p^k), which is self-converse. □

The rth residue digraph P(q, r) is defined for prime powers q and integers r ≥ 2
such that r | (q − 1). The vertex set of P(q, r) is GF(q); an edge joins x to y if
x − y is an rth power in GF(q). This digraph is undirected if either q or (q − 1)/r is
even. (The Paley graphs are the quadratic residue graphs (r = 2; q ≡ 1 (mod 4)).
The Clebsch graph is P(16, 3).) The affine linear group A_1(q) is flag-transitive
on P(q, r). It is not true in general that Aut(P(q, r)) is semiaffine; e.g., if q = q_0^2
and r = q_0 + 1 then P(q, r) is a disjoint union of cliques; if q = q_0^4 and r = q_0 +
1, then the neighbors of 0 form a quadric and the graph admits the orthogonal
group. However, the Paley graphs have semiaffine automorphism groups. This is
a consequence of the following theorem of Carlitz (1960) and McConnel (1963):
Let q be a prime power, r | q − 1, and let f be a map of GF(q) to itself such that for
every x, y ∈ GF(q), x ≠ y, the element (x − y)^{−1}(f(x) − f(y)) is an rth power. Then
f is semiaffine. (See also Bruen and Levinger 1973b.)

A stronger result holds when q is a prime. 


Theorem 1.4. If X is an edge-transitive regular graph of prime order p without
isolated vertices then X is either complete or an rth residue graph for some r | (p −
1)/2. In the latter case, Aut(X) ≤ A_1(p).


Proof. X cannot be bipartite (p is odd), hence it is vertex-transitive and (being a
Cayley graph of the abelian group Z_p, cf. Corollary 3.6), in fact, flag-transitive. Let
G = Aut(X) ≤ S_p. If G is not solvable, then it is doubly transitive (cf. Burnside
1911; cf. also Huppert 1967, p. 609), hence X is complete. If G is solvable then
G ≤ A_1(p) (Galois; see Huppert 1967, p. 163). A glance at the structure of A_1(p)
completes the proof. □


Graphs with higher degrees of symmetry will be discussed in section 5. Distance- 
transitive graphs have been defined above. We define another important class here. 

An s-arc in a graph is a sequence (x_0, ..., x_s) of vertices such that: (a) x_{i−1}
and x_i are adjacent; (b) x_{i−1} ≠ x_{i+1}. The graph X is s-arc-transitive if Aut(X)
acts transitively on the set of s-arcs. (Note: 1-arc-transitivity is the same as
flag-transitivity.) Distance-transitivity implies ⌊g/2⌋-arc-transitivity, where g is the girth.

Often we are interested in the action of some subgroup G ≤ Aut(X) on vertices,
edges, flags, etc. If this action is transitive (regular), we say that G is vertex- 
transitive (vertex-regular, resp.), etc., on X. 

Graphs with relatively low degrees of symmetry are easy to construct. Every 
Cayley graph (see section 2) is vertex transitive. There is an abundance of edge- 
transitive digraphs and even of 2-arc-transitive graphs, as indicated by the following 
result. A map f: (V, E) → (W, F) between two finite digraphs is a k-fold covering
if f is a homomorphism (maps vertices to vertices, edges to edges, and preserves
incidences); every vertex and edge of (W, F) has exactly k preimages; and f is a
local isomorphism, i.e. x and f(x) have the same indegree (out-degree, resp.) for
every x ∈ V.


Theorem 1.5 (Babai 1985). (a) Every finite regular digraph has infinitely many 
edge-transitive finite covering digraphs with the same number of connected com- 
ponents. 


(b) Every finite regular graph has infinitely many 2-arc-transitive finite covering
graphs with the same number of connected components. 


It follows by a result of Godsil (1982) that the minimal polynomial of every 
digraph divides that of an edge-transitive digraph, hence the adjacency matrices of 
infinitely many edge-transitive digraphs are not diagonalizable. 

Although graphs with higher symmetry are much more difficult to construct (cf.
section 4), covering graphs are helpful in moving from an isolated example to
infinitely many.


Theorem 1.6 (Biggs 1974, chapter 19). A finite connected s-arc-transitive graph has
infinitely many finite connected s-arc-transitive covering graphs. 


1.2. Reconstruction from line graphs 


We illustrate the point made in the last sentence of the introduction by a classical 
example. 


Theorem 1.7 (Whitney 1932). Connected graphs X with ≥ 5 vertices are strongly
reconstructible from their line graphs L(X) (within the class of all graphs).


(Whitney proved the result for finite graphs; it was extended to infinite graphs 
by Bednarek (1985), using Rado’s selection principle (chapter 42, section 3).) In
other words, every isomorphism L(X) → L(Y) is induced in the natural way by a
unique isomorphism X → Y (cf. Lovász 1979a, p. 507). This, in particular, means
that if the connected graph X has at least 5 vertices then Aut(X) ≅ Aut(L(X)).
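
The consequence Aut(X) ≅ Aut(L(X)) can at least be tested at the level of group orders on small graphs. The sketch below compares the number of automorphisms of a graph and of its line graph by exhaustive search. The example graphs (a 5-cycle with one chord, and K_4 as a graph below the 5-vertex threshold) and the function names are choices made for this illustration.

```python
from itertools import permutations

def automorphism_count(n, edges):
    """Number of permutations of 0..n-1 that map every edge to an edge."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        1 for p in permutations(range(n))
        if all(frozenset((p[u], p[v])) in edge_set for u, v in edges)
    )

def line_graph(edges):
    """Line graph: vertex i stands for edges[i]; i ~ j iff the edges share an endpoint."""
    m = len(edges)
    adjacent = [(i, j) for i in range(m) for j in range(i + 1, m)
                if set(edges[i]) & set(edges[j])]
    return m, adjacent

# Connected graph on 5 vertices: a 5-cycle with the chord {0, 2}.
house = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
# K_4 has only 4 vertices, so Theorem 1.7 does not apply to it.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

print(automorphism_count(5, house), automorphism_count(*line_graph(house)))
print(automorphism_count(4, k4), automorphism_count(*line_graph(k4)))
```

For K_4 the line graph is the octahedron, which has more automorphisms than K_4 itself, so the bound on the number of vertices cannot simply be dropped.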


Corollary 1.8. Let P denote the Petersen graph. 
(a) Aut(P) ≅ S_5.
(b) P is distance transitive and 3-arc-transitive.


Proof. The complement of P is L(K_5). □


One can generalize this result to the Kneser graphs KG(n, r) (n ≥ 2r + 1). Recall
that the vertex set of KG(n, r) is the set of r-subsets of an n-set; disjoint subsets
correspond to adjacent vertices.
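
The definition just recalled is easy to experiment with. The following sketch builds KG(n, r) and counts its automorphisms by a simple backtracking search; the Petersen graph is KG(5, 2). The function names are illustrative, and the search is a plain textbook backtracking routine, not an efficient isomorphism algorithm.

```python
from itertools import combinations

def kneser_graph(n, r):
    """Adjacency sets of KG(n, r): vertices are r-subsets, adjacent iff disjoint."""
    subsets = [frozenset(c) for c in combinations(range(n), r)]
    return {a: {b for b in subsets if not (a & b)} for a in subsets}

def count_automorphisms(adj):
    """Count bijections preserving adjacency and nonadjacency (backtracking)."""
    vertices = list(adj)

    def extend(images, used):
        i = len(images)
        if i == len(vertices):
            return 1
        v = vertices[i]
        total = 0
        for w in vertices:
            if w in used or len(adj[w]) != len(adj[v]):
                continue
            # The new pair (v -> w) must be consistent with all earlier choices.
            if all((vertices[j] in adj[v]) == (images[j] in adj[w]) for j in range(i)):
                total += extend(images + [w], used | {w})
        return total

    return extend([], frozenset())

petersen = kneser_graph(5, 2)
print(len(petersen), {len(nbrs) for nbrs in petersen.values()})
print(count_automorphisms(petersen))            # compare with |S_5| = 120
print(count_automorphisms(kneser_graph(6, 2)))  # compare with |S_6| = 720
```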


Proposition 1.9. 
(a) For n ≥ 2r + 1, Aut(KG(n, r)) ≅ S_n.
(b) KG(n, r) is distance-transitive.
(c) The “odd graph” O_k = KG(2k − 1, k − 1) is exactly 3-arc-transitive.


For the proof of part (a), we have to consider a reconstruction problem for 
hypergraphs. The line graph L(H) of the hypergraph H = (V, E) has vertex set
E; two members of E are adjacent in L(H) if they intersect. For a set A, let [A]^r
denote the complete r-uniform hypergraph on A, consisting of all r-subsets of A.
The Kneser graph KG(n, r) is the complement of L([A]^r) where |A| = n. Part (a)
of Proposition 1.9 is thus an immediate consequence of the next observation. 




Proposition 1.10 (Berge 1972, Fournier 1974). The complete r-uniform hypergraphs 
with at least 2r + 1 vertices are strongly reconstructible from their line graphs. 


Proof. By the Erdős-Ko-Rado theorem (see chapter 24), the largest cliques of
L([A]^r) are in one-to-one correspondence with the elements of A. This guarantees
that every isomorphism L([A]^r) → L([B]^r) is induced by a bijection A → B. □


This is a special case of the following sufficient condition of reconstructibility. 


Theorem 1.11 (Erdős and Füredi 1980). Let H be an r-uniform hypergraph on n
vertices. If n ≥ 2r + 1 and every vertex of H has degree greater than


v(n, r) = \binom{n-1}{r-1} - \binom{n-r-1}{r-1} + 1,


then H is strongly reconstructible from L(H).


The degree bound v(n, r) is tight for every r ≥ 2 and n > 2r^2. The quantity
v(n, r) comes from the Hilton-Milner theorem (chapter 24, Theorem 5.8). In the
particular case when all pairs of edges intersect in at most one point, the bound 
of Theorem 1.11 can be greatly improved. 


Theorem 1.12. Let H be an r-uniform hypergraph on n vertices such that every pair 
of edges intersects in at most one point. If every vertex of H has degree greater than 
r^2 − r + 1, then H is strongly reconstructible from L(H).


The proof follows immediately from Deza’s theorem (1973) (cf. Lovász 1979a,
problem 13.17): If every pair of edges of an r-uniform hypergraph H = (V, E) has
exactly λ points in common then either H is a sunflower (all edges have the same
λ points in common), or |E| ≤ r^2 − r + 1.


Corollary 1.13. Let S, S_1, and S_2 be Steiner triple systems of order > 15. Then:
(a) Aut(L(S)) ≅ Aut(S).
(b) If S_1 ≇ S_2 then L(S_1) ≇ L(S_2).


We shall use part (a) of this corollary to construct strongly regular graphs with 
arbitrary prescribed automorphism groups (Theorem 4.3). Part (b) implies the 
existence of a large number of isospectral graphs: nonisomorphic graphs with the 
same characteristic polynomial. The existence of such families shows that the char- 
acteristic polynomial, though a useful invariant of graphs, is far from complete. (A 
complete invariant F(X) is one from which X is (weakly) reconstructible.) 


Corollary 1.14. For infinitely many values of n, there exists a set of n^{n(1+o(1))/2}
isospectral graphs on n vertices. 


Indeed, the parameters of the strongly regular graph L(S) (chapter 15) and 
therefore its spectrum are uniquely determined by the number of vertices n = 
v(v — 1)/6, where v is the number of vertices of the Steiner triple system S. The 
estimate of the number of Steiner triple systems required, v^{v^2(1+o(1))/6}, is due to
Alekseiev (1974) and Wilson (1974), combined with Van der Waerden’s permanent
conjecture (now the theorem of Egorychev and Falikman, see chapter 22, section
16.1).

A more direct proof of Corollary 1.14 (also based on the Permanent conjecture) 
uses Latin square graphs (LSGs). The LSG associated with a k x k Latin square 
(LS) (chapter 14) has k^2 vertices corresponding to the cells of the Latin square;
two cells are adjacent in the graph if they are in the same row, or in the same 
column, or they have the same entry. For k > 5, the only k-cliques in an LSG 
are those corresponding to rows, columns, and identical entries. From this it is 
easy to deduce that (for k > 5) the LS is strongly reconstructible from its LSG. 
(Isomorphisms of Latin squares have to be defined carefully: row indices, column 
indices, and entries play interchangeable roles; so the automorphism group is a
subgroup of S_k ≀ S_3.)
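
The construction of a Latin square graph is short enough to write down explicitly. The sketch below builds the LSG of a given square and computes its strongly regular parameters (number of vertices, degree, and the numbers of common neighbors of adjacent and of nonadjacent pairs). The two 4 × 4 squares used (the addition tables of Z_4 and of Z_2 × Z_2) and the function names are choices made for this example.

```python
def latin_square_graph(square):
    """Adjacency sets of the LSG: cells are adjacent iff they share a row,
    a column, or an entry."""
    k = len(square)
    cells = [(i, j) for i in range(k) for j in range(k)]
    adj = {c: set() for c in cells}
    for (i, j) in cells:
        for (a, b) in cells:
            if (i, j) != (a, b) and (i == a or j == b or square[i][j] == square[a][b]):
                adj[(i, j)].add((a, b))
    return adj

def srg_parameters(adj):
    """Return (n, k, lambda, mu) if the graph is strongly regular, else None."""
    vertices = list(adj)
    degrees = {len(adj[v]) for v in vertices}
    lam = {len(adj[u] & adj[v]) for u in vertices for v in adj[u]}
    mu = {len(adj[u] & adj[v]) for u in vertices for v in vertices
          if v != u and v not in adj[u]}
    if len(degrees) == len(lam) == len(mu) == 1:
        return len(vertices), degrees.pop(), lam.pop(), mu.pop()
    return None

cyclic4 = [[(i + j) % 4 for j in range(4)] for i in range(4)]   # table of Z_4
klein4 = [[i ^ j for j in range(4)] for i in range(4)]          # table of Z_2 x Z_2
cyclic5 = [[(i + j) % 5 for j in range(5)] for i in range(5)]   # table of Z_5

print(srg_parameters(latin_square_graph(cyclic4)))
print(srg_parameters(latin_square_graph(klein4)))
print(srg_parameters(latin_square_graph(cyclic5)))
```

The parameters depend only on k, which is the reason Latin square graphs of the same order share their spectrum.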


1.3. Automorphism groups: reduction to 3-connected graphs 


Probably the first nontrivial class of graphs of which the automorphism groups 
have been studied are finite trees (Jordan 1869). The first observation is that every 
tree has a center, which is either a vertex or an edge and is fixed under every 
automorphism. This reduces the problem to rooted trees (the root is fixed by 
definition). Automorphism groups of rooted trees can be determined recursively: 
delete the root, designate its neighbors to be roots of the remaining branches,
and apply formula (1) to the forest of rooted trees obtained. The conclusion is as
follows.


Proposition 1.15 (Jordan 1869). The finite group G is isomorphic to the
automorphism group of a finite tree if and only if G ∈ 𝒲, where the class 𝒲 of finite groups
is defined inductively as follows:

(a) {1} ∈ 𝒲;

(b) if G, H ∈ 𝒲 then G × H ∈ 𝒲;

(c) if G ∈ 𝒲 and m ≥ 2 then G ≀ S_m ∈ 𝒲.


In fact, not only the abstract group structure but the permutation action of the 
automorphism groups of trees can be deduced from these considerations. 
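
The recursive procedure described above translates directly into a computation of the order of the automorphism group of a tree. The sketch below finds the center by repeatedly stripping leaves, roots the tree there, and multiplies the contributions |Aut(B)|^m · m! for each isomorphism class of m branches B, as in formula (1). The parenthesis encoding of rooted trees and all function names are assumptions of this example; only the group order is computed, not the group itself.

```python
from collections import Counter
from math import factorial

def tree_from_edges(edges):
    """Adjacency sets of a tree given by its edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def tree_centers(adj):
    """The one or two central vertices, found by stripping leaves layer by layer."""
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    leaves = [v for v in adj if degree[v] <= 1]
    while len(remaining) > 2:
        next_leaves = []
        for leaf in leaves:
            remaining.discard(leaf)
            for w in adj[leaf]:
                if w in remaining:
                    degree[w] -= 1
                    if degree[w] == 1:
                        next_leaves.append(w)
        leaves = next_leaves
    return sorted(remaining)

def rooted_aut_order(adj, root, parent=None):
    """(canonical form, |Aut|) of the subtree rooted at `root`."""
    branches = [rooted_aut_order(adj, c, root) for c in adj[root] if c != parent]
    order = 1
    for (_, branch_order), mult in Counter(branches).items():
        order *= branch_order ** mult * factorial(mult)   # wreath product with S_mult
    form = "(" + "".join(sorted(f for f, _ in branches)) + ")"
    return form, order

def tree_aut_order(adj):
    """Order of the automorphism group of a finite tree."""
    centers = tree_centers(adj)
    if len(centers) == 1:
        return rooted_aut_order(adj, centers[0])[1]
    a, b = centers
    form_a, order_a = rooted_aut_order(adj, a, b)
    form_b, order_b = rooted_aut_order(adj, b, a)
    # The central edge can be flipped iff its two sides are isomorphic.
    return order_a * order_b * (2 if form_a == form_b else 1)

star = tree_from_edges([(0, 1), (0, 2), (0, 3)])
binary = tree_from_edges([(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)])
# Branches of lengths 1, 2, 3 attached to vertex 0: an asymmetric tree.
spider = tree_from_edges([(0, 1), (0, 2), (2, 3), (0, 4), (4, 5), (5, 6)])
print(tree_aut_order(star), tree_aut_order(binary), tree_aut_order(spider))
```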

Using the block-cutpoint tree T of a 1-connected graph X, similar considerations 
reduce the determination of Aut(X) to the automorphism groups of its blocks via 
a slight generalization of wreath products. If the root of T is a cutpoint, we split
it and combine, via eq. (1), the groups of the (rooted) components. If the root is 
a block, we assign colors to the vertices of that block to indicate the isomorphism 
type of the incident branch; apply an arbitrary color preserving automorphism 
to the block, and move the branches in a wreath-product-like fashion (Robinson 
1970). 

A canonical decomposition of 2-connected graphs to their 3-connected “compo- 
nents” also exists. 
We briefly indicate the idea. Let us call a multigraph basic if it is either
3-connected or a cycle or it has just two vertices and a set of > 2 parallel edges
between them. A “bipolar multigraph” is a multigraph with two distinct specified 
endpoints. Call a bipolar multigraph basic if it becomes a basic multigraph after 
adding a new edge joining the two endpoints. 

Let us now take a basic graph, and repeat the following construction: simulta- 
neously replace every edge by a basic bipolar multigraph. 

The result is that every 2-connected graph arises in a canonical way in this 
manner. Canonicity means that all isomorphisms between two 2-connected graphs 
induce isomorphisms of each corresponding level of this construction (and in par- 
ticular it induces an isomorphism of the rooted trees representing the hierarchy of 
the basic graphs used). 

Such a canonical hierarchy of basic graphs is referred to as the decomposition 
to 3-connected components. 

A generalization of wreath products (Babai 1975) allows a description of the 
automorphism group of a 2-connected graph in terms of the automorphism groups 
of its 3-connected components with the edges of these components colored and 
oriented appropriately. 

A very efficient (linear time) algorithm for the canonical decomposition to
3-connected components was given by Hopcroft and Tarjan using depth-first search
(Hopcroft and Tarjan 1973); a parallel algorithm was found by Miller and Ra- 
machandran (1992). 

Problems of great depth arise in the study of the automorphism groups of infinite 
trees. Tits (1970) studied the full automorphism groups of (vertex-colored) trees. 
Groups acting on trees without inverting an edge have been characterized by 
H. Bass and J.-P. Serre. This theory will be touched upon in sections 3.7 and 3.11. 


1.4. Automorphism groups of planar graphs 


Finite planar graphs form one of the few comparatively rich classes of graphs of 
which the automorphism groups have been satisfactorily determined, both from the 
algebraic (Babai 1975) and the algorithmic (Hopcroft and Tarjan 1972; Hopcroft 
and Wong 1974) points of view. 

Every finite group of isometries of the euclidean 3-space has a fixed point and 
can therefore be identified with a group of isometries of the 2-sphere. Every sense- 
preserving transformation is a rotation, and every sense-reversing transformation 
is a rotary inversion, i.e. a rotation followed by a central inversion. 

There are two infinite families and 3 sporadic examples of finite rotation groups 
of the 2-sphere: the rotation groups of the regular k-gonal pyramids (the cyclic
group Z_k), the regular k-gonal prisms (the dihedral group D_k), the tetrahedron
(the alternating group A_4), the cube (S_4), and the dodecahedron (A_5) (see the
figures in chapter 1, section 1). The list is understood to include the degenerate
cases k ≤ 2.

The finite isometry groups of the 2-sphere, other than the rotation groups, can be 
obtained in one of two ways as follows. Each rotation group G can be extended 
to G ∪ Gτ ≅ G × Z_2, where τ = −1 is the central inversion. Moreover, if G is a
rotation group with a subgroup H of index 2, then the group G* = H ∪ (G \ H)τ
is another isometry group. Note that G* ≅ G, but the geometric realization is
different: for instance, from the rotation group of the cube we obtain the full
isometry group of the tetrahedron. (See, e.g., Fejes Tóth 1965, Coxeter 1961.)


Theorem 1.16. Every 3-connected planar graph X has an embedding on the sphere 
such that all automorphisms are realized by isometries of the sphere. 


This is a consequence of Whitney’s (1932) theorem that 3-connected planar 
graphs are uniquely embeddable on the sphere (cf. chapter 2), combined with 
the fact that all finite homeomorphism groups of the sphere are topologically 
equivalent to a group of isometries (Kerékjártó 1921, Eilenberg 1934). A stronger
version of Theorem 1.16 was obtained by P. Mani. 


Theorem 1.17 (Mani 1971). Every 3-connected planar graph X can be realized as 
the 1-skeleton of a convex polytope P in R^3 such that all automorphisms of X are
induced by isometries of P. 


Polyhedral groups are the isometry groups of convex polytopes and their
subgroups. Viewed in their action on R^3, they coincide with the finite isometry groups
listed above. Either of the above results, combined with the reduction process
indicated in the previous section, yields a description of the automorphism groups
of planar graphs in terms of generalized wreath products of symmetric groups and
polyhedral groups. Two easily stated consequences: If X is planar then Aut(X) has
a subnormal chain Aut(X) = G_0 ▷ G_1 ▷ ··· ▷ G_m = 1 such that each quotient group
G_{i−1}/G_i is either cyclic or symmetric or A_5. If X is 2-connected and |Aut(X)| is
odd then Aut(X) is cyclic (Babai 1975). We conjecture that the first of these
statements remains valid for graphs embeddable on an arbitrary fixed surface Σ (cf.
chapter 5) with A_5 replaced by a finite list, depending on Σ (cf. Babai 1973, 1974a).


1.5. Matrix representation. Eigenvalue multiplicity 


A mechanical system is often represented by a self-adjoint operator A; and its 
symmetries by a group G of unitary operators (acting on a real or complex Hilbert 
space H). The fact of symmetry is expressed by the equation AP = PA for each
P ∈ G. If H has finite dimension (or more generally, the spectrum of A is discrete), then
it is the orthogonal direct sum of the eigensubspaces H_λ = {u ∈ H : Au = λu} for
all eigenvalues λ of A.

If the operators B and C commute, then the eigensubspaces of B are invariant 
subspaces for C. In particular, one can refine the decomposition H = ⊕_λ H_λ to
an orthogonal decomposition into subspaces, irreducible under the action of G.
This way each irreducible constituent of G falls into an eigensubspace, forcing
“degeneracies” (multiple eigenvalues) to occur, and more importantly, the vectors
in an orthonormal base of H are classified according to the irreducible constituents
of G. This approach, introduced in a seminal paper by Wigner (1927), has been
used extensively both in classical and in quantum mechanics (cf. Wigner 1959,
Hamermesh 1962). The classification of eigenvibrations of molecules using the
character tables of their symmetry groups (also due to Wigner 1930, cf. Schonland
1965) is particularly instructive because in this case dim H < ∞ and the matrix A
is a variant of the “adjacency matrix of the molecule”.

Let now X denote a graph with edges weighted with real numbers; and let A 
be its adjacency matrix; so the entry a_{ij} is the weight of the edge {i, j}. Then A
is a symmetric real matrix which acts on the space H = R^n (n is the number of
vertices). The automorphisms of X are represented by precisely those permutation
matrices P which commute with A.
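
The correspondence between automorphisms and commuting permutation matrices can be verified directly on a small graph. The sketch below uses plain nested lists as matrices; the convention that P maps the basis vector e_j to e_{p(j)} and the function names are assumptions of this example.

```python
from itertools import permutations

def adjacency_matrix(n, edges):
    """Symmetric 0/1 adjacency matrix of a graph on the vertices 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a

def permutation_matrix(p):
    """Matrix P with P e_j = e_{p[j]}: column j has its single 1 in row p[j]."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commuting_permutations(a):
    """Permutations p whose matrix P satisfies PA = AP."""
    result = []
    for p in permutations(range(len(a))):
        pm = permutation_matrix(p)
        if matmul(pm, a) == matmul(a, pm):
            result.append(p)
    return result

def edge_preserving_permutations(n, edges):
    """Automorphisms computed combinatorially, for comparison."""
    edge_set = {frozenset(e) for e in edges}
    return [p for p in permutations(range(n))
            if all(frozenset((p[u], p[v])) in edge_set for u, v in edges)]

c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
a = adjacency_matrix(4, c4)
print(len(commuting_permutations(a)))
print(commuting_permutations(a) == edge_preserving_permutations(4, c4))
```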

Reversing Wigner’s approach, we shall indicate how to use spectral information 
on A to infer properties of the group G = Aut(X). Let λ_1, ..., λ_t be the
eigenvalues of A; let m_i be the multiplicity of λ_i (∑ m_i = n). Let G_i denote the restriction
of G to the eigensubspace H_{λ_i}. Then G is a subdirect product of the G_i. (A
subdirect product is a subgroup of the direct product which projects onto each factor.)
This proves part (a) of the following result. 


Theorem 1.18. Let G = Aut(X) for an edge-weighted graph X with eigenvalue
multiplicities m_1 ≤ ··· ≤ m_t.

(a) (Godsil 1978) G is the subdirect product of groups G_1, ..., G_t, where G_i is a
subgroup of the orthogonal group O(m_i);

(b) (Godsil 1978) |G_i| ≤ n^{m_i};

(c) (Godsil 1978) if X is vertex-primitive then |G| ≤ n^{m_2};

(d) (Babai, Grigoryev and Mount 1982) if X is vertex-transitive then |G| ≤ n^{m_t};
and more generally, the restriction of G to any of its orbits has order ≤ n^{m_t}.


To see part (b), let S be the projection of the standard basis of H = R^n to H_{λ_i},
and let S′ ⊆ S be a base of H_{λ_i}. Then each member of G_i is determined by its
restriction to S′, which is a map S′ → S. The number of such maps is ≤ n^{m_i}. Part
(c) follows by observing that the projection of V to each eigensubspace defines an
invariant partition of V. Hence if X is vertex-primitive of degree d > 1 then this
partition must be trivial and G acts faithfully on each eigensubspace of dimension
≠ 1. But the only one-dimensional eigensubspace of a vertex-transitive graph is
the one corresponding to λ = d. Part (d) is less immediate; an algorithmic version
of it is used in Babai et al. (1982) to deduce an n^{4m+O(1)} algorithm for testing
isomorphism of graphs with eigenvalue multiplicity bounded by m. 

Part (a), too, has some appealing consequences. Let m = m_t be the maximum
multiplicity of eigenvalues. Noting that O(1) ≅ Z_2, we see that if all eigenvalues of
a graph are distinct, then its automorphism group is an elementary abelian 2-group
(Mowshowitz, Petersdorf and Sachs 1969, cf. Cvetković et al. 1980). Further, if m ≤
3 then Aut(X) is solvable. This is immediate for m = 2 since every finite subgroup
of O(2) is cyclic or dihedral; but among the finite subgroups of O(3), there are
two nonsolvable ones: the group of rotations of the icosahedron (≅ A_5) and its
full group of congruences (≅ A_5 × Z_2). These were ruled out by Cameron (1983)
via a closer look at the characters of A_5.

Our last remark concerns factors of the characteristic polynomial. Let G ≤
Aut(X) for some weighted digraph X and consider the weighted quotient graph
Y = X/G. The vertices of X/G are the orbits of G; the weight of the directed
edge (A, B) of Y is the sum of weights of all edges (u, v) for some fixed u ∈ A
over all v ∈ B. It is easy to see that the characteristic polynomial of Y divides that
of X. In particular, if the characteristic polynomial of a digraph X is irreducible
then X is asymmetric (|Aut(X)| = 1) (Mowshowitz, cf. Cvetković et al. 1980).
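
The divisibility statement can be checked on small examples with exact arithmetic. In the sketch below the orbits are supplied by hand (they must be the orbits of some subgroup of Aut(X); this is an assumption the code does not verify), the characteristic polynomial is computed by the Faddeev-LeVerrier recursion, and divisibility is tested by polynomial long division. All function names are illustrative.

```python
from fractions import Fraction

def char_poly(m):
    """Coefficients [1, c_1, ..., c_n] of det(xI - M), highest degree first,
    computed by the Faddeev-LeVerrier recursion with exact fractions."""
    n = len(m)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    aux = identity
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        prod = [[sum(m[i][l] * aux[l][j] for l in range(n)) for j in range(n)]
                for i in range(n)]
        c = -sum(prod[i][i] for i in range(n)) / k
        coeffs.append(c)
        aux = [[prod[i][j] + c * identity[i][j] for j in range(n)] for i in range(n)]
    return coeffs

def poly_remainder(num, den):
    """Remainder of num modulo den (coefficient lists, highest degree first)."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)   # the leading coefficient is now zero
    return num

def quotient_matrix(a, orbits):
    """Entry (A, B): sum of the weights a[u][v] over v in B, for a fixed u in A."""
    return [[sum(a[orb_a[0]][v] for v in orb_b) for orb_b in orbits]
            for orb_a in orbits]

# Star K_{1,3} with center 0; the orbits of Aut are {0} and {1, 2, 3}.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
star_quotient = quotient_matrix(star, [[0], [1, 2, 3]])
# Cycle C_4; its automorphism group is transitive, giving a single orbit.
c4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
c4_quotient = quotient_matrix(c4, [[0, 1, 2, 3]])

print(star_quotient, char_poly(star_quotient), char_poly(star))
print(poly_remainder(char_poly(star), char_poly(star_quotient)))
print(poly_remainder(char_poly(c4), char_poly(c4_quotient)))
```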


1.6. Asymmetry, rigidity. Almost all graphs. Unlabelled counting 


An excellent exposition of the subject of this section is given by Bollobás (1985,
chapter IX). 

A graph is called asymmetric if it has no nontrivial automorphisms; it is called 
rigid if it has no nontrivial endomorphisms. (Some authors use the term “rigid”
to describe what we call asymmetric.) Construction of asymmetric or rigid graphs 
and other structures with given properties is often the basis of the construction of 
such structures with given automorphism group or endomorphism monoid, resp. 
(Cf. section 4.1.) A notable result in this area is that there exists a rigid graph 
on every infinite vertex set (Vopěnka et al. 1965, cf. Hedrlín and Lambek 1969).
Finite rigid graphs exist on n vertices for any n ≥ 10 (Hedrlín and Pultr 1965);
asymmetric graphs exist for n ≥ 6. Asymmetry/rigidity is actually the typical
behavior of finite graphs. It was proved by Pólya (1937), Erdős and Rényi (1963) that
a random graph is asymmetric with probability 1 − \binom{n}{2} 2^{-(n-2)} (1 + o(1)). The
dominant part of the error-term comes from the graphs which admit a transposition
automorphism (a pair P of vertices with identical neighborhood outside P). The 
asymptotic expansion can be continued to include terms describing the probabili- 
ties of automorphisms with bounded supports. A strong algorithmic version of this 
result will be mentioned in section 6.4. 
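
For very small n the existence statement for asymmetric graphs can be confirmed by exhaustive search. The sketch below checks all labeled graphs on 2 to 5 vertices and one explicit graph on 6 vertices; the particular 6-vertex example and the function names are choices made for this illustration, and the search is far too slow for larger n.

```python
from itertools import combinations, permutations

def is_asymmetric(n, edges):
    """True if the identity is the only automorphism."""
    edge_set = {frozenset(e) for e in edges}
    identity = tuple(range(n))
    for p in permutations(range(n)):
        if p != identity and all(frozenset((p[u], p[v])) in edge_set
                                 for u, v in edges):
            return False
    return True

def asymmetric_graph_exists(n):
    """Search all labeled graphs on n vertices for an asymmetric one."""
    pairs = list(combinations(range(n), 2))
    for size in range(len(pairs) + 1):
        for edges in combinations(pairs, size):
            if is_asymmetric(n, edges):
                return True
    return False

# A path 0-1-2-3-4 with an extra vertex 5 joined to 1 and 2.
example = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (2, 5)]

print([asymmetric_graph_exists(n) for n in range(2, 6)])
print(is_asymmetric(6, example))
```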

It is not difficult to upgrade the proof to yield that almost all graphs are rigid. In 
this case the error term is O(n^2 (3/4)^n), dominated by the possibility that the
neighborhood of some vertex v includes the neighborhood of some vertex w, allowing
an endomorphism to map w to v while fixing all other vertices. 

Although n-vertex asymmetric trees exist for every n ≥ 7, random trees are
typically not asymmetric. Indeed for any finite rooted tree T, almost all labeled
trees have T as a limb (Schwenk 1973). In particular, large numbers of cherries 
(pairs of end-vertices with a common neighbor) occur almost always. 

Nontrivial trees (and more generally, bipartite graphs, and indeed perfect graphs) 
are never rigid (they can be mapped to their largest clique). 

E. M. Wright refined the “almost sure asymmetry” results to show that asymmetry
is typical for graphs with density above the connectedness threshold (cf. chapter
6).


Theorem 1.19 (Wright 1971). Let m(n) = ½ n ln n + nψ(n). Then the probability that
a random graph with n vertices and m(n) edges is asymmetric tends to 1 if ψ(n) → ∞,
assuming m(n) ≤ ½ \binom{n}{2}; and this probability tends to 0 if ψ(n) → −∞.

The reason for the second statement is obvious: those graphs have, with
probability approaching 1, an unbounded number of isolated vertices. If we rule out this
possibility, even sparser graphs will be typically asymmetric: for fixed r ≥ 3, the
probability that a random r-regular graph is asymmetric tends to 1 (Bollobás 1982,
McKay and Wormald 1984, Wormald 1986). 

The results establishing “almost always asymmetry” mentioned above are valid 
for labeled as well as for unlabeled graphs; the latter is a substantially stronger 
statement with important consequences to counting unlabeled objects. We shall 
formalize the connection below. 

Let $\mathcal{C}$ be a class of finite graphs (or digraphs, or other structures), closed
under isomorphisms, and let $\mathcal{C}(n)$ be the set of those members of $\mathcal{C}$ with vertex
set $[n] = \{1, \dots, n\}$. Let $\mathcal{P}$ be a graph property (i.e., an isomorphism-closed class
of graphs). We say that “almost all labeled members of $\mathcal{C}$ have property $\mathcal{P}$” if
$\lim_{n\to\infty} |\mathcal{P} \cap \mathcal{C}(n)| / |\mathcal{C}(n)| = 1$. The term “almost all unlabeled members of $\mathcal{C}$” is
used analogously except that isomorphism classes rather than individual graphs
are counted. This annoying distinction disappears if almost all unlabeled members
of $\mathcal{C}$ are asymmetric: under this condition, any graph property will hold for almost
all unlabeled members of $\mathcal{C}$ if and only if it holds for almost all labeled members.

The statement that “almost all unlabeled members of $\mathcal{C}$ are asymmetric” is
equivalent to the following:

“the expected number of automorphisms of a random
labeled member of $\mathcal{C}$ is $1 + o(1)$.”                    (2)


This equivalence follows from the observation that the number of unlabeled
graphs (isomorphism classes) in $\mathcal{C}(n)$ is exactly $|\mathcal{C}(n)|\, a(n)/n!$, where
$a(n) = \sum_{X \in \mathcal{C}(n)} |\mathrm{Aut}(X)| / |\mathcal{C}(n)|$ is the expected order of the automorphism group of a
random labeled member of $\mathcal{C}$. (This follows from the Orbit counting lemma, a.k.a.
“Burnside’s lemma”, see chapter 21, Lemma 14.3.)
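The counting identity can be checked by brute force for very small n. The following is a minimal sketch, taking the class to be all graphs on [n]; the function name `count_unlabeled_graphs` is an illustrative assumption, not notation from the text. It sums |Aut(X)| over all labeled graphs and divides by n!.

```python
from itertools import combinations, permutations
from math import factorial

def count_unlabeled_graphs(n):
    """Return (number of isomorphism classes, a(n)) for all graphs on n labeled vertices.

    a(n) is the average order of the automorphism group of a labeled graph.
    Brute force: only sensible for n <= 5.
    """
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    total_aut = 0
    for mask in range(1 << len(pairs)):
        edges = {pairs[i] for i in range(len(pairs)) if (mask >> i) & 1}
        for p in perms:
            # A permutation mapping every edge to an edge is an automorphism,
            # because it acts injectively on the finite edge set.
            if all(tuple(sorted((p[u], p[v]))) in edges for u, v in edges):
                total_aut += 1
    labeled = 1 << len(pairs)
    return total_aut // factorial(n), total_aut / labeled
```

For n = 4 this gives 11 isomorphism classes with a(4) = 4.125, far from 1 + o(1); the asymptotic statement (2) only takes effect for large n.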

By the results mentioned, (2) is valid for the class of all graphs, for graphs with
m(n) edges as in Wright’s theorem ($s(n) \to \infty$), as well as for regular graphs of given
degree $r \ge 3$.

Structures satisfying stronger regularity constraints are often difficult to count. It 
seems likely, for instance, that almost all strongly regular graphs are asymmetric, 
but this may be difficult to prove. It has been shown, however, that almost all 
(unlabeled) members of the following two classes of strongly regular graphs are 
asymmetric: the line graphs of Steiner triple systems, and the Latin square graphs 
(cf. section 4.1) (Cameron 1979, Babai 1979a). 

While almost all graphs are asymmetric, one might be interested in what can be 
said about the graphs known to admit some automorphisms. Related questions will 
be considered in sections 4.3 and 4.4; here we mention a result of Cameron (1980b). 


Theorem 1.20 (Cameron). For a finite group G let $\mathcal{C}(G)$ be the class of those graphs
X admitting a group isomorphic to G as a subgroup of Aut(X). Let $a_n(G)$ denote
the proportion of those n-vertex labeled members of $\mathcal{C}(G)$ which have $\mathrm{Aut}(X) \cong G$.
Then:

(i) the limit $a(G) := \lim_{n\to\infty} a_n(G)$ always exists and is rational.
(ii) $a(G) = 1$ iff G is the direct product of symmetric groups.
(iii) For infinitely many groups, including all abelian groups with exponent $\ge 3$,
$a(G) = 0$.
(iv) For metabelian groups, the values of a(G) are dense in [0, 1].


While almost all finite graphs are asymmetric, the situation changes to its
opposite when we consider countably infinite graphs. Let us generate a random graph
on a countably infinite vertex set by deciding independently and with probability
1/2 whether or not to join two vertices. Then with probability 1, we obtain a graph
isomorphic to one specific graph R, the Rado graph (Erdős and Rényi 1963),
discussed in section 5.3. We should mention that $|\mathrm{Aut}(R)| = 2^{\aleph_0}$, and “almost all”
automorphisms of R are conjugates (cf. Theorem 3.17).

More generally, the number of automorphisms of a countable graph (or any
countable structure over a locally finite language, cf. section 5.3) is always finite,
countable, or $2^{\aleph_0}$. A countable graph (structure) X has $2^{\aleph_0}$ automorphisms if and
only if every finite subset of V(X) is pointwise fixed by some nontrivial
automorphism.


2. Graph products 


In this section we introduce the most important graph products, indicate their com- 
binatorial significance, and address their automorphism and factoring problems. 

Given two graphs $X_i = (V_i, E_i)$ $(i = 1, 2)$, a product graph $Y = (W, F) = X_1 * X_2$
can be defined in a variety of sensible ways. The four which appear most
frequently in the literature are the lexicographic, the Cartesian, the categorical,
and the strong products. In each case, $W = V_1 \times V_2$ (Cartesian product). Each of
the products is associative, and three of the four are commutative in the sense that
the map $(v_1, v_2) \mapsto (v_2, v_1)$ is an isomorphism between $X_1 * X_2$ and $X_2 * X_1$. (The
lexicographic product is not commutative.) The 1-vertex graph is a (two-sided)
identity in three cases (exception: the categorical product; in that case it is natural
to admit loops and the one-vertex graph with a loop becomes the identity). We
say that a graph P is a prime with respect to a product and a class $\mathcal{C}$ of graphs if
P is not isomorphic to the product of two non-identity graphs within $\mathcal{C}$ and is not
itself the identity.

Next we define the adjacency relation in each product. Let $u_i, v_i \in V_i$
$(i = 1, 2)$ and $w = (u_1, u_2)$, $z = (v_1, v_2) \in W$. Then w and z are adjacent (a) in the
lexicographic product $Y = X_1[X_2]$ if either $(u_1, v_1) \in E_1$ and $u_1 \ne v_1$, or $u_1 = v_1$
and $(u_2, v_2) \in E_2$; (b) in the Cartesian product $Y = X_1 \times X_2$ if either $u_1 = v_1$ and
$(u_2, v_2) \in E_2$ or $u_2 = v_2$ and $(u_1, v_1) \in E_1$; (c) in the categorical product $Y = X_1 \cdot X_2$
if $(u_1, v_1) \in E_1$ and $(u_2, v_2) \in E_2$; (d) the edge set of the strong product $X_1 \otimes X_2$ is
the union of the edge sets of the Cartesian and the categorical products.
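The four adjacency rules translate directly into code. The sketch below assumes simple loopless graphs stored as a vertex list plus a set of two-element frozensets; the helper names `make_graph` and `graph_product` are illustrative choices, not notation from the text.

```python
from itertools import product

def make_graph(vertices, edges):
    """Simple undirected graph as (vertex list, set of frozenset edges)."""
    return list(vertices), {frozenset(e) for e in edges}

def graph_product(X1, X2, kind):
    """Product of two simple graphs; kind is one of
    'lexicographic', 'cartesian', 'categorical', 'strong'."""
    (V1, E1), (V2, E2) = X1, X2
    W = list(product(V1, V2))
    F = set()
    for (u1, u2) in W:
        for (v1, v2) in W:
            if (u1, u2) == (v1, v2):
                continue
            a1 = frozenset((u1, v1)) in E1  # u1, v1 adjacent in X1
            a2 = frozenset((u2, v2)) in E2  # u2, v2 adjacent in X2
            cart = (u1 == v1 and a2) or (u2 == v2 and a1)
            cat = a1 and a2
            if kind == "lexicographic":
                adjacent = a1 or (u1 == v1 and a2)
            elif kind == "cartesian":
                adjacent = cart
            elif kind == "categorical":
                adjacent = cat
            elif kind == "strong":
                adjacent = cart or cat
            else:
                raise ValueError(kind)
            if adjacent:
                F.add(frozenset(((u1, u2), (v1, v2))))
    return W, F

K2 = make_graph([0, 1], [(0, 1)])
P3 = make_graph([0, 1, 2], [(0, 1), (1, 2)])
# The 3-cube as an iterated Cartesian product of K2.
cube = graph_product(graph_product(K2, K2, "cartesian"), K2, "cartesian")
```

Comparing `graph_product(P3, K2, "lexicographic")` with `graph_product(K2, P3, "lexicographic")` shows different edge counts, a small instance of the non-commutativity of the lexicographic product.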

Observe that the n-cube is the Cartesian product of n copies of $K_2$; more
generally, Cartesian products of paths are grids. Hamming graphs can be defined as
isometric subgraphs of Cartesian products of complete graphs (cf. Graham and
Winkler 1985).

Categorical products are the products in the category theoretic sense. They give 
rise to some of the deepest structural questions (cf. McKenzie 1971, Jonsson 1981). 
Strong products, their close relatives, are tamer in many ways. 

Lexicographic products occur naturally in combinatorial constructions; we shall
mention examples below.

Some graph invariants of certain products are easily computed from those of
the factors; others pose important open questions. We mention two of the
latter kind. The first one concerns the chromatic number chr(Y) of the
categorical product $Y = X_1 \cdot X_2$. Since Y has a homomorphism to each factor, clearly
$\mathrm{chr}(Y) \le \min\{\mathrm{chr}(X_1), \mathrm{chr}(X_2)\}$. Hedetniemi’s conjecture asserts that for finite
graphs we have equality here (cf. Greenwell and Lovász 1974). (This is false for
uncountably infinite graphs, Hajnal 1985.)

The second problem concerns the independence number $\alpha(Y)$ of the strong
product $Y = X_1 \otimes X_2$. Clearly $\alpha(Y) \ge \alpha(X_1)\alpha(X_2)$ (supermultiplicativity). Let
$X^k = X \otimes \cdots \otimes X$ denote the kth strong power of the graph X. Supermultiplicativity
implies that the limit $\Theta(X) = \lim_{k\to\infty} (\alpha(X^k))^{1/k}$ always exists; this quantity is the
Shannon capacity of X (chapter 31, section 6; cf. Knuth 1994). Its value is unknown
even for as simple a graph as $C_7$, the cycle of length 7. Even the case of $C_5$ was
open for decades; it was solved by Lovász as a special case of the following result:
If X is a vertex-transitive self-complementary graph with n vertices then $\Theta(X) = \sqrt{n}$
(Lovász 1979b). (This class includes the Paley graphs (section 1.1).)

Cartesian products of cycles occur as Cayley graphs of abelian groups. Their 
genus has been studied in this context (cf. section 3.9). 

Some useful observations regarding the lexicographic product: (1) both the
independence number $\alpha(X)$ and the clique number $\omega(X)$ are multiplicative under
lexicographic products (this fact has a curious application to constructive Ramsey
graphs, Abbott 1972). The following inequality holds for the chromatic number of
the lexicographic product (Linial and Vazirani 1989):


$(\mathrm{chr}(X_1) - 1) \cdot \mathrm{chr}(X_2) / \ln|V(X_1)| \le \mathrm{chr}(X_1[X_2]) \le \mathrm{chr}(X_1) \cdot \mathrm{chr}(X_2)$.     (3)


Sometimes the study of vertex-transitive graphs reduces to the study of Cayley
graphs via the following observation: If X is vertex-transitive then both $X[K_m]$ and
$X[\overline{K}_m]$ are Cayley graphs for a suitable m (Sabidussi 1964). (Examples include the
study of isoperimetry, cf. Theorems 3.38, 3.41.)

Among the nicely behaved parameters we mention the spectrum. Let $\{\lambda_i\}$ and
$\{\mu_j\}$ be the multisets of eigenvalues of $X_1$ and $X_2$, resp. Then $\{\mu_j + |X_2|\lambda_i\}$,
$\{\lambda_i + \mu_j\}$, $\{\lambda_i \mu_j\}$, and $\{\lambda_i \mu_j + \lambda_i + \mu_j\}$ are the respective multisets of eigenvalues
of the lexicographic, Cartesian, categorical, and strong products. All these products
share a base of orthonormal eigenvectors consisting of the pairwise Kronecker
products of the orthonormal eigenbases of each factor. The Kronecker product of
the adjacency matrices of $X_1$ and $X_2$ is the adjacency matrix of their categorical
product.
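For the three commutative products, the eigenvalue formulas can be checked by hand on a tiny case. The sketch below assumes the factors $K_2$ and $K_3$ with explicitly chosen (orthogonal, unnormalized) integer eigenvectors, and covers only the Cartesian, categorical, and strong products; all helper names are illustrative. It uses the fact that the Cartesian product has adjacency matrix $A_1 \otimes I + I \otimes A_2$, where $\otimes$ here denotes the Kronecker product of matrices.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def kron_vec(x, y):
    """Kronecker product of two vectors, indexed consistently with kron()."""
    return [a * b for a in x for b in y]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

K2 = [[0, 1], [1, 0]]
K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# (eigenvalue, eigenvector) pairs chosen by hand for this illustration.
eig_K2 = [(1, [1, 1]), (-1, [1, -1])]
eig_K3 = [(2, [1, 1, 1]), (-1, [1, -1, 0]), (-1, [1, 1, -2])]

cartesian = mat_add(kron(K2, identity(3)), kron(identity(2), K3))
categorical = kron(K2, K3)
strong = mat_add(cartesian, categorical)

def spectra_match():
    """Check that each Kronecker product x (x) y is an eigenvector with the stated eigenvalue."""
    for lam, x in eig_K2:
        for mu, y in eig_K3:
            v = kron_vec(x, y)
            expected = [(cartesian, lam + mu),
                        (categorical, lam * mu),
                        (strong, lam * mu + lam + mu)]
            for M, value in expected:
                if mat_vec(M, v) != [value * c for c in v]:
                    return False
    return True
```

Here the strong product is $K_6$, and the formula $\lambda_i\mu_j + \lambda_i + \mu_j$ indeed yields the eigenvalue 5 once and -1 five times.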
2.1. Prime factorization, automorphism group


Now we turn to the problem of unique prime factorization (UPF). We recommend 
the insightful survey by Imrich (1993) for more details. 

For commutative products of finite graphs, UPF is equivalent to the common
refinement property. We say that a graph G has the common refinement property
with respect to a product, if for any two representations $\prod_{i \le k} A_i = \prod_{j \le l} B_j$ of G
there exist graphs $C_{i,j}$ which satisfy $A_p = \prod_{j \le l} C_{p,j}$ and $B_q = \prod_{i \le k} C_{i,q}$.

Let $V = V_1 \times \cdots \times V_k$ be a Cartesian product decomposition of the vertex set
V. For $v \in V$, let $V_i^v$ denote the set of vertices differing from v in the ith coordinate
only. Suppose this decomposition of V corresponds to a decomposition of the graph
X = (V, E) with respect to some commutative product; and $V = W_1 \times \cdots \times W_l$
corresponds to another decomposition. We say that the strict common refinement
property (SCR) holds if the intersections $V_i^v \cap W_j^v$ with at least two vertices are
exactly the $C_{i,j}^v$ with respect to the factors of a common refinement. We say that X
has the SCR property w.r. to a certain product if any pair of product decompositions
of X has this property. In this case it follows that the multiset of prime factors is
strongly reconstructible. In particular, X has UPF $X = \prod X_i$ and Aut(X) is obtained
from the $\mathrm{Aut}(X_i)$ via eq. (1) at the beginning of section 1.

For disconnected graphs, UPF does not hold for any of our commutative
products, as seen from the identity $(1 + x + x^2)(1 + x^3) = (1 + x^2 + x^4)(1 + x)$. (Plug in a
connected prime graph for x and interpret + as disjoint union.)
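The polynomial identity behind this counterexample is easy to confirm mechanically. A minimal sketch, with polynomials stored as coefficient lists (lowest degree first) and a hypothetical helper `poly_mul`:

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

lhs = poly_mul([1, 1, 1], [1, 0, 0, 1])     # (1 + x + x^2)(1 + x^3)
rhs = poly_mul([1, 0, 1, 0, 1], [1, 1])     # (1 + x^2 + x^4)(1 + x)
```

Both sides expand to $1 + x + x^2 + x^3 + x^4 + x^5$, so the same disconnected graph arises from two different factorizations.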




2.2. The Cartesian product 


The product of connected graphs is connected. 

Every connected graph has UPF in a strong sense (Sabidussi 1960) which we
now state. Every Cartesian product decomposition $X = Y_1 \times \cdots \times Y_k$ of the graph
X induces an equivalence relation $\sigma(Y_1, \dots, Y_k)$ on E(X); equivalent edges
correspond to edges of the same $Y_i$. It turns out that if X is connected then the
intersection of two such product relations is a product relation again. The UPF
corresponds to the intersection $\sigma_0$ of all product relations. The strict common
refinement property for connected graphs follows immediately, implying UPF and
eq. (1) for the prime factors.

Several algorithms are known to construct the UPF. The simplest one is due to 
Feder (1992) and runs in O(mn) (m = |E|, n = |V|). The most efficient algorithm, 
found by Aurenhammer et al. (1992), runs in O(m log n). 

Unique prime factorization holds for infinite graphs as well, and extends to the 
weak Cartesian product of infinitely many connected graphs (Imrich). For this re- 
sult and for the connections between prime factorization and isometric embeddings 
into Cartesian products we refer to Imrich (1989). 


2.3. The categorical product; cancellation laws 


First of all we have to admit loops so we at least have an identity graph for
this product (single vertex with a loop). The categorical product of two connected
graphs is bipartite iff at least one of them is bipartite; and it is disconnected iff both
factors are bipartite. Disconnected products cause non-unique prime factorizations;
but the connected non-bipartite graphs have UPF in the class of graphs with loops
(McKenzie 1971). However, the strict refinement property does not hold, not even
its consequence, eq. (1).

A graph is thin if no pair of vertices has precisely the same set of neighbors. 
All factors of connected, non-bipartite thin graphs have the same properties. The 
strict common refinement property holds for connected, non-bipartite thin graphs, 
with its usual consequences: UPF and eq. (1) for prime decomposition (McKenzie 
1971). 

The inference $A \cdot C \cong B \cdot C \Rightarrow A \cong B$ is called cancellation. The cancellation
law is an immediate consequence of UPF; however, it may hold even if UPF fails.
Lovász (1971) proved that for cancellation, it suffices to require that the finite
graphs A and B both have a homomorphism to C. Moreover $A^n \cong B^n$ always
implies $A \cong B$. In fact, Lovász (1972a) has shown, using an elegant
inclusion-exclusion argument, that these statements hold in any finite category.


2.4. Strong product 


For a simple graph X, let $X_0$ be the graph obtained by attaching a loop at each
vertex. Let $Y^{\circ}$ be obtained by removing all loops from Y. Now for two simple
graphs X, Y, we have $X \otimes Y = (X_0 \cdot Y_0)^{\circ}$. Thus the strong product can be viewed
as a tame special case of the categorical product (imagine a loop at every vertex). It
follows that for connected simple graphs, UPF holds. Moreover, the strict common
refinement property holds for connected graphs with thin complements.


The UPF of connected graphs can be found in polynomial time (Feigenbaum 
and Schaffer 1992). 


2.5. Lexicographic product 


This product is right-distributive with respect to disjoint unions (all other
products discussed are distributive). Complementation distributes over it:
$\overline{X[Y]} \cong \overline{X}[\overline{Y}]$. The
only pairs of graphs which commute with respect to the lexicographic product are
$(K_n, K_m)$, $(\overline{K}_n, \overline{K}_m)$, and $(X^n, X^m)$ for any X. Moreover, the following cancellation
law holds for finite graphs: if $A[B] \cong X[Y]$ and $|V(B)| = |V(Y)|$ then $A \cong X$ and
$B \cong Y$.

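Let X + Y denote the disjoint union of the graphs X, Y and set $X \oplus Y = \overline{\overline{X} + \overline{Y}}$
(Zykov sum).

Observe that $\overline{K}_q[X[\overline{K}_q] + \overline{K}_m] \cong (\overline{K}_q[X] + \overline{K}_m)[\overline{K}_q]$, and, by complementation,
$K_q[X[K_q] \oplus K_m] \cong (K_q[X] \oplus K_m)[K_q]$. We call these operations elementary
transpositions. They preserve primality.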


Theorem 2.1 (Chang 1961, Imrich 1971). Any two prime factorizations with
respect to the lexicographic product can be transformed into each other by elementary
transpositions.


For further references, cf. Jonsson (1981). 
Clearly, $\mathrm{Aut}(X[Y]) \ge \mathrm{Aut}(Y) \wr \mathrm{Aut}(X)$ (wreath product in its imprimitive action,
cf. chapter 12): we may apply an automorphism of each copy of Y separately, and
then apply a single automorphism of X. We state a sufficient condition which
guarantees equality here.


Theorem 2.2 (Sabidussi). Let X, Y be finite graphs. Assume X is thin if Y is
disconnected and $\overline{X}$ is thin if $\overline{Y}$ is disconnected. Then $\mathrm{Aut}(X[Y]) = \mathrm{Aut}(Y) \wr \mathrm{Aut}(X)$.


Feigenbaum and Schaffer (1992) observed that recognizing composite graphs 
is polynomial-time equivalent to the graph isomorphism problem (section 6) and 
therefore not known to be solvable in polynomial time. 


3. Cayley graphs and vertex-transitive graphs 


3.1. Definition, symmetry 


In 1878, Cayley introduced a graphic representation of abstract groups. With a
group G and a set $S \subseteq G$ of generators he associated what we now call a Cayley
color diagram: a directed graph with colored edges. The vertex set of the diagram
$\Gamma_c(G, S)$ is G. A color corresponds to each member of S; and the vertex $g \in G$ is
joined to $sg \in G$ by an edge of color s.

If we ignore colors, we obtain the Cayley digraph $\vec{\Gamma}(G, S)$. If in addition we
ignore orientation of the edges, we obtain a simple graph: the Cayley graph $\Gamma(G, S)$.
The degree of its vertices is $|(S \cup S^{-1}) \setminus \{1\}|$.

The Cayley graph $\Gamma(G, S)$ is connected because S generates G. Cycles in the
Cayley graph correspond to relations among the elements of S. In particular, if S is
a set of free generators of a free group G then $\Gamma(G, S)$ is a tree. The converse also
holds if there are no involutions (elements of order 2) in S. (Involutions correspond
to cycles of length 2 in the Cayley diagram, invisible in the Cayley graph.) More
generally, if $\Gamma(G, S)$ is a tree then G is a free product of infinite cyclic groups and
of cyclic groups of order 2; the members of S generate these free factors.

If no proper subset of S generates G, we call $\Gamma(G, S)$ a minimal Cayley graph.
Infinite groups do not normally have minimal sets of generators. If S can be linearly
ordered such that no element of S is generated by its predecessors, we call $\Gamma(G, S)$
semiminimal. Every group possesses semiminimal Cayley graphs.

For $g \in G$, the right translation $\rho_g : G \to G$ is defined by $x\rho_g = xg$ $(x \in G)$. The
map $\rho : g \mapsto \rho_g \in \mathrm{Sym}(G)$ is the right regular permutation representation of G. Its
image $G\rho \le \mathrm{Sym}(G)$ is a regular permutation group (chapter 12). The following
statements regarding the automorphism groups of Cayley diagrams and graphs
are easy to verify. (Recall that automorphisms of colored directed graphs preserve
colors and orientation by definition.)


Proposition 3.1. (a) $G\rho = \mathrm{Aut}(\Gamma_c(G, S)) \le \mathrm{Aut}(\Gamma(G, S))$.
(b) (Sabidussi 1964) A connected graph X = (G, E) is a Cayley graph of the
group G if and only if $G\rho \le \mathrm{Aut}(X)$.
Cayley graphs are thus vertex-transitive; the converse of this statement is false.
Indeed, by Proposition 3.1(b), a connected graph X is Cayley precisely if Aut(X)
contains a regular subgroup. The smallest example of a vertex-transitive graph with
no regular subgroup of automorphisms is Petersen’s graph. This is the first member
KG(5, 2) of the infinite family of Kneser’s graphs KG(n, r) $(n \ge 2r + 1 \ge 5)$, most
of which are not even remotely Cayley-like. KG(n, r) has $\binom{n}{r}$ vertices identified
with the r-subsets of an n-set; two vertices are adjacent if the corresponding
r-subsets are disjoint (cf. chapters 4, 24, 34).


Theorem 3.2 (Kantor 1972, Godsil 1980a). (a) Kneser’s graph KG(n, r) is a Cayley
graph precisely if r = 2 and n is a prime power, $n \equiv -1 \pmod 4$, or r = 3 and
$n \in \{8, 32\}$.

(b) If $r \ge 4$ then, with some exceptions, the only transitive proper subgroup of
Aut(KG(n, r)) is the one induced by the alternating group $A_n$. Exceptions occur for
r = 5 when $n \in \{12, 24\}$ and for r = 4 when $n \in \{9, 11, 12, 23, 24, 33\}$.


The proof requires the following result of Livingstone and Wagner. A
permutation group $G \le \mathrm{Sym}(A)$ is t-transitive if it is transitive on the set of ordered
t-tuples of distinct elements of A. G is t-homogeneous if it is transitive on the set
of t-subsets of A.


Theorem 3.3 (Livingstone and Wagner 1965). (a) If G is t-homogeneous then it is
(t - 1)-transitive.
(b) If G is t-homogeneous and $t \ge 5$ then G is t-transitive.


Proof of Theorem 3.2. Assume that KG(n, r) is a Cayley graph of some group
$G \le \mathrm{Aut}(KG(n, r))$. Then, by Proposition 1.9(a), we may view G as a subgroup
of $S_n$. Now, G acts regularly on the r-subsets, and is therefore r-homogeneous. By
Theorem 3.3, it must be r-transitive if $r \ge 5$. The case $r \ge 5$ now follows because
of the nonexistence of nontrivial 4- and 5-transitive permutation groups of degrees
other than those listed.

For $r \ge 3$, the result follows by inspection of the list of doubly transitive
permutation groups (see chapter 12). (We remark that Kantor’s original proof did not
rely on the classification theorem.)

Finally, in the case r = 2, we observe that $G \le \mathrm{Aut}(T)$ for some tournament T,
and G acts as a regular group on the set of edges of T. It follows by Theorem 1.4
that T must be a Paley tournament, hence n is a prime power and $n \equiv -1 \pmod 4$.
To see that in this case KG(n, r) is indeed a Cayley graph, let G be the group of
affine transformations $x \mapsto ax + b$, $a, b, x \in GF(n)$, a a nonzero square in GF(n). □
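The last step of the proof can be checked numerically when n is a prime, so that GF(n) is just the integers mod n. The sketch below is restricted to that case (prime powers would need genuine finite-field arithmetic), and the function names are illustrative. It tests whether the group of maps $x \mapsto ax + b$ with a a nonzero square acts regularly on the 2-subsets, which is what makes KG(n, 2) a Cayley graph of this group.

```python
from itertools import combinations

def affine_square_group(n):
    """All pairs (a, b) encoding x -> a*x + b mod n, with a a nonzero square mod the prime n."""
    squares = {(x * x) % n for x in range(1, n)}
    return [(a, b) for a in sorted(squares) for b in range(n)]

def acts_regularly_on_pairs(n):
    """True if the group is transitive on 2-subsets of Z_n and has exactly as many
    elements as there are 2-subsets (hence acts regularly)."""
    group = affine_square_group(n)
    pairs = {frozenset(c) for c in combinations(range(n), 2)}
    base = frozenset((0, 1))
    orbit = {frozenset((a * x + b) % n for x in base) for a, b in group}
    return len(group) == len(pairs) and orbit == pairs
```

For n = 7 and n = 11 (both congruent to -1 mod 4) the action is regular; for n = 5 the group has the right order but -1 is a square, so each pair has a nontrivial stabilizer and transitivity fails.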


As this example shows, it is often not easy to decide whether or not a given 
vertex-transitive graph is a Cayley graph. If the number of vertices is a prime 
power, the following partial information is useful. 


Theorem 3.4. (a) If G is a transitive group of degree $p^k$, p prime, then the Sylow
p-subgroups of G are transitive as well (Wielandt 1964, p. 6).

(b) (Marušič 1985) Every vertex-transitive (di)graph of order $p^k$, $k \le 3$, is a Cayley
(di)graph. Counterexamples exist for $k \ge 4$.
Let $\mathcal{V}$ denote the set of those positive integers n for which there exists a
connected vertex-transitive graph of order n which is not a Cayley graph. Considerable
effort has gone into determining the set $\mathcal{V}$ (see the survey Praeger 1990). It is clear
that all multiples of a member of $\mathcal{V}$ also belong to $\mathcal{V}$ (the complement of the
disjoint union of copies of a non-Cayley vertex-transitive graph is again non-Cayley).
So we need to know the minimal members of $\mathcal{V}$ only (w.r. to divisibility). It is
not known whether or not such minimal members can have an arbitrarily large
number of distinct prime divisors. It is conjectured that almost all vertex-transitive
graphs of order n are Cayley graphs.

Cayley graphs are not edge transitive in general. (The triangular prism is an 
example.) In fact, their automorphism group often coincides with their group of 
definition (see the GRR problem in section 4.3). Here is a sufficient condition to 
guarantee added symmetry. 


Proposition 3.5 (Frucht 1952). If a group automorphism $\alpha \in \mathrm{Aut}(G)$ stabilizes the
set $S \subseteq G$ then $\alpha \in \mathrm{Aut}(\Gamma(G, S))$.


Corollary 3.6. (a) If S is an orbit of some subgroup H of Aut(G) then $\Gamma(G, S)$ is
edge-transitive.

(b) If, in addition, $S = S^{-1}$, then $\Gamma(G, S)$ is flag-transitive.

(c) An edge-transitive Cayley graph of an abelian group is flag-transitive.


Note that the added condition in (b) is automatically satisfied if S consists of 
involutions (elements of order 2). Frucht (1952) employed this observation to 
construct a flag-regular graph of degree 3. Another application is the construction 
of 2-arc-transitive covering graphs (Theorem 1.5). 


3.2. Symmetry and connectivity 


The implications of vertex-transitivity for connectivity properties of graphs were
discovered by Mader (1971a,b) and Watkins (1970). Their methods and results
were generalized to directed graphs by Hamidoune (cf. Hamidoune 1981). We
state the directed graph versions; undirected graphs are viewed as digraphs with
edges oriented both ways. We note that a weakly connected finite vertex-transitive
digraph is automatically strongly connected so we may omit the adjective. The
connectivity $\kappa(X)$ of a strongly connected digraph $X \ne K_n$ is the minimum number of
vertices whose deletion destroys strong connectivity. Edge-connectivity is defined
similarly.


Theorem 3.7. Let X be a finite connected vertex-transitive digraph of out-degree d.
(a) The connectivity of X is $\ge \lceil (d + 1)/2 \rceil$. If X is undirected then
$\kappa(X) \ge \lceil 2(d + 1)/3 \rceil$.
(b) The edge-connectivity of X is d.
(c) If X is edge-transitive or vertex-primitive, then $\kappa(X) = d$.


The bounds in part (a) are tight, as shown by the lexicographic product of a
(directed or undirected) cycle of length $m \ge 4$ and $K_r$.
All these results are simple consequences of the theory of atoms, developed by
Mader, Watkins, and Hamidoune in the same papers (cf. chapter 2, section 7.5).
A positive fragment of a strongly connected digraph X is a subset $F \subset V(X)$ such
that the set $X^+(F)$ of out-neighbors of F has cardinality $\kappa(X)$ and $F \cup X^+(F) \ne V(X)$
(so $X^+(F)$ is a minimum cutset). A positive atom is a positive fragment of
minimum cardinality.

The key result of Mader, Watkins, and Hamidoune is that if A is a positive
atom and F is a positive fragment then either $A \subseteq F$ or $A \cap F = \emptyset$. (For a simple
proof, see Hamidoune 1981, Theorem 2.1.) In particular, the positive atoms are
pairwise disjoint. Consequently, if X is vertex-transitive then the atoms form a
system of imprimitivity. From this, the vertex-connectivity results readily follow.
For the edge-connectivity result, edge-atoms are introduced and their disjointness
proved. (Cf. also Lovász 1979a, chapter 12 for these and related results.)


Corollary 3.8 (Cauchy-Davenport). Let $\emptyset \ne A, B \subseteq \mathbb{Z}_p$ (p a prime). Then
$|A + B| \ge \min\{p, |A| + |B| - 1\}$.


Proof. W.l.o.g., $0 \in B$. Apply part (c) of Theorem 3.7 to the vertex-primitive
Cayley digraph $X = \vec{\Gamma}(\mathbb{Z}_p, B \setminus \{0\})$. Conclude that if $A + B \ne \mathbb{Z}_p$ then
$|X^+(A)| \ge \kappa(X) = |B| - 1$. Observe, on the other hand, that $X^+(A) = (A + B) \setminus A$. □
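For small primes the inequality can be confirmed exhaustively. The following brute-force sketch (with hypothetical helper names) enumerates all pairs of nonempty subsets of the integers mod p; it also accepts a composite modulus, which illustrates why primality is needed.

```python
from itertools import combinations

def nonempty_subsets(p):
    """Yield every nonempty subset of {0, ..., p-1} as a frozenset."""
    for size in range(1, p + 1):
        for c in combinations(range(p), size):
            yield frozenset(c)

def cauchy_davenport_holds(p):
    """Check |A + B| >= min(p, |A| + |B| - 1) for all nonempty A, B in Z_p."""
    subsets = list(nonempty_subsets(p))
    for A in subsets:
        for B in subsets:
            sumset = {(a + b) % p for a in A for b in B}
            if len(sumset) < min(p, len(A) + len(B) - 1):
                return False
    return True
```

The check passes for p = 5 and p = 7. For the composite modulus 4 it fails, for instance with A = B = {0, 2}, a subgroup whose sumset has only two elements.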


For this result and other connections with additive number theory, see Hami- 
doune (1990). 

Minimal Cayley graphs do even better than guaranteed by part (a) of
Theorem 3.7: If X is a minimal Cayley graph of degree d then $\kappa(X) = d$ (Godsil 1981a).

Infinite connected vertex-transitive graphs of arbitrarily large degree may have
connectivity as small as 1, as the example of the regular tree of any degree
demonstrates. Yet, analogous results exist. Let $\kappa_f(X)$ denote the smallest size of a subset
C of the vertex set of a locally finite infinite graph X such that at least one of
the connected components of $X \setminus C$ is finite. If X is connected, vertex-transitive,
and it has finite degree d, then $\kappa_f(X) \ge \lceil 3(d + 1)/4 \rceil$ (Babai and Watkins 1980).
Analogously to the finite case, the proof rests on the disjointness of atoms (finite
sets of vertices with $\kappa_f$ neighbors). We note that if the graph X has just one end
(cf. section 3.7) then $\kappa(X) = \kappa_f(X)$.


3.3. Matchings, independent sets, long cycles 


All graphs in this section are finite. The next question concerns matchings. 


Theorem 3.9 (Little, Grant and Holton 1975). Let X be a connected vertex-transitive
graph on n vertices.

(a) If n is even then X has a perfect matching.

(b) If n is odd then X is matching critical.

(c) (Lovász, Plummer) If n is even then X is either bicritical (deletion of any pair
of vertices leaves a perfect matching) or elementary bipartite (deletion of any pair
of vertices of opposite color leaves a perfect matching).
Proof. Let D be the set of those vertices of X which are left uncovered by at least
one maximum matching. The Gallai-Edmonds structure theorem (see chapter 3,
section 4.3) asserts that if D is not empty then it consists of matching critical
components. But if X does not have a perfect matching then, by vertex-transitivity,
D is the entire vertex set. This proves (a) and (b). For (c), see Lovász and Plummer
(1986, Theorem 5.5.24).


The following two observations on regular uniform hypergraphs come in handy 
in the analysis of various kinds of subsets of vertex-transitive graphs. (A hyper- 
graph is regular of degree d if every vertex is contained in exactly d edges.) 


Lemma 3.10 (Regular hypergraph counting lemma). Let $\mathcal{E}$ and $\mathcal{F}$ be r-uniform
and s-uniform regular hypergraphs, resp., on the same set of n vertices.

(a) Assume $|E_i \cap F_j| \ge d$ for every $E_i \in \mathcal{E}$, $F_j \in \mathcal{F}$. Then $rs \ge nd$.

(b) Assume $|E_i \cap F_j| \le d$ for every $E_i \in \mathcal{E}$, $F_j \in \mathcal{F}$. Then $rs \le nd$.
If $\mathcal{E} = \mathcal{F}$ and $d \ne n$ then under condition (a) we have $r^2 > nd$. On the other hand,
if $\mathcal{E} = \mathcal{F}$, $d \ne n$, and $|E_i \cap E_j| \le d$ for every $E_i, E_j \in \mathcal{E}$, $i \ne j$, then $r^2 < 2nd$.


Proof. (a) Fix $E_i \in \mathcal{E}$. Count the number of pairs (x, j) such that $x \in E_i \cap F_j$. The
result is $r \deg_{\mathcal{F}} \ge d|\mathcal{F}| = dn \deg_{\mathcal{F}} / s$. (b) This part, as well as the $\mathcal{F} = \mathcal{E}$ variants,
follows analogously.
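The double count in part (a) can be spelled out as follows. This is a sketch of the intended computation, writing $\deg_{\mathcal{F}}$ for the common vertex degree of the regular hypergraph $\mathcal{F}$; the displayed layout is an editorial assumption, not the original typesetting.

```latex
\begin{align*}
  r \deg_{\mathcal{F}}
    &= \sum_{x \in E_i} \deg_{\mathcal{F}}(x)
     = \sum_{F_j \in \mathcal{F}} |E_i \cap F_j|
     \;\ge\; d\,|\mathcal{F}|, \\
  s\,|\mathcal{F}|
    &= \sum_{F_j \in \mathcal{F}} |F_j|
     = n \deg_{\mathcal{F}}
  \quad\Longrightarrow\quad
  |\mathcal{F}| = \frac{n \deg_{\mathcal{F}}}{s}, \\
  \text{hence}\quad
  r \deg_{\mathcal{F}} &\ge \frac{d\, n \deg_{\mathcal{F}}}{s}
  \quad\Longrightarrow\quad rs \ge nd .
\end{align*}
```

Reversing the inequality in the first line gives part (b) in the same way.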


As a corollary, we have a tradeoff between a(X), the maximum size of inde- 
pendent sets, and w(X), the maximum size of cliques of the graph X, for vertex- 
transitive graphs. For a generalization in the context of the Shannon capacity of 
graphs, see Lovasz (1979b) (cf. chapter 31, section 6). 


Corollary 3.11 (L. Lovász and R. M. Wilson). If X is a vertex-transitive graph
on n vertices then $\alpha(X)\omega(X) \le n$.


Proof. Indeed, let $\mathcal{E}$ and $\mathcal{F}$ be the hypergraphs consisting of the independent sets
and cliques, resp., of maximum size. Each of these two hypergraphs is uniform
by definition; they are regular because X is vertex-transitive. Since a clique and
an independent set share at most one vertex, the result follows from part (b) of
Lemma 3.10.
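As an illustration of the corollary, the two parameters can be computed by exhaustive search on small vertex-transitive graphs. The sketch below uses hypothetical helpers `largest` and `alpha_omega` and builds the Petersen graph as the Kneser graph KG(5, 2); the search is exponential and only meant for a handful of vertices.

```python
from itertools import combinations

def largest(vertices, ok):
    """Size of the largest vertex subset in which every pair satisfies ok(u, v)."""
    for size in range(len(vertices), 0, -1):
        for S in combinations(vertices, size):
            if all(ok(u, v) for u, v in combinations(S, 2)):
                return size
    return 0

def alpha_omega(vertices, edges):
    """Return (independence number, clique number) of a simple graph."""
    E = {frozenset(e) for e in edges}
    adjacent = lambda u, v: frozenset((u, v)) in E
    alpha = largest(vertices, lambda u, v: not adjacent(u, v))
    omega = largest(vertices, adjacent)
    return alpha, omega

# Petersen graph: 2-subsets of a 5-set, adjacent when disjoint.
petersen_vertices = [frozenset(c) for c in combinations(range(5), 2)]
petersen_edges = [(u, v) for u, v in combinations(petersen_vertices, 2) if not (u & v)]

# The 5-cycle.
c5_vertices = list(range(5))
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
```

For the Petersen graph this yields $\alpha = 4$ and $\omega = 2$, so $\alpha\omega = 8 \le 10$; for $C_5$ it yields $\alpha = \omega = 2$ and $4 \le 5$.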


We note that Delsarte (1973) proves the same conclusion under the condition 
that X is the union of classes in an association scheme (cf. chapter 15), in partic- 
ular, if X is strongly regular. This condition does not imply the presence of any 
automorphisms, nor is it a consequence of vertex-transitivity. 

A related observation concerns the connection between the chromatic number
$\chi(G)$ and the independence number $\alpha(G)$. Clearly, $\chi(G) \ge n/\alpha(G)$ for all graphs.
For vertex-transitive graphs, this inequality is nearly tight, as pointed out to us by
M. Szegedy.

Proposition 3.12. If G is a vertex-transitive graph then


$n/\alpha(G) \le \chi(G) \le n(1 + \ln n)/\alpha(G)$.
Proof (of the rightmost inequality). Let A be an independent set of size $\alpha = \alpha(G)$;
then the probability that $m = \lceil (n \ln n)/\alpha \rceil$ random translates of A (by automorphisms)
do not cover V(G) is less than $n (1 - \alpha/n)^m < n e^{-\alpha m/n} \le 1$. Hence some m translates,
each an independent set, cover V(G), and so $\chi(G) \le m$. □


Another corollary to Lemma 3.10, of interest to the theory of computing,
concerns Boolean functions $f : 2^X \to \{0, 1\}$. Here, X is a set of n Boolean variables
$x_1, \dots, x_n$, and $2^X$ represents the set of all possible truth value assignments to X.
A partial truth value assignment $y : Y \to \{0, 1\}$ $(Y \subseteq X)$ is said to force f to 0 if
f(x) = 0 whenever x is an extension of y. We call such a y a 0-certificate for f;
its size is |Y|, the cardinality of the domain of y. We define 1-certificates
analogously. For every $x \in 2^X$ there exists a smallest restriction y of x which is an
f(x)-certificate; let m(f; x) denote its size, and let $n_i(f) = \max_x m(f; x)$ where the
maximum is taken over all x with f(x) = i (i = 0, 1). Let $N(f) = \max\{n_0(f), n_1(f)\}$.
The quantity N(f) is called the nondeterministic decision-tree complexity of f. (This
is a lower bound on the deterministic decision-tree complexity discussed in chapter
34, section 4.4. Incidentally, the “evasiveness” problems considered there relate
symmetry to complexity in a remarkable way.)

Automorphisms of f are those permutations of X which leave f invariant. 


Corollary 3.13. If f is a non-constant Boolean function on n variables with transitive
automorphism group then $n_0(f) n_1(f) \ge n$. Consequently, $N(f) \ge \sqrt{n}$.


Proof. The domains of a 0-certificate and a 1-certificate must intersect. One can
thus apply Lemma 3.10 (a) to the hypergraphs formed by an orbit of each kind of
domain. □
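The certificate sizes $n_0(f)$ and $n_1(f)$ can be computed directly from the definition for functions on a few variables. The following brute-force sketch represents inputs as 0/1 tuples; the name `certificate_sizes` and the two sample functions (majority and OR on three variables, both invariant under all permutations of the variables) are illustrative assumptions.

```python
from itertools import combinations, product

def certificate_sizes(f, n):
    """Return (n_0(f), n_1(f)) for f: {0,1}^n -> {0,1} by exhaustive search."""
    inputs = list(product((0, 1), repeat=n))
    worst = {0: 0, 1: 0}
    for x in inputs:
        value = f(x)
        # Find the smallest set Y of positions such that fixing x on Y forces f.
        for size in range(n + 1):
            forced = any(
                all(f(z) == value for z in inputs if all(z[i] == x[i] for i in Y))
                for Y in combinations(range(n), size)
            )
            if forced:
                break
        worst[value] = max(worst[value], size)
    return worst[0], worst[1]

def majority3(x):
    return int(sum(x) >= 2)

def or3(x):
    return int(any(x))
```

Majority on three variables gives $n_0 = n_1 = 2$ and OR gives $n_0 = 3$, $n_1 = 1$; in both cases the product is at least n = 3, in line with the corollary.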


Our next subject is Jong paths and cycles. Only four connected vertex-transitive 
graphs without Hamilton cycles are known (assuming the number of vertices is n > 
3). Each of them is trivalent; and the first two are 3-arc-transitive (cf. section 5.1): 
the Petersen graph (10 vertices), and the Coxeter graph (28 vertices) (see figures 8, 
9 in chapter 1, section 4). (The automorphism group of the latter is PGL(2, 7), see 
Wong (1967); cf. Biggs 1973). The other two are obtained from these by replacing 
each vertex by a triangle (30 and 84 vertices, resp.). Each of these four graphs 
possesses a Hamilton path and none of them is a Cayley graph. A conjecture of 
Lovasz (in 1969) not shared by this author holds that all connected vertex-transitive 
graphs have Hamilton paths. The problem as to whether all Cayley graphs ( > 3) 
have Hamilton cycles appears first to have been stated by Rapaport-Strasser (1959). 
In my view these beliefs only reflect that Hamiltonicity obstacles are not well 
understood; and indeed, vertex-transitive graphs may provide a testing ground 
for the power of such obstacles. We conjecture that for some c > 0, there exist 
infinitely many connected vertex-transitive graphs (even Cayley graphs) without 
cycles of length ≥ (1 − c)n.

We mention a useful Hamiltonicity obstacle. A graph is tough if, after deletion
of any k of its vertices, the remaining graph has ≤ k connected components.
Obviously, any Hamiltonian graph is tough; being non-tough is a Hamiltonicity
obstacle. This obstacle breaks down for vertex-transitive graphs: every connected
vertex-transitive graph is tough. Indeed, by Theorem 3.7(b), a d-regular connected
vertex-transitive graph has edge-connectivity d, a circumstance that immediately
implies toughness.
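The Petersen graph illustrates why toughness fails as an obstacle here. The sketch below is an illustration, not the author's method: it builds the Petersen graph from the 2-subsets of a 5-element set (adjacent when disjoint) and checks by exhaustive search that it is tough, has a Hamilton path, and has no Hamilton cycle. All function names are illustrative.

```python
from itertools import combinations

# Petersen graph: vertices are the 2-subsets of {0,...,4}; adjacent iff disjoint.
VERTICES = [frozenset(p) for p in combinations(range(5), 2)]
ADJ = {v: [w for w in VERTICES if not (v & w)] for v in VERTICES}

def components(remaining):
    """Number of connected components of the subgraph induced by `remaining`."""
    remaining = set(remaining)
    count = 0
    while remaining:
        count += 1
        stack = [remaining.pop()]
        while stack:
            v = stack.pop()
            for w in ADJ[v]:
                if w in remaining:
                    remaining.remove(w)
                    stack.append(w)
    return count

def is_tough():
    """Deleting any k >= 1 vertices must leave at most k components."""
    for k in range(1, len(VERTICES)):
        for deleted in combinations(VERTICES, k):
            rest = [v for v in VERTICES if v not in deleted]
            if components(rest) > k:
                return False
    return True

def has_hamilton(cycle):
    """Backtracking search from a fixed start vertex.

    Fixing the start loses nothing: a cycle passes through every vertex, and by
    vertex-transitivity a Hamilton path can be moved to start anywhere.
    """
    start = VERTICES[0]

    def extend(path, seen):
        if len(path) == len(VERTICES):
            return (not cycle) or start in ADJ[path[-1]]
        return any(extend(path + [w], seen | {w})
                   for w in ADJ[path[-1]] if w not in seen)

    return extend([start], {start})

print(is_tough(), has_hamilton(cycle=False), has_hamilton(cycle=True))
# True True False
```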

At any rate, the Hamiltonicity conjectures have been confirmed in a number of 
cases. One notable Hamilton cycle was even patented in 1953: the one constructed
by F. Gray for the minimal Cayley graphs of the elementary abelian groups of order
2^d (the d-cube). A large number of papers have since referred to Hamilton cycles
in Cayley graphs as “generalized Gray codes”. (See the references in Conway et
al. 1989.)
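For concreteness, here is a short sketch of the reflected binary Gray code, viewed as a Hamilton cycle in the d-cube (vertices are the integers 0, ..., 2^d − 1, adjacent when they differ in one bit). The function names are illustrative.

```python
def gray_code(d):
    """Reflected binary Gray code: the i-th word is i XOR (i >> 1)."""
    return [i ^ (i >> 1) for i in range(1 << d)]

def is_hamilton_cycle_in_cube(words, d):
    """Check that `words` lists every vertex of the d-cube exactly once and that
    cyclically consecutive words differ in exactly one bit."""
    if sorted(words) != list(range(1 << d)):
        return False
    return all(bin(words[i] ^ words[(i + 1) % len(words)]).count("1") == 1
               for i in range(len(words)))

print(gray_code(3))                                 # [0, 1, 3, 2, 6, 7, 5, 4]
print(is_hamilton_cycle_in_cube(gray_code(5), 5))   # True
```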


In the subsequent statements, every graph has ≥ 3 vertices.

It is easy to see that every Cayley graph of a finite abelian group is Hamiltonian
(J. Pelikan, see Lovász 1979a, Ex. 12.17). Marušič, Witte, Keating, Dürnberger,
and others succeeded in significantly relaxing the condition of commutativity. We
refer to the survey by Witte and Gallian (1984) for details. One of the weakest
known sufficient conditions for all Cayley graphs of G to be Hamiltonian is that the
commutator subgroup of G is cyclic of prime power order (Keating and Witte 1985;
cf. Dürnberger 1985). Witte (1986) proved that all Cayley digraphs of a p-group
have a Hamilton cycle.

So far, no non-solvable group has been shown to have this property. Even the 
following, less ambitious problem is open: does every finite group have a minimal 
Cayley graph with a Hamilton cycle? 

For several reasons (including, as a curiosity, campanology, the study of bell-ringing
sequences, see White 1985), special classes of Cayley graphs of the symmetric
groups are of interest.


Theorem 3.14 (Kompel’macher and Liskovets 1975). Let T be any connected system
of transpositions of n elements. Then the Cayley graph Γ(S_n, T) is Hamiltonian.


The case of adjacent transpositions (see Johnson 1963) was recently generalized
to all finite reflection groups (groups of affine transformations of R^n, generated by
a set of reflections) (Conway, Sloane and Wilks 1989).
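The adjacent-transposition case can be made explicit. The sketch below uses a recursive "sweeping" construction in the spirit of the Steinhaus-Johnson-Trotter ordering (the exact formulation and the names `sjt` and `differ_by_adjacent_swap` are illustrative, not quoted from Johnson 1963): the largest element is threaded back and forth through the permutations of the smaller elements, so that cyclically consecutive permutations differ by a swap of two adjacent positions.

```python
def sjt(n):
    """All permutations of 0..n-1, each obtained from the previous one by
    swapping two adjacent positions (the list is also cyclic for n >= 2)."""
    if n == 1:
        return [(0,)]
    result = []
    for idx, p in enumerate(sjt(n - 1)):
        # Sweep the new largest element right-to-left, then left-to-right.
        positions = range(n - 1, -1, -1) if idx % 2 == 0 else range(n)
        for pos in positions:
            result.append(p[:pos] + (n - 1,) + p[pos:])
    return result

def differ_by_adjacent_swap(p, q):
    diff = [i for i in range(len(p)) if p[i] != q[i]]
    return (len(diff) == 2 and diff[1] == diff[0] + 1
            and p[diff[0]] == q[diff[1]] and p[diff[1]] == q[diff[0]])

perms = sjt(4)
print(len(perms), len(set(perms)))  # 24 24
print(all(differ_by_adjacent_swap(perms[i], perms[(i + 1) % len(perms)])
          for i in range(len(perms))))  # True
```

For n ≥ 3 the resulting cyclic list is a Hamilton cycle in the Cayley graph of S_n with respect to the adjacent transpositions.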

The situation for Cayley digraphs is more complicated. Rankin (1948) determined
when a Cayley digraph Γ(G, S) of a finite abelian group G is Hamiltonian
provided |S| = 2 and gave examples of Cayley digraphs of the alternating groups 
A_5 and A_7 without Hamilton cycles. J. Milnor gave a class of solvable groups
with two generators such that the difference between the order of the group and 
the longest directed paths in the resulting Cayley digraphs is arbitrarily large (see 
Witte and Gallian 1984). 

Much less is known about vertex-transitive graphs of given order. Trivially, every
connected vertex-transitive graph of prime order p is a Cayley graph and is
therefore Hamiltonian. Marušič’s result (Theorem 3.4(b)) extends this to orders
p^2 and p^3.


Theorem 3.15. (a) Every connected vertex-transitive graph of order n is Hamiltonian
if n has one of the following forms: p, 2p, with the exception of the Petersen
graph (Alspach 1979); 3p, p^2, p^3, 2p^2 (Marušič 1985, 1987).
(b) Every connected vertex-transitive graph of order 4p and 5p has a Hamilton
path (Marušič and Parsons 1983).


The only general lower bound on the length of the longest cycles and paths
of vertex-transitive graphs is the following. (Nothing better is known for Cayley
graphs either.)


Proposition 3.16 (Babai 1979b). If X is a connected vertex-transitive graph on
n > 5 vertices then X has a cycle of length ≥ 2√n.


We note that 3-connected trivalent graphs have cycles of length ≥ n^c for some
constant c > 0 (Jackson 1986) but need not have cycles longer than n^c' for some
constant c' < 1 (Bondy and Simonovits 1980).
The proof of Proposition 3.16 is based on the following observation: If X is a
3-connected regular graph of order n ≥ 4 then every pair of longest cycles intersects
in ≥ 4 vertices. Now, since every connected vertex-transitive graph of degree
≥ 3 is 3-connected (Theorem 3.7(a)), an application of the Regular hypergraph
counting lemma 3.10 to the vertex sets of the longest cycles completes the proof
of Proposition 3.16.


3.4. Subgraphs, chromatic number 


Every graph Y with n vertices is an induced subgraph of some Cayley graph X
of any given group of order ≥ cn^3 (Babai 1978b, Babai and Sós 1985, Godsil and
Imrich 1987). Every Y can be embedded into a Cayley graph of order 2^n such
that all automorphisms of Y extend to automorphisms of X. The following more
general extension theorem holds.


Theorem 3.17 (Hrushovski 1992). Given a finite graph Y and a family F of isomorphisms
between pairs of induced subgraphs of Y, there exists a finite graph
X containing Y as an induced subgraph, such that all elements of F extend to
automorphisms of X.


Here, X may be required to be flag-transitive. This result has applications to the
structure theory of the automorphism group of the Rado graph (countable “random
graph”, cf. section 5.3). It follows that “almost all” automorphisms of that
graph are conjugates, where “almost all” is meant in the sense of “comeager” (the
complement of a set of first Baire category; cf. Oxtoby 1980): there exists a
comeager conjugacy class (cf. Truss 1992).

Not all graphs are subgraphs of minimal Cayley graphs. Let X = Γ(G, S) be a
minimal or semiminimal Cayley graph (cf. section 3.1) of the (finite or infinite)
group G. Such graphs admit a coloring of the edges with the following properties:
(a) every vertex has degree ≤ 2 in each color; (b) at least one of the colors occurring
in a cycle occurs at least twice on that cycle. (In the minimal case, each color
occurring in a cycle occurs at least twice.)

These properties put constraints on the possible subgraphs. In particular, if X is
a minimal Cayley graph then it contains no K_4^- (K_4 minus an edge), and no K_{3,5}.
If X is semiminimal, it contains no K_{5,17} (Babai 1978a). In both cases it follows
that the chromatic number of X is at most countably infinite, according to the
following result of Erdős and Hajnal (see chapter 42, Theorem 6.3): If a graph
has uncountably infinite chromatic number then it contains K_{m,m} for every positive
integer m.

It is an open problem whether or not the chromatic number of finite minimal
Cayley graphs is bounded. We conjecture it is not. A related stronger conjecture is
that for every ε > 0 there exist minimal Cayley graphs X such that α(X) < ε|V(X)|,
where α(X) denotes the size of the largest independent set of X.

A strong consequence of constraints (a) and (b) above was deduced by Spencer. 


Theorem 3.18 (Spencer 1983). For every g ≥ 3 there exists a finite graph Y of girth
g such that Y is not a subgraph of any (semi)minimal Cayley graph. 


The proof uses the probabilistic method and does not provide explicit graphs 
Y. It is not known whether or not such excluded subgraphs of girth 5 and degree 
3 exist, even for minimal Cayley graphs. (The Petersen graph is a subgraph of a 
minimal Cayley graph of a group of order 20.) 

Every finite group has a Cayley graph of chromatic number ≤ 4. (This is a consequence
of the fact that every finite simple group is generated by ≤ 2 elements.)
It is an open question whether or not every infinite group has a Cayley graph of 
finite chromatic number. 


3.5. Neighborhoods, clumps, Gallai-Aschbacher decomposition 


In this section we highlight a graph theoretic result that has played a role in the 
classification theory of finite simple groups. 

We shall (in this section) consider finite graphs as well as locally finite infinite
graphs with uniformly bounded degrees. X will always denote a graph with vertex
set V and complement X̄; the set of neighbors of v ∈ V is denoted by X(v). The
subgraph induced by X(v) is the link at v. We say that X has constant link Y if
all of its links are isomorphic to Y (a finite graph by the convention above). All
vertex-transitive graphs have constant link, as do many others, including all
triangle-free regular graphs and their line graphs. A finite graph Y is a link graph
if there exists a graph X with constant link Y. If such a finite X exists then Y is a
link of finite type.

Many classes of link-graphs as well as non-link graphs have been found (see Hell 
1978, Blass et al. 1980). However, Bulitko (1972) asserts that the problem whether 
or not a given finite graph is a link graph is undecidable. It is shown in Bulitko 
(1972) and Brown and Connelly (1975) that there exist graphs which are links of 
infinite vertex-transitive graphs but do not occur as links in finite constant-link 
graphs. By counting certain triangles, Blass et al. (1980) show that if L is the link 
in a finite vertex-transitive graph and i is an odd number then the number of vertices 
of degree i in L is even. This is not true for all link graphs of finite type (Brown 
and Connelly 1975 provides an infinity of examples), nor does it hold for links of 
infinite vertex-transitive graphs. Hell (1978) observes that if a (finite or infinite)
vertex-transitive graph X has an asymmetric link then X is a Cayley graph (in fact
a GRR, cf. section 4.3). He also shows that the link of a Cayley graph has an even 
number of vertices of degree one. 


The following fairly general result is implicit in Aschbacher (1976). 


Theorem 3.19. Assume both X and X̄ are connected. Then at least one of the links
of X, say Y, has the property that Ȳ has a unique largest connected component.


A stronger result holds for vertex-primitive graphs. 


Theorem 3.20 (Aschbacher, Fischer). Let X be a vertex-primitive graph other than 
the complete graph. Let Y be the graph induced by the neighborhood of a vertex 
in X. Then the complement of Y is connected. 


The proof of these theorems rests on a purely graph theoretical result, part of 
which was discovered by Gallai (1971) in the context of the characterization of 
transitively orientable graphs. 

A subset C ⊆ V is called a clump if for each w ∈ V \ C, if w has a neighbor
in C then X(w) ⊇ C. The trivial clumps are V, ∅, and the singletons. A proper
clump is a clump other than V. A maximal clump is a proper clump not properly
contained in any other proper clump.

We begin with two easy observations. (a) If C, D are clumps and C ∩ D ≠ ∅
then C ∪ D is a clump. (b) If both X and its complement X̄ are connected then
V is not the union of two proper clumps. It is immediate from these that maximal
clumps are pairwise disjoint, which proves part (i) of the following result.


Theorem 3.21 (Gallai–Aschbacher decomposition). Assume both X and X̄ are
connected. Let C_1, ..., C_m be the maximal clumps of X. Then:

(i) (Gallai 1971, Aschbacher 1976) C_1, ..., C_m form a partition of V.

(ii) (Aschbacher 1976) Let N_i be the set of common neighbors of C_i. (By definition,
C_i ∩ N_i = ∅.) Then there exists i such that the subgraph induced by N_i in the
complement of X is connected.
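Part (i) can be checked by brute force on a small graph. The sketch below enumerates clumps directly from the definition; the example graph (a path 0-1-2-3 together with a vertex 4 having the same neighbors as vertex 1) and all function names are hypothetical illustrations, and the exhaustive enumeration is only feasible for very small graphs.

```python
from itertools import combinations

def adjacency(n, edges):
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def complement(adj):
    return {v: set(adj) - adj[v] - {v} for v in adj}

def is_connected(adj):
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

def is_clump(adj, C):
    """Every outside vertex with a neighbor in C must be adjacent to all of C."""
    C = set(C)
    return all(not (adj[w] & C) or C <= adj[w] for w in adj if w not in C)

def maximal_clumps(adj):
    """Proper clumps not properly contained in another proper clump."""
    V = sorted(adj)
    proper = [frozenset(C) for k in range(1, len(V))
              for C in combinations(V, k) if is_clump(adj, C)]
    return [C for C in proper if not any(C < D for D in proper)]

X = adjacency(5, [(0, 1), (1, 2), (2, 3), (0, 4), (4, 2)])
print(is_connected(X), is_connected(complement(X)))   # True True
print(sorted(sorted(C) for C in maximal_clumps(X)))   # [[0], [1, 4], [2], [3]]
```

Here the twin vertices 1 and 4 form the only nontrivial clump, and the maximal clumps partition the vertex set as the theorem asserts.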


To see how Theorem 3.20 follows, we observe that the maximal clumps form a 
system of imprimitivity for Aut(X); therefore if X is vertex-primitive then each C; 
is a singleton.

The proof of assertion (ii) is nontrivial. For v ∈ V, consider the components
of the subgraph of X̄ induced by X(v). Let M be a maximal such component
(considering all v ∈ V). By definition, M induces a connected subgraph of X̄. Let
C be the set of common neighbors of M. One can prove that C is a maximal
clump, and M is the set of common neighbors of C. This completes the proof
of Theorem 3.21; and together with (i) above we also see that the choice of M
among the components of X(v) in X̄ must be unique (since any other component
is a subset of C, the unique maximal clump containing v).


Gallai (1971) gives the following equivalent definition of the above decomposition.
Let us say that two edges are equivalent if they together form an induced
path of length 2. Take the transitive closure of this relation to obtain the Gallai
equivalence. If both X and X̄ are connected, then there will be a unique Gallai
class of edges which spans the entire X. The components of the complement of this
class can be grouped together in a unique way to produce the maximal clumps; two
such components will belong to the same class if they have the same neighborhood
in X.

The role of Theorem 3.20 in the classification of finite simple groups is explained
by Aschbacher (1976). He shows how Fischer’s (1971) celebrated “3-transpositions
theorem” follows from it; in fact, the result arose from one of Fischer’s lemmas. A
set of 3-transpositions is a set S of elements of order 2 in a group G such that for
any pair g, h ∈ S, the order of gh is ≤ 3. Fischer characterized those almost simple
groups which are generated by a conjugacy class of 3-transpositions. These include
all the symmetric groups, certain classical (symplectic, orthogonal, unitary) groups,
plus three sporadic groups discovered in the process (named M(22), M(23), and
M(24)). (M(24) is not simple; like the symmetric groups, it has a simple subgroup
of index 2. Cf. Aschbacher 1980.)

Fischer’s central result was that the action of G by conjugation on S is a rank
3 permutation group. This is derived from considering the vertex-transitive graph
with vertex set S, joining two elements if they commute. G is shown to act as a
primitive group on this graph; and Theorem 3.20 is invoked. □


Godsil (1980b) considers the link L of X together with the link L* of X̄, the
dual link. He gives the following remarkable characterization: if X is finite,
vertex-transitive, both the link L and the dual link L* are disconnected but at least one
of them has no isolated vertices, then X = L(K_{3,3}). He also characterizes the case
when both L and L* have isolated vertices. In this latter case, Aut(X) always has
an element of the form (12)(34). These results are central to his solution of the
GRR problem (cf. section 4.3).


3.6. Rate of growth 


Note. Throughout this section, X will denote an infinite, connected, locally finite 
graph. (A graph is locally finite if all vertices have finite degree.) 


Certain properties of groups are best expressed in graph theory language. A 
foremost example is the growth rate of finitely generated infinite groups. 

For a graph X, let B(n, x) denote the ball of radius n about the vertex x, i.e. the set of
vertices at distance ≤ n from x. For a vertex-transitive graph, set f(n) = |B(n, x)|.
This function has a property resembling log-concavity.


Proposition 3.22 (Gromov 1981). If X is vertex-transitive, then f(n) f(5n) ≤ (f(4n))^2.


Proof. Let Y be a maximal system of vertices in B(3n, x) pairwise at distance
≥ 2n + 1. Now the disjoint balls B(n, y): y ∈ Y are contained in B(4n, x), hence
|Y| f(n) ≤ f(4n). On the other hand, the balls B(2n, y): y ∈ Y cover B(3n, x), and
therefore the balls B(4n, y): y ∈ Y cover B(5n, x). This implies f(5n) ≤ |Y| f(4n),
hence the result. □
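As a sanity check, the inequality can be verified numerically on a familiar example. The sketch below (an illustration, with the square grid and its four unit generators chosen as an assumption) computes ball sizes in the Cayley graph of Z^2 by breadth-first search and tests f(n) f(5n) ≤ f(4n)^2 for small n.

```python
def ball_sizes(generators, radius):
    """sizes[r] = |B(r, origin)| in the Cayley graph of Z^2 w.r.t. `generators`."""
    seen = {(0, 0)}
    frontier = [(0, 0)]
    sizes = [1]
    for _ in range(radius):
        nxt = []
        for (x, y) in frontier:
            for (dx, dy) in generators:
                p = (x + dx, y + dy)
                if p not in seen:
                    seen.add(p)
                    nxt.append(p)
        frontier = nxt
        sizes.append(len(seen))
    return sizes

GENS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
f = ball_sizes(GENS, 50)

print(f[1], f[4], f[5])   # 5 41 61
print(all(f[n] * f[5 * n] <= f[4 * n] ** 2 for n in range(1, 11)))  # True
```

In this example f(n) = 2n^2 + 2n + 1, so the inequality holds with a wide margin; the proposition is of interest because it holds for every vertex-transitive graph.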

X is said to have growth rate g(n) if g(c_1 n) ≤ f(n) ≤ g(c_2 n) for positive constants
c_1, c_2 and every sufficiently large n. Thus, the growth rate is an equivalence class of
functions rather than a function. There is a natural partial order on the equivalence
classes; when comparing growth rates, we shall always mean comparison of their
equivalence classes.

X is said to have polynomial growth rate if its growth rate is bounded by n^c for
some constant c; its growth rate is exponential if it is bounded from below by c^n
for some constant c > 1.

For a finitely generated infinite group G, the growth rate of G is defined as the
growth rate of the Cayley graph Γ(G, S) for some finite set S of generators of G.
It is easy to see that the growth rate does not depend on the particular choice of
S; a change in the generators will only affect the constants c_1 and c_2.

Finitely generated abelian groups have polynomial growth rates, non-cyclic free 
groups have exponential growth rates. The following are easy to prove. 


Proposition 3.23. (a) If H is a subgroup of G, then the growth rate of G is greater
than or equal to the growth rate of H.

(b) If |G : H| is finite then G and H have the same growth rates.

(c) (Gromov 1981) If H is finitely generated and |G : H| is infinite then f_G(n) ≥
n f_H(n), where f_G and f_H are the growth functions of the respective groups under
appropriately chosen sets of generators.


(d) (Milnor 1968a, Wolf 1968) Finitely generated nilpotent groups have polyno- 
mial growth rates. 


The Bass–Wolf formula gives the exact growth rates of nilpotent groups. Let G
be a finitely generated infinite nilpotent group and let G = G_1 > G_2 > ··· > G_n =
1 be its descending central series. Let d_i be the torsion-free rank of the abelian
group G_i/G_{i+1}.


Theorem 3.24 (Bass 1972, Wolf 1968). The rate of growth of the nilpotent group G
is n^d where d = Σ_i i d_i.
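A short worked instance of the formula may help. The example below is not from the text; it assumes G is the discrete Heisenberg group of upper unitriangular 3 × 3 integer matrices, whose descending central series has two nontrivial factors.

```latex
% Assumption: G is the discrete Heisenberg group (upper unitriangular
% 3x3 integer matrices). Its descending central series is
% G = G_1 > G_2 = Z(G) > G_3 = 1.
\[
  G_1/G_2 \cong \mathbb{Z}^2 \quad (d_1 = 2), \qquad
  G_2/G_3 \cong \mathbb{Z} \quad (d_2 = 1),
\]
\[
  d \;=\; \sum_i i\, d_i \;=\; 1\cdot 2 + 2\cdot 1 \;=\; 4 .
\]
% For comparison, G = \mathbb{Z}^k has d_1 = k and hence d = k.
```

So this group has growth rate n^4, one more than the number 3 of integer parameters needed to describe its elements, because the central factor is counted with weight 2.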


The following very deep result settles a problem raised by Milnor (1968a). 


Theorem 3.25 (Gromov 1981). A group has polynomial growth rate if and only if 
it is virtually nilpotent, i.e. it has a nilpotent subgroup of finite index. 


Two important particular cases of this result were established earlier; they are 
ingredients in Gromov’s proof. 


Theorem 3.26 (Milnor 1968a, Wolf 1968, Tits 1972). (a) A finitely generated solv- 
able group G has exponential growth unless G is virtually nilpotent. 


(b) A finitely generated subgroup G of a connected Lie group has exponential 
growth unless it is virtually nilpotent. 


In fact, Tits proves the following stronger statement.


Theorem 3.27 (Tits 1972). If L is a Lie group with finitely many components and
G is a finitely generated subgroup of L then either: 
(a) G contains a free group of rank 2 and has therefore exponential growth; or 
(b) G is virtually solvable. In this case it has exponential growth rate unless it is 
virtually nilpotent. 


We give a very rough sketch of the proof of Gromov’s Theorem 3.25. Let G be
a finitely generated group of polynomial growth. Fix a finite set S of generators.
Select a sequence r_j → ∞ of integers. Consider the sequence of metric spaces Γ_j
on the set G with distance d_j(x, y) = (1/r_j) dist(x, y) where “dist” is the distance in
the Cayley graph Γ(G, S). The sequence r_j is chosen so as to ensure a fairly regular
behavior of the sequence f(2^i r_j), i = −j, ..., j. This is accomplished with the aid of
Proposition 3.22 and using the assumption of polynomial growth. The sequence Γ_j
is then nice enough to have a subsequence that converges in an appropriate sense
to a metric space Y. Elementary considerations show that Y is locally compact,
connected, and locally connected. Moreover, each ball in Y is path-connected. The
isometry group L of Y is transitive on Y. The choice of the r_j ensures that the
Hausdorff dimension of Y is finite. A celebrated theorem of Montgomery and
Zippin (1955) now implies that under these conditions, L is a Lie group with a
finite number of components. Now a fairly involved argument using the quoted
result of Tits (Theorem 3.27(b)) completes the proof. □


Other ingredients of this last part of the proof are the Milnor—Wolf theorem 
(Theorem 3.26(a)) and the following theorem of Jordan (cf. Raghunathan 1972). 


Theorem 3.28 (Jordan 1895). If L is a Lie group with a finite number of compo- 
nents then there exists a number q such that every finite subgroup of L contains an 
abelian subgroup of index at most q. 


An appendix to Gromov’s paper contains a relatively simple proof of the subcase
of the Milnor–Wolf theorem used in Gromov’s proof.

Milnor (1968a) raised the question whether groups with “intermediate growth
rates” (neither polynomial, nor exponential) exist. The positive answer was given
by Grigorchuk.


Theorem 3.29 (Grigorchuk 1983). There exist 2-generated torsion groups with
growth rates between 2^(n^α) and 2^(n^β) where α = 1/2 − ε for any ε > 0 and β = log_32 31.


Vertex-transitive graphs with polynomial growth rates were characterized by V.I. 
Trofimov. 


Theorem 3.30 (Trofimov 1985). Let X be vertex-transitive. The following are equivalent.

(a) X has polynomial growth.

(b) The vertex set V under the action of Aut(X) admits a system of imprimitivity
σ with finite equivalence classes such that Aut(X/σ) is finitely generated, virtually
nilpotent, and the stabilizer of any vertex of X/σ in Aut(X/σ) is finite.


Here X/σ is the homomorphic image of X under the vertex map V(X) →
V(X)/σ; hence two equivalence classes are adjacent if they have at least one pair
of adjacent representatives.

Related topics are surveyed in Trofimov (1992). 

We should mention that these questions were originally motivated by connec- 
tions between the curvature of a Riemannian manifold and the growth rate of its 
fundamental group (Milnor 1968b). 


3.7. Ends 


Note. Throughout this section, X will denote an infinite, connected, locally finite 
graph. (A graph is locally finite if all its vertices have finite degree.) 


Ends are another important graphic notion for finitely generated infinite groups. 
(For a detailed account, see Cohen 1972.) 

The set of ends of a connected, locally connected, locally compact Hausdorff 
space X is defined as the inverse limit of the directed family of the set of compo- 
nents of X \ C for all compact subsets C (Hopf 1944). The analogous concept for 
connected graphs was developed by Halin (1964). 

Ends of a (connected, infinite, locally finite) graph X can be defined analogously 
as the inverse limit of the sets of infinite components obtained by deleting finite 
subsets C of the edge set of the graph X. The ends can also be defined as equiva- 
lence classes of one-way infinite paths: two such paths are equivalent if the deletion 
of no finite set of edges separates their infinite components. 

The ends of a finitely generated group are defined as the ends of its Cayley 
graphs. Different choices of finite sets of generators result in topologically equiv- 
alent sets of ends. Stallings (1971) contains important results on ends of groups. 


Proposition 3.31 (Hopf 1944). If X is vertex-transitive then it has 1 or 2 or infinitely
many ends. In particular, the same holds for finitely generated infinite groups.
Consequently, if X has more than 2 ends then it has exponential growth rate.


A vertex-transitive graph has two ends if and only if it has linear growth rate.
Groups with two ends have been fully characterized. 


Theorem 3.32 (Freudenthal 1945). A finitely generated infinite group G has two 
ends if and only if G has a finite normal subgroup N such that the quotient group 
G/N is either cyclic (Z) or dihedral (the free product Z_2 * Z_2).


Groups with infinitely many ends have also been characterized. We note that
they have exponential growth rates; the converse is false. Let A be a group, F a
subgroup, and φ: F → A an injective homomorphism. The HNN-extension
G = (A, F, φ) is generated by A and an additional element x subject to the
relations x^{-1} f x = φ(f) (f ∈ F).


Theorem 3.33 (Stallings 1971). A finitely generated group G has infinitely many
ends if and only if G is

(a) either a free product with amalgamated finite subgroup G = G_1 *_F G_2, where
F is a finite proper subgroup of each G_i and has index ≥ 3 in at least one of them;

(b) or an HNN-extension G = (A, F, φ), where F is a finite proper subgroup of
A.


These cases are closely related to group actions on trees. A theory of such 
actions was developed by Bass and by Serre (1980). We quote two special cases. 

The group G is said to act without inversions on a graph if no element of G 
inverts any edges. In other words, G preserves an orientation of the graph. 


Theorem 3.34 (Serre 1980, chapter 4). (a) Let G act edge-transitively but not
vertex-transitively on a tree T. Let P, Q be two adjacent vertices of T. Then G
is the free product of the stabilizers of P and Q amalgamated at their intersection.
(b) Every amalgam of two groups acts on a tree in this way.


Theorem 3.35 (Serre 1980, chapter 5.4). Let G act edge-transitively and
vertex-transitively but without inversions on a tree T. Then G is an HNN-extension of
the stabilizer of a vertex. Every HNN-extension acts on a tree in this way.


For the general structure theorem, see section 3.11. 
M.J. Dunwoody used Theorems 3.34 and 3.35 to give the following remarkable 
generalization of the Stallings characterization theorem (Theorem 3.33). 


Theorem 3.36 (Dunwoody 1982). Let G be a group acting on a connected graph
X with ≥ 2 ends. Then G is either an amalgam G = A *_C B or an HNN-extension
of a group C, where in each case C contains the stabilizer of two adjacent vertices
as a subgroup of finite index.


The proof is based on the construction of a tree T on which G acts without 
inversions and so that the quotient graph has a single edge. A key tool for the 
construction of T is the following surprisingly strong statement on the existence 
of cuts of very special kind. 


Theorem 3.37 (Dunwoody 1982). Let X = (V, E) be a connected graph with ≥ 2
ends. Then there exists a nonempty proper subset A ⊂ V such that

(a) the set of edges between A and V \ A is finite;

(b) for any g ∈ G, either A or V \ A is included in either A^g or in (V \ A)^g.


In this result, the graph X is not required to be locally finite. 


3.8. Isoperimetry, random walks, diameter 


The boundary of a subset U of the vertex set V of the graph X is the set ∂U
of vertices in V \ U, adjacent to at least one vertex in U. The isoperimetric ratio
of a set W ⊆ V is defined as e(W) = |∂W|/|W|. We say that W is ε-expanding
if e(U) ≥ ε for every U ⊆ W (U ≠ ∅). We call X an ε-expander if every subset
W ⊆ V with 1 ≤ |W| ≤ |V|/2 is ε-expanding. A “family of linear expanders” is an
infinite sequence of graphs of bounded degree which are ε-expanders for some
fixed ε > 0. (“Linear” refers to the O(|V|) bound on the number of edges.) Expanders
are treated in detail in chapter 32. Some Cayley graphs of linear groups
turn out to be particularly strong expanders (Lubotzky, Phillips and Sarnak 1988,
Margulis 1988).

Here we shall focus on more modest expansion properties shared by all
vertex-transitive graphs. The generality of the results is important in applications to the
analysis of algorithms in groups (cf. Babai 1991a).

It is easy to see that the diameter of an ε-expander on n vertices can be
bounded as


diam(X) ≤ ln n / ln(1 + ε). (4)


For ε < 4, we infer ε < 3 (ln n / diam(X)). It is remarkable that for vertex-transitive
graphs, this inequality is tight apart from a ln n factor.


Theorem 3.38 (Aldous 1987, Babai 1991b, Babai and Szegedy 1992). If X is a
vertex-transitive graph of diameter Δ then it is a 2/(2Δ + 1)-expander.
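The expansion bound can be tested exhaustively on small vertex-transitive graphs. The sketch below is illustrative only (the helper names and the choice of the 3-cube and the 8-cycle as examples are assumptions): it computes the diameter Δ by breadth-first search and the smallest isoperimetric ratio over all nonempty vertex sets of size at most |V|/2, then compares it with 2/(2Δ + 1).

```python
from fractions import Fraction
from itertools import combinations

def cube(d):
    """The d-cube: vertices 0..2^d - 1, adjacent iff they differ in one bit."""
    return {v: [v ^ (1 << i) for i in range(d)] for v in range(1 << d)}

def cycle(n):
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

def diameter(adj):
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = [s]
        for v in queue:            # the list grows while we iterate (BFS order)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

def expansion(adj):
    """min |boundary(U)| / |U| over nonempty U with |U| <= |V|/2."""
    V = list(adj)
    best = None
    for k in range(1, len(V) // 2 + 1):
        for U in combinations(V, k):
            inside = set(U)
            boundary = {w for v in U for w in adj[v]} - inside
            ratio = Fraction(len(boundary), k)
            if best is None or ratio < best:
                best = ratio
    return best

for graph in (cube(3), cycle(8)):
    delta = diameter(graph)
    print(delta, expansion(graph) >= Fraction(2, 2 * delta + 1))
# 3 True
# 4 True
```

The enumeration is exponential in the number of vertices, so it serves only as a check on toy examples.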


Aldous’ proof (for Cayley graphs) is based on the following observation, due to
Erdős and Rényi (1965) (cf. Babai and Erdős 1982), which we quote in a slightly
generalized form (Babai and Sós 1985, Cooperman, Finkelstein and Sarawagi
1990).


Proposition 3.39. Let G be a transitive group acting on a set V, |V| = n. Let A ⊆ V.
Then


(1/|G|) Σ_{g∈G} |A ∩ A^g| = |A|^2/n. (5)


(A set and its translates are “independent on average”.) 
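Identity (5) is easy to confirm numerically. In the sketch below, a group is represented as a list of permutations given as tuples (g[v] is the image of v); the two sample groups, the cyclic group acting on itself by translation and the full symmetric group on 4 points, are illustrative choices and both act transitively.

```python
from fractions import Fraction
from itertools import permutations

def average_overlap(group, A):
    """(1/|G|) * sum over g in G of |A ∩ A^g|, with A^g = {g[a] : a in A}."""
    A = set(A)
    total = sum(len(A & {g[a] for a in A}) for g in group)
    return Fraction(total, len(group))

n = 6
cyclic = [tuple((v + s) % n for v in range(n)) for s in range(n)]  # Z_6 on itself
symmetric = list(permutations(range(4)))                           # S_4 on 4 points

print(average_overlap(cyclic, {0, 1, 3}))   # 3/2  = 3^2 / 6
print(average_overlap(symmetric, {0, 1}))   # 1    = 2^2 / 4
```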

It follows by greedy selection that G has a transitive subgroup generated by at
most log_2 n + log_2 ln n + 1 elements (Babai and Sós 1985), and a set of O(log n)
random elements is likely to generate a transitive subgroup (Cooperman et al.
1990), a fact with implications for efficient manipulation of permutation groups (cf.
Babai, Cooperman, Finkelstein and Seress 1991).

Let now X = Γ(G, S) where S = S^{-1} generates G and assume diam(X) = Δ.
If |A| ≤ |G|/2 then by (5) there exists g ∈ G such that |A \ gA| ≥ |A|/2. Aldous
observes that g = h_1 ··· h_k for some k ≤ Δ, from which one concludes that
|h_i A \ A| ≥ |A|/(2Δ) for some h_i ∈ S, proving a 1/(2Δ) lower bound for Theorem
3.38. □


By Alon’s (1986) theorem (see chapter 32, Theorem 3.2) it follows that for
vertex-transitive graphs X of degree d and diameter Δ, the eigenvalue gap is
d − λ_2 ≥ 1/(2Δ + 2)^2, where λ_2 is the second largest eigenvalue of X.

This eigenvalue gap is significant in estimating the speed at which a random
walk over X approaches the uniform distribution. Let us consider a lazy random
walk on X, in which at every step we flip a coin; if it comes out heads, we do
not move, else we move to an adjacent vertex, each neighbor having equal chance
of being visited. (This trick eliminates potentially annoying negative eigenvalues
from the matrix of transition probabilities.) A direct consequence of the foregoing
considerations is the following rapid convergence (Aldous 1987, Babai 1991b).


Corollary 3.40. Let v_0, v_1 be vertices of a vertex-transitive graph of degree d and
diameter Δ with n vertices. After ℓ steps, the lazy random walk, starting at v_0, will
be at v_1 with probability (1/n)(1 + ε), where


ε ≤ n exp(−ℓ/(8d · (Δ + 2)^2)). (6)


In particular, if both d and Δ are bounded by (log n)^O(1), then so is the time it
takes for the lazy random walk to arrive at a nearly uniformly distributed place.
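The convergence can be observed directly on a small example. The sketch below is an illustration under stated assumptions: it uses the 3-cube (n = 8, d = 3, diameter 3), computes the exact distribution of the lazy random walk step by step, and reports the largest relative deviation from uniform, max over v of |n · Pr[walk at v] − 1|. The function names are hypothetical.

```python
def cube(d):
    """The d-cube: vertices 0..2^d - 1, adjacent iff they differ in one bit."""
    return {v: [v ^ (1 << i) for i in range(d)] for v in range(1 << d)}

def lazy_walk_distribution(adj, start, steps):
    """Exact distribution of the lazy random walk after `steps` steps."""
    dist = {v: 0.0 for v in adj}
    dist[start] = 1.0
    for _ in range(steps):
        nxt = {v: 0.5 * p for v, p in dist.items()}   # heads: stay put
        for v, p in dist.items():
            share = 0.5 * p / len(adj[v])             # tails: uniform neighbor
            for w in adj[v]:
                nxt[w] += share
        dist = nxt
    return dist

def relative_error(adj, start, steps):
    """max over v of |n * Pr[walk at v] - 1|."""
    n = len(adj)
    dist = lazy_walk_distribution(adj, start, steps)
    return max(abs(n * p - 1.0) for p in dist.values())

X = cube(3)
for steps in (5, 20, 60):
    print(steps, relative_error(X, 0, steps))
```

On this graph the deviation shrinks geometrically, far faster than the general guarantee, which must hold for every vertex-transitive graph of the given degree and diameter.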

For specific Cayley graphs (related, e.g., to card shuffling), different methods 
have been used to obtain strong estimates on the time it takes to reach near 
uniformity (Aldous 1983, Aldous and Diaconis 1987, Diaconis 1988). 

While results of this kind necessarily require the graph to have small diameter 
(cf. (4)), vertex-transitive graphs with large diameter, including infinite graphs, also 
possess a similar local expansion property. 


Theorem 3.41 (Local expansion, Babai 1991b, Babai and Szegedy 1992). Let X
be a connected (finite or infinite) vertex-transitive graph with vertex set V. Assume
that the finite subset U ⊆ V is within the ball of radius t about some vertex; and
|U| ≤ |V|/2. Then U is a 2/(2t + 1)-expanding set.


When X = Γ(G, S) is a Cayley graph, again a single generator is responsible:
|Ug \ U| ≥ |U|/(4t) for some generator g ∈ S (Babai 1991b).

This result is a tool in the rigorous analysis of efficient algorithms for permutation
groups (Babai, Cooperman, Finkelstein and Seress 1991). A further consequence
is that in vertex-transitive graphs, random walks do not get stuck in a
corner for too long. In the theorem below, X^k(v) denotes the ball of radius k about
vertex v, and we consider how soon a random walk, starting at v, may be expected
to be outside this ball.


Theorem 3.42 (Babai 1991b). Let v be the start vertex of a random walk over a
connected vertex-transitive graph X of finite degree d. Assume |X^{2k}(v)| ≤ |V|/2.
Let ℓ ≥ c k^2 d · ln |X^{2k}(v)|. Then with probability ≥ 1/2, at a random time chosen
uniformly from {1, 2, ..., ℓ}, the random walk will be outside X^k(v). (c is an
appropriate constant.)


This result is at the heart of an algorithm which, given a set of generators of 
a finite group G, constructs nearly uniformly generated random elements of G in 
O((log |G|)^5) group operations (Babai 1991b). Reducing the exponent 5 would be
of great significance, since many algorithms in computational group theory rely on
“randomly chosen” elements from the group (see, e.g., Neumann and Praeger 1992,
Beals and Babai 1993). The fast heuristics currently used to select such elements
have as yet resisted rigorous analysis (cf. Celler et al. 1995).

Random walks over locally finite infinite graphs such as the d-dimensional grid 
have been of great interest for their many applications which include approxima- 
tions for partial differential equations and curvature of Riemannian manifolds (see 
Kesten 1959 and the references in Thomassen 1990, Markvorsen et al. 1994). One 
of the basic qualitative properties of such graphs is whether they are recurrent 
(random walks return to their start with probability 1) or transient (with positive 
probability they never return). A classical result of Pólya (in 1921) (see Feller 1968,
vol. 1, 14.7) asserts that Z² is recurrent, while Z³ is transient. For connections of
this theory with electrical currents, see Doyle and Snell (1984), Thomassen (1990).
Expansion properties play a critical role in determining transience; if for some
fixed ε > 0 we have |∂U| > |U|^{1/2+ε} for every finite U ⊂ V then the graph is
transient (Varopoulos 1985, 1991). This result is tight in the sense that ε = 0 would
not suffice, as the plane grid Z² shows.

For Cayley graphs of finitely generated groups, transience/recurrence does not 
depend on the choice of the set of generators. Transience is inherited by subgroups 
of finite index; recurrence is inherited by all subgroups. 

Thomassen and Woess (1994) survey a large body of literature on related topics. 

Our next subject is the diameter of Cayley graphs (cf. Babai et al. 1990 for more 
references). A regular graph of degree r > 2 and diameter d has at most

$$ n \le 1 + r + r(r-1) + \cdots + r(r-1)^{d-1} = 1 + \frac{r\,\bigl((r-1)^{d} - 1\bigr)}{r-2} $$

vertices, hence d > log_{r−1}(n/3). The construction of Cayley graphs of given
degree and small diameter is motivated, among others, by interconnection
network design for parallel computer architectures. Bounds on the diameter with
respect to given generators are relevant for puzzles like Rubik’s cube: in this 
case, the question is the diameter of a specific Cayley graph of a group of order 
43 252 003 274 489 856 000 with respect to a set of 12 generators. (The diameter is
known to be ≥ 19, and a rigorous almost certain probabilistic proof exists that it is
no more than 36 (Fiat et al. 1989).)
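The counting bound above is easy to evaluate numerically. The following is a minimal sketch (the function names are illustrative and not taken from the text): it sums the bound term by term, compares it with the closed form, and finds the smallest diameter the bound permits for a given order and degree. It uses only the counting argument, so it says nothing about the true diameter of any particular Cayley graph.

```python
def moore_bound(r: int, d: int) -> int:
    """Largest vertex count permitted by the counting bound for degree r, diameter d."""
    # 1 + r + r(r-1) + ... + r(r-1)^(d-1), summed term by term
    return 1 + sum(r * (r - 1) ** i for i in range(d))


def closed_form(r: int, d: int) -> int:
    """Closed form 1 + r((r-1)^d - 1)/(r-2); the division is exact for r >= 3."""
    return 1 + r * ((r - 1) ** d - 1) // (r - 2)


def min_diameter(n: int, r: int) -> int:
    """Smallest d for which the counting bound allows n vertices."""
    d = 0
    while moore_bound(r, d) < n:
        d += 1
    return d


# Order of the Rubik's cube group, as quoted in the text, with 12 generators.
RUBIK_ORDER = 43_252_003_274_489_856_000
print(min_diameter(RUBIK_ORDER, 12))
```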

As noted above, expanders have diameter O(log n). For its simplicity and small
diameter, interconnection network designers favor a Cayley graph which is not
an expander: the cube-connected cycles. (Cf. Leighton 1992.) This is the Cayley
graph of the group Z_2 ≀ Z_s (of order n = s·2^s), with generators τ, ρ where τ is an
involution from the first copy of Z_2, and ρ is a rotation of order s, permuting the s
copies of Z_2. The vertices can be represented by (0, 1)-strings of length s with one
position marked. The neighbors of such a marked string are obtained by switching
the marked symbol, or moving the mark left or right by one position, viewing the
rightmost and leftmost positions adjacent. The graph has degree 3 and diameter
⌊5s/2⌋ − 2, whereas log_2 n = s + log_2 s.
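The marked-string description can be checked by brute force for small s. The sketch below is an illustration only: it encodes a vertex as a pair (bit string as an integer, marked position), an encoding chosen here for convenience. Because the graph is a Cayley graph, hence vertex-transitive, the eccentricity of a single vertex equals the diameter, so one breadth-first search suffices.

```python
from collections import deque


def ccc_neighbors(vertex, s):
    """Neighbors of a marked (0,1)-string of length s in the cube-connected cycles."""
    bits, pos = vertex
    yield (bits ^ (1 << pos), pos)    # switch the marked symbol
    yield (bits, (pos + 1) % s)       # move the mark one position one way
    yield (bits, (pos - 1) % s)       # move the mark one position the other way


def ccc_size_and_diameter(s):
    """Return (number of vertices, diameter), computed by BFS from one vertex."""
    start = (0, 0)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in ccc_neighbors(v, s):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return len(dist), max(dist.values())


for s in (4, 5, 6):
    print(s, ccc_size_and_diameter(s), (5 * s) // 2 - 2)
```

For the values s = 4, 5, 6 tried here, the vertex count is s·2^s and the computed diameter agrees with ⌊5s/2⌋ − 2.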

We note that not all groups of order n with k generators admit Cayley graphs
of degree O(k) and diameter O(log n). Groups with a bounded number of
generators and a nilpotent subgroup of bounded index and bounded class of nilpotence
require diameter n^c by Proposition 3.23(b) and (d). Using commutator collection,
Annexstein and Baumslag obtained the following explicit value.


Theorem 3.43 (Annexstein and Baumslag 1989). Let G be a group of order n with
a nilpotent subgroup of index t and class ℓ. If S is a set of k generators of G then
the diameter of Γ(G, S) is at least (n/t)^c, where


c= (kre)~*/2. 
(For abelian subgroups, ℓ = 1.)


Theorem 3.44 (Babai, Kantor and Lubotzky 1989). Every nonabelian finite simple
group G of order n has a set S of at most 7 generators such that the diameter of
Γ(G, S) is ≤ C log n for some absolute constant C.


The Cayley graphs constructed in the proof are unlikely to be expanders; it is 
not known whether an expander family of bounded degree Cayley graphs of the 
alternating groups exists, for instance. They have the advantage, however, that, 
given an element of G in the natural (matrix) representation of G, there is an 
efficient algorithm (polynomial in log n) to solve this “generalized Rubik’s cube”
puzzle, i.e. to compute a path of length O(log n) to the identity. (The explicit
expanders mentioned in chapter 32 give no clue how to find such a short path.)
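As an illustration, we describe the solution for the case G = SL(2, p). With the
generators

$$ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} $$

we obtain expanders but no explicit routing. Instead, we choose the generators

$$ C = \begin{pmatrix} 1/2 & 0 \\ 0 & 2 \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} $$

and A. It is easy to see that A and C rapidly generate all strict upper triangular
matrices because

$$ C^{-1} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} C = \begin{pmatrix} 1 & 4t \\ 0 & 1 \end{pmatrix}. $$

Conjugating by D we obtain transposes.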
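The routing idea for upper triangular matrices can be sketched in code. The example below is an illustration under stated assumptions, not the algorithm of the cited paper: it takes A = [[1, 1], [0, 1]], C = diag(1/2, 2) and D = [[0, 1], [−1, 0]] over Z_p (with p = 101 picked arbitrarily), uses the identity C⁻¹ U(t) C = U(4t) for U(t) = [[1, t], [0, 1]], and writes t in base 4 to obtain a word of length O(log p) in A, C and C⁻¹. The helper names are hypothetical.

```python
P = 101  # a small odd prime, chosen only for illustration


def mat_mul(x, y, p=P):
    """Product of two 2x2 matrices over Z_p."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )


INV2 = pow(2, -1, P)                 # 1/2 in Z_p (needs Python 3.8+)
A = ((1, 1), (0, 1))
C = ((INV2, 0), (0, 2))              # assumed diagonal generator, det = 1
C_INV = ((2, 0), (0, INV2))
D = ((0, 1), (P - 1, 0))             # assumed "transposing" generator
D_INV = ((0, P - 1), (1, 0))
LETTERS = {"A": A, "C": C, "c": C_INV}   # 'c' stands for C^{-1}


def upper(t, p=P):
    return ((1, t % p), (0, 1))


def word_for_upper(t):
    """Word in A, C, C^{-1} whose product is U(t), for an integer t >= 0."""
    if t == 0:
        return []
    digit, rest = t % 4, t // 4
    inner = word_for_upper(rest)
    # U(t) = A^digit * C^{-1} U(rest) C, since C^{-1} U(rest) C = U(4 * rest)
    return ["A"] * digit + (["c"] + inner + ["C"] if inner else [])


def evaluate(word):
    m = ((1, 0), (0, 1))
    for letter in word:
        m = mat_mul(m, LETTERS[letter])
    return m


print(word_for_upper(27), evaluate(word_for_upper(27)))
```

Each base-4 digit contributes at most three copies of A plus one conjugation, so the word length grows like log p rather than p; conjugating by D then carries these words to the lower triangular matrices.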

Kantor (1992) proves that for n > 10, the groups PSL(n, q) have trivalent Cayley
graphs of logarithmic diameter O(n² log q).

A particularly elegant construction of a Cayley graph of the symmetric group S_n,
having diameter < 6.75 n log_2 n, was given by Quisquater (1986) (cf. Babai, Hetyei,
Kantor, Lubotzky and Seress 1990).

If we admit a logarithmic number of generators, the situation becomes favorable
for every group. The following result is a consequence of Proposition 3.39 (cf. Babai
and Erdős 1982 for a short proof).


Theorem 3.45 (Erdős and Rényi 1965). Given a group G of order n, there exists a
set of k ≤ log_2 n + log_2 ln n + 1 elements S = {g_1, ..., g_k} ⊆ G such that every x ∈ G
is representable in the form

$$ x = g_1^{\varepsilon_1} g_2^{\varepsilon_2} \cdots g_k^{\varepsilon_k}, \qquad \varepsilon_i \in \{0, 1\}. $$

In particular, the diameter of Γ(G, S) is ≤ k.
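For cyclic groups a set with this covering property can be written down directly, which gives a feel for the statement. The sketch below is only an easy special case and does not reproduce the probabilistic argument behind the theorem: in the additive group Z_n it takes the powers of 2 as generators (an assumption made here for illustration) and checks by brute force that every element is a sum of a subset taken in the fixed order.

```python
from itertools import product


def subset_products_cyclic(n, gens):
    """All elements e_1*g_1 + ... + e_k*g_k (mod n) with each e_i in {0, 1}."""
    return {
        sum(e * g for e, g in zip(exps, gens)) % n
        for exps in product((0, 1), repeat=len(gens))
    }


def binary_generators(n):
    """Generators 1, 2, 4, ... of Z_n; k = ceil(log2 n) of them (at least one)."""
    k = max(1, (n - 1).bit_length())
    return [2 ** i % n for i in range(k)]


n = 12
gens = binary_generators(n)
print(gens, subset_products_cyclic(n, gens) == set(range(n)))
```

Here the binary expansion of x supplies the exponents, so k = ⌈log_2 n⌉ generators suffice for Z_n; the theorem asserts that a comparable number suffices in every group.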


In estimating the diameter of a Cayley graph, one faces much greater difficulties
if the generators are prescribed. We conjecture that for every finite simple group G
and every set S of generators, the diameter of Γ(G, S) is at most (log |G|)^c for some
absolute constant c. Even in the case of alternating, or, equivalently, symmetric
groups, this has only been verified in very special cases. For permutation groups,
the following are known.


Theorem 3.46. Let G ≤ S_n be generated by the set S. Then the diameter of Γ(G, S)
is not greater than:
(a) c_k n², if all members of S are cycles of lengths ≤ k (Driscoll and Furst 1987);
(b) c n^{ck}, if all members of S have degree ≤ k (McKenzie 1984);
(c) exp(√(n ln n) (1 + o(1))), if no assumption on S is made (Babai and Seress 1988).


The bound in (c) is asymptotically tight, as shown by the cyclic group generated
by the product of cycles of prime lengths 2, 3, ..., p_i where 2 + 3 + ··· + p_i ≤ n <
2 + 3 + ··· + p_{i+1}. Unfortunately, however, no better bound is known for G = S_n
either.
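The extremal example is simple to tabulate. In the sketch below (helper names are illustrative), the generator is a product of disjoint cycles of lengths 2, 3, 5, ..., so it generates a cyclic group whose order is the product of those primes; with a single generator and its inverse the Cayley graph is a cycle, whose diameter is half its length, rounded down.

```python
import math


def primes_with_sum_at_most(n):
    """Longest initial segment 2, 3, 5, ... of the primes whose sum is <= n."""
    primes, total, candidate = [], 0, 2
    while True:
        # Trial division by all smaller primes found so far.
        if all(candidate % q for q in primes):
            if total + candidate > n:
                return primes
            primes.append(candidate)
            total += candidate
        candidate += 1


def cyclic_example(n):
    """Cycle lengths, group order, and Cayley-graph diameter of the example in S_n."""
    primes = primes_with_sum_at_most(n)
    order = math.prod(primes)          # lcm of distinct primes is their product
    return primes, order, order // 2   # diameter of a cycle with `order` vertices


for n in (10, 17, 100):
    primes, order, diameter = cyclic_example(n)
    print(n, primes, order, math.log(max(diameter, 1)), math.sqrt(n * math.log(n)))
```

The last two printed columns put ln(diameter) next to √(n ln n); the agreement claimed in the text is asymptotic, so for small n the two values need not be close.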


We can do much better if the generators are chosen at random rather than 
adversarially. 


Theorem 3.47. Let σ, τ be two randomly selected permutations of a set of n elements
and G = ⟨σ, τ⟩.

(a) With probability 1 − O(1/n), A_n ≤ G ≤ S_n (Dixon 1969). (The error term is
from Babai 1989.)

(b) With probability 1 − o(1), the diameter of Γ(G, S), S = {σ, τ}, is at most
n^{(ln n)(1+o(1))/2} (Babai and Hetyei 1992).


To appreciate the difficulty of determining the exact diameter with respect to a 
given set of generators, we mention two results on the computational complexity 
of this problem. For a permutation group G ≤ S_n, it is NP-hard to determine
the diameter of Γ(G, S) even if G is an elementary abelian 2-group (Even and
Goldreich 1981). For Cayley digraphs of permutation groups, it is a PSPACE- 
complete problem to determine the directed distance of a given pair of group 
elements (Jerrum 1985). 


3.9. Automorphisms of maps 


Note. In this section, both finite and infinite graphs will be considered. All sur- 
faces (2-dimensional manifolds) considered are closed (without boundary) and 
compact, with the significant exception of the plane.

One has to make a distinction between the automorphism groups of graphs
embeddable on a surface Σ and the automorphism groups of the maps defined by
specific embeddings X ⊂ Σ.

Recall (cf. chapter 5) that a map is a graph X embedded on a surface Σ such
that the components of Σ \ X, the faces of the map, are homeomorphic to an
open disc. If the surface Σ is compact, X must be finite. Map-automorphisms
preserve incidences between edges and faces in addition to those between edges
and vertices. If v, e, and f denote the number of vertices, edges, and faces, resp.,
of a map on a compact surface then


v − e + f = χ          (7)


where χ = χ(Σ) denotes the Euler characteristic of Σ.

Recall that χ ≤ 2 is an integer. If Σ is orientable then χ is even; the quantity
g = 1 − χ/2 is the genus of Σ; and Σ is homeomorphic to the “sphere with g
handles”. If Σ is non-orientable then g′ = 2 − χ is its non-orientable genus; and
Σ is homeomorphic to the “sphere with g′ crosscaps”. Thus, orientability and the
Euler characteristic characterize all compact surfaces up to homeomorphism.

The compact surfaces of non-negative Euler characteristic are the following:
(a) orientable: the sphere (χ = 2) and the torus (χ = 0); (b) non-orientable: the
projective plane (χ = 1) and the Klein bottle (χ = 0).
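The bookkeeping in the last few paragraphs can be written out as a small sketch. The function names and the returned wording are illustrative choices, not notation from the text; the two sample maps are the cube on the sphere (8 vertices, 12 edges, 6 faces) and a one-vertex map with two loops and a single face on the torus.

```python
def euler_characteristic(v: int, e: int, f: int) -> int:
    """Euler characteristic of the surface carrying a map with v, e, f."""
    return v - e + f


def classify(chi: int, orientable: bool) -> str:
    """Name the compact surface with given Euler characteristic and orientability."""
    if chi > 2:
        raise ValueError("the Euler characteristic of a compact surface is at most 2")
    if orientable:
        if chi % 2:
            raise ValueError("orientable surfaces have even Euler characteristic")
        return f"sphere with {1 - chi // 2} handles"    # genus g = 1 - chi/2
    if chi > 1:
        raise ValueError("non-orientable surfaces have Euler characteristic at most 1")
    return f"sphere with {2 - chi} crosscaps"           # non-orientable genus 2 - chi


print(classify(euler_characteristic(8, 12, 6), orientable=True))   # cube map
print(classify(euler_characteristic(1, 2, 1), orientable=True))    # one-vertex torus map
print(classify(1, orientable=False))                               # projective plane
print(classify(0, orientable=False))                               # Klein bottle
```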

Map-automorphisms extend isomorphically to groups of homeomorphisms of
Σ, and conversely: every finite group G acting on a compact surface Σ acts as a
vertex-transitive group of automorphisms of some map. Unless Σ is the sphere, we
may require in addition that every face has at least 3 sides. (For instance, if G is
the trivial group and Σ is the torus, we shall have a single vertex with two loops,
creating a single four-sided face.)

Each non-orientable surface Σ_1 has an orientable double cover Σ_2 of Euler
characteristic 2χ(Σ_1). The action of any group G on Σ_1 can be lifted isomorphically
to an orientable action on Σ_2. The action of G on Σ_2 commutes with the
sense-reversing “antipodal map” which switches the pairs of preimages of the covering
map Σ_2 → Σ_1, hence G × Z_2 acts on Σ_2. These facts follow from the elements of
homotopy theory; cf. Tucker (1983, p. 96).

To understand finite group actions and maps on compact surfaces, we need 
to look at the three natural geometries: the sphere, the euclidean plane, and the 
hyperbolic plane. (These are the only simply connected 2-dimensional complete 
Riemannian manifolds of constant curvature.) 

Let G be a finite group of homeomorphisms of the compact surface Σ of Euler
characteristic χ. Then Σ admits a G-invariant Riemannian metric of constant
curvature. The curvature will be positive, zero, or negative according to the sign
of χ(Σ). This makes our surface Σ locally isometric to the corresponding natural
geometry.

Moreover, if M is a vertex-transitive map on Σ, invariant under G, without
one-sided or two-sided faces, then the metric can be chosen so as to make all edges
geodesic and all faces regular. (Cf. Jones and Singerman 1978, and the proof of
Zieschang et al. 1980, Theorem 6.4.7.)

More about the groups and the maps on Σ can be found out by lifting them to Σ̃,
the universal covering space of Σ, which is the natural geometry locally isometric
to Σ.


We define a crystallographic group of a natural geometry as a discrete group of
isometries with compact fundamental domain.¹


Theorem 3.48. Let G be a finite group acting on the compact surface Σ. Then G
lifts to a crystallographic group G̃ of the natural geometry of its universal cover
(sphere, euclidean plane, or hyperbolic plane).


(Cf. Zieschang et al. 1980, Theorem 6.4.7.) The fundamental group π_1(Σ) is
normal in G̃ and G̃/π_1(Σ) ≅ G.

An Archimedean tiling of a natural geometry is a map of which each face is a 
regular polygon and the map admits a vertex-transitive group of isometries. 


Theorem 3.49. A vertex-transitive map M on a compact surface Σ lifts to an
Archimedean tiling of the natural geometry of the universal covering surface Σ̃.


(Cf. the proof of Zieschang et al. 1980, Theorem 6.4.7.) 

One can classify the crystallographic groups of the three natural geometries via 
canonical codes; each code is associated with a presentation in terms of generators 
and relations derived from a pair of dual maps. If two such groups are isomor- 
phic as abstract groups then their isomorphisms are also geometrically realizable 
(Wilkie 1966, Macbeath 1967, cf. Zieschang et al. 1980, Theorems 4.5.6-4.7.1). 

The crystallographic groups of the sphere are finite; they are listed in section 1.4. 
There are 18 individual types and two infinite one-parameter families of vertex- 
transitive maps on the sphere, corresponding to the Platonic and Archimedean 
solids and the families of prisms and antiprisms. 

By the foregoing remarks, we obtain that the finite group actions on the projec- 
tive plane are precisely the actions, on the pairs of antipodal points, of the finite 
rotation groups of the sphere. 


Vertex-transitive maps on the projective plane correspond to centrally symmetric
vertex-transitive maps on the sphere and are obtained from them by identifying
antipodes.

When χ(Σ) = 0, Theorem 3.48 relates G to the classical crystallographic groups
of the euclidean plane. These were classified in the last century (Fedorov 1891).
There are (up to natural equivalence) 17 of them (see Coxeter and Moser 1972, 
p. 44). Each crystallographic group G is equivalent to a group of isometries of the 
plane acting transitively on the points of a regular triangular, square, or hexagonal 
grid. It follows that the index of G is not greater than 12, 8, and 6, resp., in the full 
group of symmetries of the corresponding grid. Furthermore, G contains a normal 
subgroup N generated by two linearly independent translations, and the quotient 
G/N is a subgroup of the dihedral group of degree 6 or 4. 


¹ This deviates from common usage in the hyperbolic case where compactness is usually not required.

Every normal subgroup H of G generated by two linearly independent
translations gives rise to a unique action of G/H on the torus R²/H; this observation
describes all finite-group actions on the torus. (Note, in particular, that all these
groups are solvable.)

The situation with the Klein bottle is similar except that the normal subgroup
H ◁ G must be generated by a translation and a glide-reflection, i.e. a translation
followed by a reflection in an axis parallel to the direction of the translation. This
implies severe restrictions on G. One can prove, in particular, that the square of
any rotation belongs to H, and the subgroup T ≤ G of translations has a subgroup
T_1 of index ≤ 2 such that T_1/(T_1 ∩ H) is cyclic.


Corollary 3.50. (a) Let G be a finite group acting on the torus. Then G has an
abelian normal subgroup N with ≤ 2 generators such that G/N ≅ Z_k or D_k, k = 6
or k ≤ 4.

(b) Let G be a finite group acting on the Klein bottle. Then G ≅ Z_n, D_n, Z_{2n} × Z_2,
or D_n × Z_2.


There are 11 types of Archimedean tilings of the euclidean plane (see fig. 1). 
Each of the tilings gives rise to a 2-parameter family of vertex-transitive toroidal 
maps. 

The vertex-transitive maps without one-sided and two-sided faces on the Klein 
bottle form 13 families corresponding in different ways to 6 out of the 11 vertex- 
transitive euclidean tilings; each of them have “width 4” in the sense that all 
vertices belong to 4 straight lines parallel to the glide-reflection axis on the Klein 
bottle (Thomassen 1991, Babai 1991c). In a similar sense, the degenerate cases 
have “width” 2 or 1 and are also known. 

When χ(Σ) < 0, the finite groups acting on Σ are quotients of discrete subgroups
of PGL(2,R), the isometry group of the hyperbolic plane. A classical theorem of
Hurwitz (1893) indicates a drastic change compared to the case χ ≥ 0.


Theorem 3.51 (Hurwitz). If the finite group G acts on the compact surface Σ of
Euler characteristic χ < 0 then |G| ≤ 84|χ|.


For a proof when Σ is orientable, see, e.g., Gross and Tucker (1987 p. 496). The
general case follows by the foregoing remarks. There are infinitely many values of
χ where the bound 84|χ| is attained (Conder 1980).

The following is a combinatorial generalization of Hurwitz’s Theorem. With 
each vertex of a map we associate a cyclically ordered list containing the number 
of sides of each face incident with the vertex. We call a map semiregular if the 
cyclic list associated with each vertex is the same (up to inversion). 


Theorem 3.52 (Babai 1991c). Let M be a semiregular map on a compact surface
of Euler characteristic χ < 0. Then M has at most 84|χ| vertices.


Each homeomorphism φ of a compact orientable surface Σ of genus g induces an
automorphism φ_* of the first homology group H_1(Σ) ≅ Z^{2g} which preserves a
skew-symmetric bilinear form H_1(Σ) × H_1(Σ) → Z, defined by the intersection numbers
of curves, cf. Zieschang et al. (1980, Proposition 3.6.3). Another result of Hurwitz
(1893) states that if φ has finite order and g ≥ 2 then φ_* ≠ 1. Consequently, if G
is a finite group of homeomorphisms of Σ then G is isomorphic to a subgroup
of Sp(g, Z), the group of 2g × 2g integral symplectic matrices (cf. Zieschang et al.
1980, Corollary 4.15.3, Biggs 1972). This result also holds for the homology groups
mod n for any n ≥ 3 (Serre in 1960, cf. Zieschang et al. 1980, Corollary 4.15.15).

Figure 1. The 11 types of Archimedean tilings of the euclidean plane (one of them shown in two
mirror-symmetrical forms). After Grünbaum and Shephard (1981 p. 144).


3.10. Embeddings on surfaces, minors 


Note. All graphs and groups in this section are finite, except in the last paragraphs 
(beginning after Theorem 3.59). 


Now we turn to the question of classifying the connected vertex transitive graphs 
embeddable on a given surface. If an embedding of the graph X on the surface Σ
creates a map and all automorphisms of X extend to map-automorphisms then we 
call the embedding automorphic. The main difficulty is that embeddings are seldom 
automorphic. Some of the most surprising results in the area infer the existence 
of automorphic embeddings from seemingly unrelated asymptotic combinatorial 
assumptions. 

Apart from cycles, vertex-transitive graphs have degree ≥ 3 and are therefore
3-connected. For planar graphs this implies unique embeddability on the sphere,
hence those embeddings are automorphic, and the list of 18 types plus two infinite
one-parameter families mentioned above applies.

For no other surface Σ have the vertex-transitive graphs embeddable on Σ been
fully classified. Interest in embedding Cayley graphs on surfaces has been moti- 
vated since the last century by the following observation: Cayley graph embeddings 
help in finding presentations (generators and relations) for G. 


Proposition 3.53. Let X = Γ(G, S) be embedded on Σ. Let the cycles C_1, ..., C_n of
X through 1 generate the fundamental group of Σ. Let further D_1, ..., D_{f−1} denote
all but one of the fundamental cycles (face boundaries) of the embedding. Then the
C_i and the D_j, regarded as words in the symbols S ∪ S^{-1}, form a complete set of
relations defining G. If the map is vertex-transitive, only those D_j passing through
1 have to be taken.


Proof. Every cycle in the Cayley graph indicates a valid relation among the
generators. We have to show that every cycle A represents a consequence of the
relations listed. We may assume A passes through 1. Then A, as a path on Σ, is
homotopic to some product P of the C_i. It follows that AP^{-1} is contractible and
therefore representable as a product of the D_j. □


Maschke (1896) determined all planar minimal Cayley diagrams. (Recall that we
call Γ(G, S) and Γ_c(G, S) minimal if S generates G with no redundant elements.)
Nonplanar toroidal minimal Cayley diagrams have been classified by Proulx. Her
list contains 11 infinite classes with two generators, 9 infinite classes with 3
generators, 1 infinite class with 4 generators, and 9 sporadic cases (8 with 2 generators,
1 with three). We state the main consequence.


Theorem 3.54 (Proulx 1977). All but 3 of the groups admitting a toroidal but
no planar Cayley graphs are quotient groups of euclidean 2-dimensional
crystallographic groups and therefore they actually admit automorphically embedded
toroidal Cayley graphs.


The precise set of 3 exceptions (of orders 24, 48 and 48) has been determined by
Tucker (1984). Using Proulx’s analysis in great detail, Tucker went on to prove
an extension of Hurwitz’s theorem to Cayley graphs embeddable on surfaces of
negative Euler characteristic.


Theorem 3.55 (Tucker 1984). Let G be a group of order n and Γ(G, S) a minimal
Cayley graph of G, embeddable on a surface Σ of Euler characteristic χ < 0 but
not embeddable on the torus. Then |G| ≤ 84|χ|.


We indicate some of the basic tricks of the Proulx-Tucker theory on a very 
simple special case. 


Proposition 3.56. Let G be a group of order n where GCD(n, 6) = 1. Assume G
has a minimal Cayley graph X embeddable on a surface Σ of Euler characteristic
χ. If n > −5χ, then G is abelian with two generators and X is toroidal.


Proof. Let X = Γ(G, S) have degree d ≥ 3. Since G has no elements of order 3,
the girth of X is ≥ 4. Now X has n vertices, e = nd/2 edges, and f ≤ nd/4 faces.
Substituting into the Euler equation (7) we obtain

$$ n(1 - d/4) \ge \chi. $$

If d ≥ 5, we infer n ≤ −4χ. If d = 3 then one of the generators would have to be
an involution, impossible. The only remaining case is d = 4; hence e = 2n and S
consists of 2 elements: S = {a, b}.

Assume first that the girth of X is ≥ 5. Let f_i denote the number of i-sided faces.
We then have 2e = Σ_{i≥5} i f_i and f = Σ_{i≥5} f_i ≤ 2e/5 = 4n/5. Hence

$$ \chi = n - e + f \le n - 2n + 4n/5 = -n/5. $$

We conclude that n ≤ −5χ, thus finishing this case.

We may henceforth assume that the girth of X is 4. By minimality, the implied
relation of length 4 must be of one of the following types: (a) a⁴ = 1; (b) abab = 1;
(c) aba^{-1}b = 1; (d) a²b² = 1; (e) aba^{-1}b^{-1} = 1.

Since G has odd order and S is minimal, only case (e) can actually occur. But
then, G is abelian with two generators, hence it is toroidal. □


While the arguments that count degrees and use the Euler equation generalize
to arbitrary vertex-transitive graphs, the “relation chasing” that concluded the
proof has no analogue. Arguments of a more geometric flavor, however, yield the
following.


Theorem 3.57 (Thomassen 1991, Babai 1991c). There exists a function f such that
every connected vertex-transitive graph X with more than f(χ) vertices and
embeddable on a surface of Euler characteristic χ admits an embedding as a
vertex-transitive map on a surface of nonnegative Euler characteristic.


The function f is bounded by c|χ| where c is an absolute constant (Thomassen
1991). With the exception of 4 families of “crossed stripe-like” graphs (Babai 
1991c), the embeddings guaranteed by the theorem are automorphic. 

Embeddings of specific Cayley graphs, in particular of complete graphs viewed 
as Cayley graphs of cyclic groups, have been studied extensively. The original 
motivation for this was the solution, due mainly to Ringel and Youngs (1968), of 
the Heawood map color conjecture. (For details and references, we refer to the 
monograph of Gross and Tucker 1987.) Subsequently, the following concept gained 
popularity. 


Definition. The genus of a (finite) group G is the minimum of the genera of those
orientable compact surfaces Σ on which some connected Cayley graph of G is
embeddable. The non-orientable genus of G is the minimum of (2 − χ(Σ)) over
the corresponding not necessarily orientable surfaces Σ.


Both the orientable and non-orientable genera are monotone for subgroups 
(Babai 1977a). (This follows immediately from Proposition 3.58 below.) It is an 
open question whether or not the same holds for quotient groups, as conjectured 
by White (1973). Jungerman and White (1980) were able to determine the precise 
genus for surprisingly large classes of abelian groups, demonstrating that those 
groups admit embeddings with quadrilateral faces. The situation becomes more 
complicated when Z_3 factors are present and triangular faces may arise.

Contractions tend to simplify the topological characteristics of a graph. Signifi- 
cantly, they can be related to group actions. 


Proposition 3.58 (The “Contraction lemma”). If the group G acts semiregularly
on the connected graph X then X has a contraction to some Cayley graph of G.
In particular, if G ≤ H then every Cay[…] Cayley graph of G (Serre 1977, Babai […]


(Semiregular action means that the si[…] immediate consequence is the Nielsen[…]
groups are free.

The Hadwiger number of a (not ne[…] of those values k such that some con[…]
The Hadwiger number of graphs embe[…] for the torus, this bound is 7). The cor[…]
give an asymptotic classification of all […] Hadwiger numbers.


Theorem 3.59 (Babai 1994). There exi[…] nected vertex-transitive graph X of H[…]
admitting an automorphic embedding on the torus, or (b) ring-like in the following
sense: V(X) has a partition (V_0, ..., V_{m−1}) into blocks of imprimitivity such that
(b1) |V_i| < f(k); (b2) if there is an edge between V_i and V_j then |i − j| < f(k) or
m − |i − j| < f(k); (b3) the action of Aut X on the set of blocks is either cyclic or
dihedral.


The proof requires the study of the local structure of the graphs via an infinite 
vertex-transitive “limit graph” (cf. Babai 1991c) and distinguishes cases according 
to the number of ends of the limit graph, using Proposition 3.31. The case of 
infinitely many ends is disposed of using a sphere packing argument (Babai 1991c) 
motivated by Thomassen’s proof that graphs of degree > 3 and large girth have 
large Hadwiger number (Thomassen 1983).

The case of two ends yields ring-like graphs, using Dunwoody’s theorem 3.37.
The hard case is when the limit has a single end. The analysis requires the following
result.


Theorem 3.60 (Thomassen 1992). Let X be an infinite locally finite connected
vertex-transitive graph with a single end. If X has finite Hadwiger number then
X is planar.


Such an infinite graph, then, can be shown to have a natural associated geometry
(along the lines of Theorem 3.49, cf. Babai 1994):


Theorem 3.61. Let X be an infinite locally finite connected vertex-transitive planar 
graph with a single end. Then X has an automorphic embedding as a tiling of the 
euclidean or hyperbolic plane. 


Returning to the sketch of the proof of Theorem 3.59, we observe that euclidean
tilings give rise to toroidal graphs. Hyperbolic tilings lead to finite graphs of large
Hadwiger number, via another sphere packing argument, using the elements of
hyperbolic geometry.


3.11. Combinatorial group theory 


Combinatorial group theory investigates presentations of groups defined in terms
of generators and relations. Typical constructions in this field are the free product
with amalgamation, and HNN-extensions (cf. section 3.7). One of the classical results
is the Nielsen—Schreier theorem that every subgroup of a free group is free. This, 
incidentally, follows immediately from the Contraction lemma (Proposition 3.58). 
Indeed, among the groups with no elements of order 2, precisely the free groups 
have trees for Cayley graphs; and a contraction of a tree is a tree again. 

There is no way we could do justice to this vast area here; the reader is referred
to the monographs by Coxeter and Moser (1972), Magnus et al. (1966), Lyndon
and Schupp (1977), Serre (1980), Dicks and Dunwoody (1989). The elementary
graph theoretic approach to classical subgroup theorems is emphasized in Imrich’s
(1977) friendly notes. When proving subgroup theorems such as Kurosh’s theorem
stated at the end of this section, the basic geometric object to consider is the
quotient of a Cayley diagram of the group G by the action of the subgroup H
(Schreier coset diagram). An example of an interesting result in this area proved
by an elementary graph theoretic argument is Howson’s theorem: the intersection
of two finitely generated subgroups of a free group is finitely generated (Imrich
1977, Dicks and Dunwoody 1989, 1.8, Tardos 1992).

Some of the results mentioned earlier in this chapter belong to combinatorial 
group theory (e.g., Proposition 3.53 or Dunwoody’s theorem 3.36). 

A relatively recent highlight is the Bass–Serre theory of group actions on trees.
It introduces a construction called a graph of groups in which a group G(v) is
assigned to each vertex v of a directed graph and a subgroup G(v, w) ≤ G(v) to
every directed edge (v, w), along with an injective homomorphism φ_{v,w}: G(v, w) →
G(w). The fundamental group of a graph of groups is defined as a group generated
by the disjoint union of the G(v) along with one symbol t_{v,w} for every edge (v, w),
subject to the relations defining G(v) for each v, and the relations
t_{v,w}^{-1} g t_{v,w} = g^{φ_{v,w}}
for each edge (v, w) and element g ∈ G(v, w). (Note that therefore g ∈ G(v) and
g^{φ_{v,w}} ∈ G(w).) Moreover, we select an arbitrary maximal subtree of the graph, and
set t_{v,w} = 1 for every edge (v, w) in the tree.

Observe that if the graph consists of a single directed edge (v,w) then the 
fundamental group will be the free product of G(v) and G(w), with the subgroup 
G(v, w) amalgamated. If the graph has a single vertex v with a loop (v,v) then the 
fundamental group is the HNN-extension (G(v), G(v, v), t_{v,v}). This is a restatement
of Theorems 3.34 and 3.35. 

Let now G be a group acting on a tree T without inverting edges. Then the Bass— 
Serre structure theorem asserts that G is isomorphic to the fundamental group of 
a graph of groups, where the graph is the quotient graph of T by the action of G 
(Serre 1980, section 1.5.4, section I.4). 

Among the immediate consequences is Kurosh’s classical subgroup theorem, 
asserting that a subgroup of a free product of the groups G; is a free product of a 
free group and conjugates of subgroups of the G;. 


3.12. Eigenvalues 


Let α: G → C be a function, and consider the n × n matrix A = (a_{g,h}), whose rows
as well as columns are labeled by the elements of G (in the same order, n = |G|),
and

$$ a_{g,h} = \alpha(g h^{-1}). $$

We can think of α as a “color assignment” to the elements of G; thus A is the
adjacency matrix of a Cayley color diagram. We call A a G-circulant, since in the
case G = Z_n we obtain precisely the circulant matrices.

In the circulant case, det A has a well-known expansion into linear factors. Let
ω denote a primitive nth root of unity; then the vectors w_i = (1, ω^i, ω^{2i}, ..., ω^{(n−1)i})
form a system of orthogonal eigenvectors of A, with corresponding eigenvalues

$$ \lambda_i = \sum_{k} \alpha(k)\,\omega^{ik}. \qquad (8) $$

The determinant of A is ∏_i λ_i.

Examining the expansion of det A for the dihedral groups, Dedekind noticed
that (viewing each value α(g) as an independent variable) most irreducible factors
were no longer linear but quadratic, and called on Frobenius in a letter to
investigate the general case. Frobenius soon found a wealth of structure; his paper “Über
die Primfactoren der Gruppendeterminante”, presented to the Prussian Academy
of Sciences in 1896, laid the foundations of character theory for nonabelian groups.

A consequence of this theory is that, denoting the dimensions of the irreducible 
characters χ_1, ..., χ_h of G by n_1, ..., n_h (h is the number of conjugacy classes in G, 
and Σ_i n_i^2 = n), the eigenvalues of any G-circulant can be assigned to irreducible 
characters in the following way: n_i^2 eigenvalues correspond to character χ_i; these 
fall into n_i groups of equal size, and all the n_i eigenvalues within a group are equal. 
Moreover, the sum of the potentially different n_i eigenvalues λ_{i,1}, ..., λ_{i,n_i} (one 
from each group of n_i) belonging to χ_i is 

    λ_{i,1} + ⋯ + λ_{i,n_i} = Σ_{g∈G} α(g) χ_i(g).    (9) 


(See Babai 1979d.) In particular, if G is abelian, then each n_i = 1, and the 
expression simplifies to 

    λ_i = Σ_{g∈G} α(g) χ_i(g),    (10) 


a direct generalization of the circulant case (eq. (8)). 

As an example, let X = X(n, k) denote the distance-k graph of the n-dimensional 
cube. Let A be an n-set and let us represent the elements of the n-cube by subsets 
of A. With the operation of symmetric difference, this set is the elementary abelian 
group ℤ_2^n and X = Γ(ℤ_2^n, S_k) where S_k is the set of all k-subsets of A. Characters 
χ_T: ℤ_2^n → {±1} are associated with subsets T ⊆ A via the rule χ_T(B) = (−1)^{|B∩T|}. 
The corresponding eigenvalue of X is λ_T = Σ_{B∈S_k} (−1)^{|B∩T|} = K_k(|T|), where 

    K_k(x) = Σ_{i=0}^{k} (−1)^i binom(x, i) binom(n − x, k − i)    (11) 

is the Krawtchouk polynomial (cf. Bannai and Ito 1984, section 3.2). 
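
The cube example can be verified by brute force for small n. The sketch below is our own illustration; encoding subsets of the n-set as bitmasks and the function names are assumptions. It applies the adjacency operator of the distance-k graph to each character χ_T and compares the result with K_k(|T|) χ_T.

```python
from itertools import combinations
from math import comb

def krawtchouk(n, k, x):
    """K_k(x) = sum_{i=0}^{k} (-1)^i C(x, i) C(n - x, k - i)."""
    return sum((-1) ** i * comb(x, i) * comb(n - x, k - i) for i in range(k + 1))

def popcount(m):
    """Size of the subset encoded by the bitmask m."""
    return bin(m).count("1")

def characters_are_eigenvectors(n, k):
    """Check that every character chi_T is an eigenvector of X(n, k) with eigenvalue K_k(|T|)."""
    # S_k: all k-subsets of an n-set, encoded as bitmasks.
    S_k = [sum(1 << e for e in c) for c in combinations(range(n), k)]
    for T in range(1 << n):
        lam = krawtchouk(n, k, popcount(T))
        for B in range(1 << n):
            # Neighbors of B are the symmetric differences B ^ s with s in S_k.
            lhs = sum((-1) ** popcount((B ^ s) & T) for s in S_k)
            if lhs != lam * (-1) ** popcount(B & T):
                return False
    return True

print(characters_are_eigenvectors(5, 2))
```

For k = 1 the polynomial reduces to K_1(x) = n − 2x, the familiar spectrum of the n-cube itself.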

A more general class of G-circulants admitting an explicit expression of their 
eigenvalues is obtained when α is a class function, i.e. α is constant on conjugacy 
classes. (In other words, α(gh) = α(hg) for every g, h ∈ G.) In this case 
all the n_i^2 eigenvalues belonging to χ_i are equal, and hence their common value 
is λ_i = (1/n_i) Σ_{g∈G} α(g) χ_i(g) and the matrix A is diagonalizable (via a unitary 
transformation). 

It follows from the above that if the set S of generators is closed under conjugation 
then the adjacency matrix of the Cayley digraph Γ(G, S) is diagonalizable. 
This is not true for general S; Godsil (1982) has shown that the minimal polynomial 
of any integral matrix divides the minimal polynomial of some Cayley digraph. 

Cayley graphs of cyclic groups of prime order are determined up to isomorphism 
by their characteristic polynomials (Elspas and Turner 1970). This is not true for 
general groups; families of isospectral Cayley graphs of the dihedral groups of all 
odd prime degrees are exhibited in Babai (1979d). 

The results discussed above belong to the harmonic analysis over G. For an 
exposition and a variety of applications (especially to random walks), see Diaconis 
(1988, chapter 3; 1989), Chillag (1988). 

For the extensive literature on the harmonic analysis over locally finite infinite 
graphs we refer to the survey Mohar and Woess (1989). 


4. The representation problem 


The material of this section is covered in greater detail in the survey paper by 
Babai (1981b) where additional references and in many cases complete proofs can 
be found. 


4.1. Abstract representation; prescribed properties 


In this section we consider the following type of problem: given a group G, find 
a graph X (or a block design, a lattice, a ring, etc.) such that the automorphism 
group Aut(X) is isomorphic to G. Such an object X will be said to represent the 
group G. A class 𝒞 of objects is said to represent a class 𝒢 of groups if, given 
G ∈ 𝒢, there exists X ∈ 𝒞 such that Aut(X) ≅ G. We call 𝒞 universal if every 
group is represented by 𝒞. We say that 𝒞 is finitely universal if every finite group 
occurs among the groups represented by finite members of 𝒞. 

The natural question, which groups are represented by graphs, was stated by 
König (1936), and soon answered by Frucht. 


Theorem 4.1 (Frucht 1938). Given a finite group G there exists a finite graph X 
such that Aut(X) ≅ G. In other words, graphs are finitely universal. 


Frucht’s proof has been reproduced in several texts (Ore 1962, Harary 1969, 
Lovász 1979a, Bollobás 1979). The idea is (i) to observe that the automorphism 
group of the (colored, directed) Cayley diagram of G with respect to any set of 
generators is isomorphic to G; (ii) to get rid of colors and orientation by replacing 
colored arrows by appropriate small asymmetric (automorphism free) gadgets. 

The next problem was to find subclasses of graphs and classes of other (combina- 
torial, algebraic, topological) objects that are universal. This direction was initiated 
by Frucht and Birkhoff. Frucht (1949) proved that trivalent graphs are finitely uni- 
versal. It is immediate from Theorem 4.1 that posets are finitely universal. Since 
posets are strongly reconstructible from their lattice of ideals (as the poset of join- 
irreducible elements), it follows that distributive lattices are finitely universal (as 
well as universal, Birkhoff 1945). 

These results already foreshadow the lopsidedness of later developments. Take 
almost any “reasonably broad” class of combinatorial or algebraic structures; the 
class will be universal. (Groups and planar graphs are notable exceptions.) This 
“universality phenomenon” was first indicated by Sabidussi (1957), who proved that 
Hamiltonicity, k-regularity, k-connectedness are all compatible with any prescribed 
automorphism group. Universality results in topology and algebra were inspired by 
De Groot’s (1958, 1959) papers, where topological spaces and commutative rings 
were shown to be universal. A surprisingly strong version of the latter result was 
given by E. Fried and J. Kollar. 


Theorem 4.2 (Fried and Kollar 1978, 1981). Every group is the automorphism 
group of a field. Every finite group is the automorphism group of an algebraic 
number field. 


(Algebraic number fields are finite extensions of Q.) The proof takes a graph 
X with the given automorphism group and encodes it into a field (not without 
ingenuity). This is the basic scheme of most universality proofs. 

The extensions constructed by Fried and Kollar are not normal. Therefore their 
result does not bear on the inverse problem of Galois theory (represent a given 
group as a Galois group over a given field; notably, over ℚ). We note in passing 
that the inverse problem has had its renaissance in the past decade, inspired by 
Thompson’s (1984) new approach (cf. Feit 1989, Matzat 1987, several articles in 
Aschbacher et al. 1985). One of Thompson’s corollaries states that the Monster, 
the largest sporadic simple group, is a Galois group over Q. 

Of the numerous combinatorial universality results, let me quote two of the 
more surprising ones. 


Theorem 4.3 (Mendelsohn 1978a,b). Every finite group is the automorphism group 
of (a) a finite Steiner triple system and a finite Steiner quadruple system; (b) a finite 
strongly regular graph. 


Universality proofs usually require reconstruction arguments. To illustrate this 
point, we deduce Mendelsohn’s result (b) from (a). Let X be a Steiner triple system 
with the prescribed automorphism group G. Take its line graph L(X). L(X) is 
strongly regular, and, according to Theorem 1.12, X is strongly reconstructible 
from L(X), assuming X has > 15 vertices. In particular, Aut(X) ≅ Aut(L(X)) 
(Corollary 1.13(a)). □ 


The automorphism group is very sensitive to slight changes in the graph. It is 
known, for instance, that for any pair of groups G and H there exists a graph 
X and an edge e ∈ E(X) such that Aut(X) ≅ G and Aut(X \ e) ≅ H (Babai, see 
Lovász 1979a, Example 12.11). 

It is typical for universality proofs that the group structure plays a small role. 
The extent to which group structure can be ignored is demonstrated by general- 
izations to prescribability of semigroups of endomorphisms and even categories, 
pioneered by the Prague category theory school, especially Pultr and Hedrlín. A 
homomorphism of the graph X to the graph Y is an adjacency preserving map 
V(X) → V(Y). Note that non-adjacent vertices may have the same image. 
Endomorphisms of a graph X are homomorphisms X → X. They form the monoid 
End(X). (Monoid = semigroup with identity.) The basic result is that every monoid 
is the endomorphism monoid of some graph (finite graphs for finite monoids) 
(Hedrlín, Pultr, Vopěnka). By encoding graphs, many classes of algebraic and 
topological structures have been shown to have the same property (see the monograph 
by Pultr and Trnková 1980). A nice introduction to the subject is Hedrlín and 
Lambek (1969). 
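
To make the definition concrete, the following brute-force Python sketch (an illustration of ours; the function names and the two sample graphs are assumptions) enumerates all endomorphisms of a small graph and separates the automorphisms among them. For the path on three vertices there are non-bijective endomorphisms, while for the 5-cycle every endomorphism turns out to be an automorphism.

```python
from itertools import product

def endomorphisms(n, edges):
    """All maps f: {0..n-1} -> {0..n-1} sending every edge to an edge (adjacency preserving)."""
    adjacent = set(edges) | {(b, a) for a, b in edges}
    return [f for f in product(range(n), repeat=n)
            if all((f[a], f[b]) in adjacent for a, b in edges)]

def automorphisms(n, edges):
    """For a finite graph, the bijective endomorphisms are exactly the automorphisms."""
    return [f for f in endomorphisms(n, edges) if len(set(f)) == n]

path3 = [(0, 1), (1, 2)]                       # path 0 - 1 - 2
cycle5 = [(i, (i + 1) % 5) for i in range(5)]  # the 5-cycle

for name, n, edges in [("path on 3 vertices", 3, path3), ("5-cycle", 5, cycle5)]:
    print(name, len(endomorphisms(n, edges)), len(automorphisms(n, edges)))
```

The search runs over all n^n maps, so it is only meant for very small graphs.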

Universality-type results are known for some classes of structures that are clearly 
not universal. 


Theorem 4.4. (a) The automorphism groups of (finite) tournaments have odd 
order, and every finite group of odd order is represented by a tournament (Moon 
1964). 

(b) G is the automorphism group of a switching class of tournaments if and only 
if its Sylow 2-subgroups are cyclic or dihedral (Babai and Cameron 1994a). (Two 
tournaments T_1, T_2 on the common vertex set V are switching equivalent if V can be 
partitioned into two classes such that one obtains T_2 from T_1 by reversing all edges 
between the two classes. This equivalence relation divides the set of tournaments on 
V into switching classes.) 

(c) Denote by Γ_d the class of groups G with a subgroup chain G = G_0 ≥ G_1 ≥ 
⋯ ≥ G_m = 1 such that |G_{i−1} : G_i| ≤ d for every i. If X is a connected regular graph 
of degree d + 1 then the stabilizer of an edge in X belongs to Γ_d, and every group 
in Γ_d can be represented this way (Babai and Lovász 1973). 
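
The first half of part (a) can be observed experimentally on small vertex sets. The sketch below is an illustration of ours (the function names are assumptions). It enumerates all labeled tournaments on n vertices, computes the order of each automorphism group by exhaustive search, and collects the orders that occur.

```python
from itertools import combinations, permutations, product

def tournaments(n):
    """Yield every labeled tournament on {0..n-1} as a frozenset of arcs (a, b)."""
    pairs = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        yield frozenset((a, b) if bit else (b, a) for (a, b), bit in zip(pairs, bits))

def aut_order(n, arcs):
    """Number of vertex permutations mapping every arc to an arc."""
    return sum(1 for p in permutations(range(n))
               if all((p[a], p[b]) in arcs for a, b in arcs))

def aut_orders(n):
    """Sorted list of the automorphism group orders occurring among tournaments on n vertices."""
    return sorted({aut_order(n, t) for t in tournaments(n)})

for n in (3, 4, 5):
    print(n, aut_orders(n))
```

Only odd orders appear. The search grows like 2^{n(n−1)/2} · n!, so it is feasible only for very small n.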


It is an open problem to show that the finite projective planes are not finitely 
universal, i.e., not every group is isomorphic to the automorphism group of a finite 
projective plane. Indeed it seems plausible that most finite groups cannot act on a 
finite projective plane (as a subgroup of the automorphism group), but no group 
has been ruled out so far. Hering (1967) proved that for n ≡ 3 (mod 4), any 2-group 
acting on a projective plane of order n must be cyclic, a (generalized) quaternion 
group, a dihedral group, or a quasidihedral group. 


4.2. Topological properties 


Topological properties of a graph (embeddability on a surface, excluded minors) 
do restrict the abstract group of automorphisms and thus offer a welcome source 
of connections between the structure of groups and the graphs representing them. 
The following general non-universality result says that prescribed automorphism 
groups force arbitrary minors to occur. 


Theorem 4.5 (Babai 1974a). Given a finite graph Y there exists a finite group G 
such that every graph X with Aut(X) = G has Y as a minor. 


Let 𝒞(Y) be the class of finite graphs without Y as a minor. It is expected that the 
finite groups represented by 𝒞(Y) have a very restricted structure. In particular, it 
is conjectured that the list of nonabelian finite simple groups represented by 𝒞(Y) 
is finite (cf. Babai 1981b). 

Excluding a topological subgraph is, in general, less restrictive than excluding a 
minor; indeed even the exclusion of vertices of degree ≥ 4 does not restrict the 
abstract automorphism group. However, prescribed endomorphism monoids do 
force arbitrary topological subgraphs: another strong non-universality result. 


Theorem 4.6 (Babai and Pultr 1980). Given a finite graph Y there exists a finite 
monoid M such that for every graph X, if End(X) ≅ M then X contains a 
subdivision of Y. 


4.3. Small graphs with given group 


The number of orbits is a measure of symmetry. It is natural to ask, how sym- 
metrical the graphs representing a given group can be. By the orbits of a graph, 
we mean the orbits of its automorphism group on the vertex set. Edge-orbits are 
orbits on the edge set. 

With three exceptions, every finite group can be represented by a graph with 
≤ 2 orbits (Babai 1974b). (The exceptions are the cyclic groups of orders 3, 4, and 
5.) 

Most groups even admit a representation by a vertex-transitive graph. Nowitz (1968) 
and Watkins (1971) described an infinite family of groups without a vertex-transitive 
representation (abelian groups of exponent greater than 2, and generalized 
dicyclic groups). Hetzel (1976) and Godsil (1981a) proved that apart from 
these, there is only a finite number of additional exceptions, each of order ≤ 32. 
Godsil (1979) extended this result to finitely generated infinite groups. 

A graphical regular representation (GRR) of a group G is a graph X such that 
Aut(X) is regular and isomorphic to G. In other words, X is a Cayley graph of 
G without “extra” automorphisms (all automorphisms correspond to right 
translations, cf. section 3.1). 

The graphs Hetzel and Godsil construct are actually GRRs and the result stated 
constitutes the full solution of the GRR problem: the characterization of all fi- 
nite groups which admit a GRR. For certain classes of groups G, including all 
nonabelian nilpotent groups of odd order, one can actually show that almost all 
Cayley graphs of G are GRRs (Babai and Godsil 1982). (To obtain a random 
Cayley graph Γ(G, S), one chooses a symmetrical set S = S^{-1} ⊆ G at random.) 

The analogous problem for digraphs is easier: with 5 exceptions, all groups (finite 
or infinite) have a digraphical regular representation (Babai 1978c, 1980a). (The ex- 
ceptions are the elementary abelian groups of orders 4, 8, 9, 16, and the quaternion 
group of order 8. For infinite groups, the proof employs infinite Ramsey theory, 
cf. chapter 42.) A consequence is that every infinite group can be represented by 
a graph X with 3 orbits. 

The situation is quite different when we wish to minimize the number of edge- 
orbits. First of all, if X is a graph representing the group G with a semiregular 
automorphism group (as has been the case so far in this section as well as in most 
constructions related to section 4.1) then the number of edges of X is at least nd/2, 
where n = |G| and d is the minimum size of a symmetrical set of generators. (This 
is a consequence of the Contraction lemma 3.58.) 

Let e(G) denote the minimum number of edges and m_e(G) the minimum number 
of edge orbits of the graphs representing G. Clearly e(G)/|G| ≤ m_e(G) and it is easy 
to see that m_e(G) ≤ C log |G|. Two natural questions arise: (a) Is m_e(G) bounded? 
(b) Is e(G)/|G| bounded? We now have a fairly complete answer to both questions. 


Theorem 4.7. (a) (Babai and Goodman 1991) For all finite groups, e(G)/|G| < 500. 

(b) (Babai, Goodman and Lovász 1991) If a finite group is generated by k abelian 
subgroups then m_e(G) < Ck for some absolute constant C. (Note that, e.g., any 
direct product of finite simple groups is generated by k = 2 abelian subgroups.) 

(c) (Goodman 1993) There is a constant c > 0 such that for infinitely many finite 
groups G, m_e(G) > c √(log |G|). 

(d) (S. Thomas 1987) Assuming the generalized continuum hypothesis, for every 
successor cardinal κ there exists a group G of order κ such that m_e(G) = κ. 


The proof of (b) is related to a generalization of a result by Gel’fand and 
Ponomarev (1970) that the subspace lattice of a vector space of finite dimension ≥ 3 
over a prime field is generated by 4 subspaces. The proof of (c) has a curious 
nonconstructive element: certain p-groups of class two, demonstrating the lower 
bound, are shown to exist by a probabilistic (counting) argument. No explicit 
family of finite groups with unbounded m_e(G) is known. The groups required for the 
proof of (d) are Jonsson groups, i.e. groups having no proper subgroups of their 
own cardinality. Shelah (1980) proved the existence of such groups for successor 
cardinals under GCH. Without GCH, no proof is known of the conjecture that 
m_e(G) can be an arbitrarily large cardinal. 


Some classes of groups are represented by drastically smaller graphs. This is clear 
for the symmetric groups (graphs of order k represent the group of order n = k!), 
but less evident for the alternating groups (graphs of order < 2^{k+1} represent the 
alternating group of order k!/2). Liebeck (1983) determines the exact minimum 
order of graphs representing the alternating group A_k for sufficiently large k (e.g., 
for k ≡ 0, 1 (mod 4) he finds this minimum to be 2^k − k − 2). (For small k, there 
are surprises, e.g., A_8 ≅ PSL(4, 2) is the automorphism group of a 30-vertex graph: 
the incidence graph of the projective geometry PG(3, 2).) Liebeck also gives strong 
lower bounds for the minimum order of graphs representing 3 types of classical 
simple groups (linear, orthogonal, unitary). 

We mention related open problems. Let G be a group of order n. It follows from 
part (a) of the above theorem that G can be represented by a lattice of size O(n). 
Can G be represented (i) by a lattice with a bounded number of orbits? Can G 
be represented by a polynomial size (n^c) (ii) Steiner triple system, (iii) strongly 
regular graph, (iv) modular lattice? We conjecture the negative answer to (iv) but 
positive answers to (i)-(iii). 


4.4. The concrete representation problem, 2-closure 


Let G ≤ Sym(V) be a permutation group, acting on the set V. The set of graphs 
X =(V, E) admitting G as a subgroup of Aut(X) is easily described; their number 
is 2^k where k is the number of orbits of the induced action of G on the set of 
binom(|V|, 2) unordered pairs of vertices. 
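
The count 2^k can be confirmed by exhaustive search on a small example. The following sketch is our own illustration (permutations are stored as tuples, and all names are assumptions). It generates a permutation group from generators, computes its orbits on unordered pairs, and independently counts the graphs on V that the group leaves invariant.

```python
from itertools import combinations

def generate_group(gens, n):
    """Closure of the generating permutations (tuples of length n) under composition."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = tuple(g[p[x]] for x in range(n))  # apply p first, then g
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

def pair_orbits(group, n):
    """Orbits of the induced action on unordered pairs of points."""
    seen, orbits = set(), []
    for a, b in combinations(range(n), 2):
        if (a, b) not in seen:
            orbit = {tuple(sorted((p[a], p[b]))) for p in group}
            seen |= orbit
            orbits.append(orbit)
    return orbits

def count_invariant_graphs(group, n):
    """Number of edge sets E on {0..n-1} with G <= Aut(V, E), by brute force."""
    pairs = list(combinations(range(n), 2))
    count = 0
    for mask in range(1 << len(pairs)):
        E = {pairs[i] for i in range(len(pairs)) if mask >> i & 1}
        if all(tuple(sorted((p[a], p[b]))) in E for p in group for a, b in E):
            count += 1
    return count

# Example: the cyclic group Z_5 acting regularly on 5 points.
z5 = generate_group([(1, 2, 3, 4, 0)], 5)
k = len(pair_orbits(z5, 5))
print(k, count_invariant_graphs(z5, 5), 2 ** k)
```

An invariant edge set is exactly a union of pair orbits, which is why the two counts agree.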

The concrete representation problem asks if G = Aut X for some graph (digraph, 
etc.) with vertex set V. This problem is very difficult in general, as the case of 
regular permutation groups (the GRR problem, section 4.3) has demonstrated. 
But there is a simple necessary condition. 

Let us consider the colored complete directed graph W with vertex set V 
obtained from G as follows. Two pairs of vertices receive the same color if and only 
if they belong to the same orbit of the induced action of G on V × V. Vertex v 
receives the color of the pair (v, v). This is the coherent configuration 
corresponding to the group G. We define W* to be the undirected version of W: unordered 
pairs receive colors. 

We call Aut(W) the 2-closure of G, and Aut(W*) the 2*-closure. In other words, 
the 2-closure of G is the largest subgroup of Sym(V) with the same orbits on 
V x V; and the 2*-closure the largest subgroup with the same orbits on points and 
on unordered pairs. 

The group G is 2-closed if G is equal to its 2-closure; 2*-closed groups are 
defined analogously. A group is 2-closed if and only if it is the automorphism group 
of a colored directed graph; and 2*-closedness corresponds to colored undirected 
graphs. These are thus necessary (but not sufficient) conditions for the group to 
be the automorphism group of a digraph (graph). 

All regular permutation groups are 2-closed. Not all of them are 2*-closed; the 
exceptions are precisely the abelian groups of exponent greater than two and the 
generalized dicyclic groups (Nowitz 1968, Watkins 1971, Babai 1977b). 
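
For very small degree, both closures can be computed directly from the definition by searching all of Sym(V). The sketch below is an illustration of ours (the function names and the three sample groups are assumptions). It colors pairs by their orbits and keeps the permutations that preserve every color. It reports the orders of the group, its 2-closure and its 2*-closure for the regular representations of ℤ_5, ℤ_4 and ℤ_2 × ℤ_2.

```python
from itertools import permutations, product

def generate_group(gens, n):
    """Closure of the generating permutations (tuples of length n) under composition."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = tuple(g[p[x]] for x in range(n))  # apply p first, then g
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

def closure(group, n, ordered=True):
    """Permutations preserving every orbit of the group on pairs.

    ordered=True  uses ordered pairs (the 2-closure);
    ordered=False uses points and unordered pairs (the 2*-closure).
    """
    def key(a, b):
        return (a, b) if ordered else (min(a, b), max(a, b))
    pairs = list(product(range(n), repeat=2))  # includes (v, v), i.e. the points
    color = {key(a, b): frozenset(key(p[a], p[b]) for p in group) for a, b in pairs}
    return [p for p in permutations(range(n))
            if all(color[key(p[a], p[b])] == color[key(a, b)] for a, b in pairs)]

z5 = generate_group([(1, 2, 3, 4, 0)], 5)
z4 = generate_group([(1, 2, 3, 0)], 4)
v4 = generate_group([(1, 0, 3, 2), (2, 3, 0, 1)], 4)  # Z_2 x Z_2 acting by XOR

for name, grp, n in [("Z_5", z5, 5), ("Z_4", z4, 4), ("Z_2 x Z_2", v4, 4)]:
    print(name, len(grp), len(closure(grp, n, True)), len(closure(grp, n, False)))
```

In this small experiment all three regular groups are 2-closed, while the 2*-closure is strictly larger for the two cyclic groups of exponent greater than two and equal to the group for ℤ_2 × ℤ_2, in line with the statement above.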

For transitive permutation groups G, Godsil (1981b) gives further necessary 
conditions which for some class of nilpotent groups turn out also to be sufficient. 

It is an interesting question how far the 2-closure cl_2(G) is from a group G. 
Liebeck, Praeger and Saxl (1988) investigate this for the case when G is primitive 
and almost simple, i.e. L ≤ G ≤ Aut(L) for some simple group L. If G is 2-transitive 
then cl_2(G) = Sym(V); but the gap is much smaller in all other cases. Indeed 
Liebeck et al. (1988) find that cl_2(G) normalizes G, with the exception of six 
sporadic cases (the largest degree occurring in a representation of degree 276 of 
the Mathieu group M_24) plus two surprising infinite families of unbounded ranks 
with socles L = G_2(q) and Ω_7(q), resp. 

The notion of 2-closure as a tool in the study of permutation groups was intro- 
duced by I. Schur, see Wielandt (1969). 

A maximal, not doubly transitive subgroup of S_n is necessarily 2-closed. This 
observation was used by L. A. Kaluzhnin and M. H. Klin in 1972 (cf. Klin et al. 
1991) to give elementary proofs of the maximality of several classes of primitive 
groups, including the induced action of S_m on k-tuples (n = binom(m, k)), with some 
restrictions on (m, k). (For a complete study of this question via the classification of 
finite simple groups, see Liebeck, Praeger and Saxl 1987a.) 

It is natural to ask which permutation groups arise as the automorphism groups 
of a hypergraph. If the sizes of the edges are not restricted, we have a nearly 
complete answer for primitive groups. Obviously, A_n ≠ Aut(X) for any 
hypergraph X on n vertices. Apart from the alternating groups and an (unknown) 
finite family of other exceptions, all primitive groups G occur as Aut(X) for some 
edge-transitive hypergraph (Babai and Cameron 1994b). Exceptions include all 
set-transitive groups: the Frobenius group of order 20 (n = 5), PGL(2, 5) (n = 6), 
PGL(2, 8), PΓL(2, 8) (n = 9). Another exception is the Frobenius group of order 
21 (n = 7). 


5. High symmetry 


As in the Introduction, we shall use the abbreviation CFSG to indicate the clas- 
sification of finite simple groups. CFSG has played a decisive role in the recent 
development of some of the subjects to be discussed below; we shall try to indicate 
where this is the case. 


5.1. Locally s-arc-transitive graphs 


All graphs in this section will be assumed finite and connected. 

Let s ≥ 1. An s-arc starting at a vertex v_0 in a graph X is a sequence (v_0, ..., v_s) 
of vertices such that v_{i−1} and v_i are adjacent (1 ≤ i ≤ s) and v_{i−1} ≠ v_{i+1} (1 ≤ i ≤ 
s − 1). A group G ≤ Aut(X) is locally s-arc-transitive on X if for every vertex v_0, 
the stabilizer of v_0 in G acts transitively on the s-arcs starting at v_0. If in addition 
G is vertex-transitive then G is s-arc-transitive. Otherwise X is clearly bipartite 
and G acts transitively on each color-class. For s = 1, s-arc-transitivity is the same 
as flag-transitivity. 

X is called (locally) s-arc-transitive if the action of Aut(X) is (locally) s-arc- 
transitive. We shall always assume that X is not a cycle (which is s-arc-transitive 
for every s). 

Having excluded the cycles, local s-arc-transitivity implies large girth: the girth 
must be ≥ 2s − 2. Hence in a locally s-arc-transitive graph, all s-arcs are paths. 
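
The definition of an s-arc and the girth bound can be illustrated on the Petersen graph, which is well known to be 3-arc-transitive. The Python sketch below is our own illustration; the construction of the graph from disjoint 2-subsets of a 5-set and all function names are assumptions. It lists the s-arcs starting at a vertex and computes the girth by breadth-first search.

```python
from collections import deque
from itertools import combinations

def petersen():
    """Petersen graph: vertices are 2-subsets of {0..4}, adjacent when disjoint."""
    verts = list(combinations(range(5), 2))
    return {v: [w for w in verts if not set(v) & set(w)] for v in verts}

def s_arcs_from(adj, v0, s):
    """All sequences (v_0, ..., v_s) of successively adjacent vertices with v_{i-1} != v_{i+1}."""
    arcs = [(v0,)]
    for _ in range(s):
        arcs = [arc + (w,) for arc in arcs for w in adj[arc[-1]]
                if len(arc) < 2 or w != arc[-2]]
    return arcs

def girth(adj):
    """Length of a shortest cycle, found by breadth-first search from every vertex."""
    best = float("inf")
    for root in adj:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w], parent[w] = dist[u] + 1, u
                    queue.append(w)
                elif parent[u] != w:  # a non-tree edge closes a cycle
                    best = min(best, dist[u] + dist[w] + 1)
    return best

adj = petersen()
v0 = (0, 1)
for s in (1, 2, 3):
    arcs = s_arcs_from(adj, v0, s)
    print(s, len(arcs), all(len(set(arc)) == s + 1 for arc in arcs))
print("girth:", girth(adj))
```

In a trivalent graph there are 3 choices for the first step and 2 for each later step, so the number of s-arcs from a vertex is 3 · 2^{s−1}. For s = 3 the girth of the Petersen graph satisfies the bound 2s − 2, and every 3-arc is a path.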

For trivalent s-arc-transitive graphs, Tutte (1947) proved the surprising result 
that s must be bounded: s ≤ 5 (cf. Biggs 1974, chapter 18). He showed that s = 5 
is attained by a graph C_8 called an “8-cage”, a trivalent graph of girth 8 with 30 
vertices and 1440 automorphisms; Aut(C_8) ≅ Aut(S_6) where S_6 is the symmetric 
group of degree 6 (cf. Biggs 1974, p. 125). By the covering construction of 
Theorem 1.6 we infer that there are infinitely many trivalent 5-arc-transitive graphs. 

Tutte’s result was generalized to locally s-arc-transitive graphs in a remarkable 
self-contained 4-page paper by R.M. Weiss (1976). 


Theorem 5.1 (R.M. Weiss). Let G be a locally s-arc-transitive but not (s + 1)-arc-transitive 
group acting on a trivalent graph. Then s ≤ 7 and s ≠ 6. 


The bound 7 is attained by the 12-cage (Tits 1959, Appendix, cf. Benson 1966). 

A group G is (locally) s-regular if G is (locally) s-arc-transitive and the stabilizer 
of each s-arc is the identity. This is a somewhat artificial concept except for degree 
3 when it occurs naturally: a trivalent edge-transitive graph is locally s-regular 
for some s. Let G be a locally s-regular group on a trivalent graph, and G_v a 
vertex-stabilizer; then |G_v| = 3 · 2^{s−1}, the number of s-arcs starting at v. Weiss’ bound 
s ≤ 7 thus implies that there is only a finite number of possibilities for the vertex 
stabilizer in a trivalent edge-transitive graph. These possibilities were classified by 
Tutte for the flag-transitive (and therefore s-regular) case. 

For the edge-transitive (and therefore locally s-arc-transitive) case the object to 
be classified is the pair of vertex-stabilizers of an adjacent pair of vertices together 
with their intersection (G_x, G_y, G_x ∩ G_y). Goldschmidt (1980) classified all these 
triples and found that there were precisely 15 of them. Goldschmidt’s 30-page 
work is motivated by the examples afforded by the (bipartite) incidence graphs 
of “buildings” associated with rank-2 BN pairs over GF(2), occurring in the study 
of certain classes of groups of Lie type. Goldschmidt’s “amalgam method” was 
the starting point of an important new theory (Delgado et al. 1985), used among 
others for some aspects of “revisionism”, a project aiming at a clean and simplified 
proof of CFSG (Gorenstein et al. 1994). 

Tutte’s 1947 theorem was extended a third of a century later: R. M. Weiss 
showed, using heavy guns, that s ≤ 7 holds 
for s-arc-transitive graphs of arbitrary degree (Weiss 1981). Noting that the stabilizer 
G_v of a vertex v in a locally 2-arc-transitive group G acts doubly transitively on 
the neighbors of v, he was able to invoke the classification of the doubly transitive 
permutation groups, available as a consequence of CFSG (cf. chapter 12). Weiss 
proves that if s ≥ 4 then the action of G_v on the set X(v) of neighbors is either 
affine (has an elementary abelian normal subgroup; in particular the degree is a 
prime power), or it includes the linear fractional group PSL(2, p^a) as a normal 
subgroup in its action on the projective line of |X(v)| = p^a + 1 points. Here either 
s = 4, or p ≤ 3 and s ≤ 2p + 1. 

One of the key ingredients in much of the work on arc-transitive graphs was 
the following theorem, magically singling out a prime number, characteristic for 
the graph. The result is due to J. G. Thompson and H. Wielandt and was adapted 
by Gardiner (1973) in this context (cf. Brouwer et al. 1989, chapter 7.2). For a 
subset S ⊆ V(X), let X^d(S) denote the set of vertices within distance d from S 
(so, e.g., X^0(S) = S). We use G_d(S) to denote the pointwise stabilizer of X^d(S) in 
G ≤ Aut(X). 


Theorem 5.2 (Thompson, Wielandt). Let G ≤ Aut(X) act vertex-transitively on the 
connected graph X which is not a cycle. Assume that the stabilizer G_v of each vertex 
v acts as a primitive group on the set of neighbors of v. Then there exists a prime p 
such that G_1(e) is a p-group (possibly the identity) for every edge e of X. 


Weiss (1979) eliminated the condition of vertex-transitivity and proved that under 
this weaker assumption (which is implied by local 2-arc-transitivity) G_2(v) is a 
p-group for some vertex v. 

No analog of Weiss’ s ≤ 7 bound is known for locally s-arc-transitive graphs 
of arbitrary degree. The significance of such an extension would be in its wider 
applicability which would include incidence graphs of geometries of high symmetry. 
Such an application of the following partial result of Weiss will be indicated in 
Theorem 5.5. We should stress that Weiss’ proof is elementary. 


Theorem 5.3 (Weiss 1979). Let G < Aut(X) be a locally s-arc-transitive group act- 
ing on the connected graph X of girth g. Assume s >8 and g <2s+11. Then 
G5(S) = 1 for every arc S of length 14. 


5.2. Distance-transitive graphs 


This is one of the deepest and most extensively studied areas. We refer to 
Biggs (1974) for an introduction and to the recent monographs by Brouwer, Co- 
hen and Neumaier (1989) and Bannai and Ito (1984) for technical discussions. The 
techniques are partly combinatorial and algebraic (adjacency algebras) and apply 
in greater generality to distance regular graphs (cf. chapter 15, section 4); partly 
group theoretic (both elementary and CFSG-dependent). 

First we mention that the infinite distance-transitive graphs of finite degree have 
a very simple structure. For r, s ≥ 2, an r-tree of s-cliques is an infinite connected 
graph all of whose 2-connected blocks are s-cliques and each vertex belongs to 
exactly r of these cliques. 


Theorem 5.4 (Macpherson 1982). Every infinite distance-transitive graph of finite 
degree is an r-tree of s-cliques for some r, s ≥ 2. 


Macpherson’s proof is based on Dunwoody’s theorem on cuts of graphs with 
more than one end (Theorem 3.37). (Cf. Ivanov’s theorem below.) In contrast, 
a great variety of infinite distance-transitive graphs of infinite degree follows by 
Fraissé’s theorem (Theorem 5.8) (Cameron, cf. Brouwer et al. 1989, p. 233). Hence- 
forth in this section we assume that our graphs are finite. (Exception: Theorem 5.6.) 

Recently, a project aiming at the complete classification of all distance-transitive 
graphs was drawn up (see the survey by Praeger 1990). There are two phases to 
this project: to classify vertex-primitive distance-transitive graphs; and to reduce 
the general case to these. The program of the first phase was laid out by Praeger, 
Saxl and Yokoyama (1987) who reduced the problem to cases when the automorphism 
group is either almost simple or affine (has an elementary abelian normal 
subgroup). As a result of combined efforts of Ivanov, Van Bon, Cohen, Inglis, 
Liebeck, Praeger, Saxl and others, most of the resulting cases have been settled 
and this phase now approaches completion (cf. Praeger 1990 for references, and 
Liebeck, Praeger and Saxl 1987b as an example). 

The second phase has not advanced nearly as far but its basic idea is classical. 

A graph X of finite diameter d is antipodal if being at distance d is an equivalence 
relation among the vertices of X. Antipodal graphs X of diameter d ≥ 2 are not 
vertex-primitive since X^{(d)} is disconnected. (In X^{(k)}, two points are adjacent if they 
are at distance k in X.) 

The study of distance-transitive graphs can, in a sense, be reduced to the 
vertex-primitive case, by a result of D. H. Smith and N. J. Martinov which asserts that a 
distance-transitive graph of degree ≥ 3 is either primitive, or bipartite, or antipodal. 
(Cf. Brouwer et al. 1989, chapter 4.2.) It follows that starting from a 
distance-transitive graph, two simple operations will eventually lead to a vertex-primitive 
one. If X is antipodal, we identify antipodes and obtain a distance-transitive graph 
covered by X. If X is bipartite then X^{(2)} has two isomorphic components, both of which 
are distance-transitive (X is a bipartite doubling of these components). Bipartite 
doublings have been studied in a number of recent papers (see, e.g., Hemmeter and 
Woldar 1990). Gardiner’s (1974) paper initiated the study of antipodal covers. He 
showed in particular that the size of the antipodal equivalence class is not greater 
than the degree. Antipodal coverings of some classes were classified recently (see 
Liebler 1991, Van Bon and Brouwer 1987). 

One of the remarkable general results in this area, predating the classification 
project indicated, is a classification of distance-transitive graphs by their 
degree. Biggs and Smith (1971) determined all trivalent distance-transitive 
graphs (there are 12 of them). Smith (1974) went on to determine all tetravalent 
distance-transitive graphs. Mostly by work of Faradžev et al. (1984) and Ivanov 
and Ivanov (1988), all distance-transitive graphs of valency ≤ 13 are now known. 


Theorem 5.5 (Cameron, Praeger, Saxl and Seitz 1983, Cameron 1982). There are 
finitely many distance-transitive graphs of any given degree d ≥ 3. 


For the primitive case, this is immediate from Sims’ conjecture (Theorem 1.1, 
depending on CFSG). Cameron (1982) points out that the general case rapidly follows,
observing that from a distance-transitive graph of degree k ≥ 3 the two operations
mentioned above (halving, antipodal quotients) lead to a primitive
distance-transitive graph of valency 3 ≤ k' ≤ k(k − 1) in at most two steps.

Remarkably, Weiss (1985b) found a proof of Theorem 5.5 avoiding the CFSG 
reference, based on one of his results on s-arc-transitive graphs (Theorem 5.3), 
combined with the following powerful elementary result of A. A. Ivanov. 

A graph X = (V, E) is distance-regular if parameters a_i, b_i, c_i exist such that for
each vertex v ∈ V, every vertex at distance i from v has c_i, a_i, and b_i neighbors
at distance i − 1, i, and i + 1 from v, resp. Distance-transitive graphs are clearly
distance-regular. We consider the parameter t = sup{i : (a_i, b_i, c_i) = (a_1, b_1, c_1)}. (It
is clear that g ≤ 2t + 3 where g is the girth. If g ≥ 4 then (a_1, b_1, c_1) = (0, k − 1, 1)
and 2t + 2 ≤ g ≤ 2t + 3.)
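
As a small illustration of the definition (our own sketch, not taken from the works cited), the following Python fragment computes the triples (c_i, a_i, b_i) by breadth-first search and returns None when they are not well defined. The function names and the adjacency-dictionary representation are assumptions made for this example; the Petersen graph serves as a test case.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search distances from `source` in an adjacency-dict graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def intersection_array(adj):
    """Return {i: (c_i, a_i, b_i)} if the graph is distance-regular, else None."""
    params = {}
    for v in adj:
        dist = distances_from(adj, v)
        if len(dist) != len(adj):
            return None  # disconnected graph
        for u, i in dist.items():
            c = sum(1 for w in adj[u] if dist[w] == i - 1)
            a = sum(1 for w in adj[u] if dist[w] == i)
            b = sum(1 for w in adj[u] if dist[w] == i + 1)
            # The triple must depend on the distance i only.
            if params.setdefault(i, (c, a, b)) != (c, a, b):
                return None
    return params


def petersen():
    """Petersen graph: outer 5-cycle 0..4, inner pentagram 5..9, spokes i -- i+5."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        for u, w in ((i, (i + 1) % 5), (i + 5, (i + 2) % 5 + 5), (i, i + 5)):
            adj[u].add(w)
            adj[w].add(u)
    return adj
```

For the Petersen graph this gives (c_1, a_1, b_1) = (1, 0, 2) and (c_2, a_2, b_2) = (1, 2, 0), so t = 1, in agreement with g = 5 = 2t + 3.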


Theorem 5.6 (A. A. Ivanov 1983). If a distance-regular graph has degree k then its
diameter is d ≤ t · 4^k.


This result is valid for infinite graphs as well, implying that in that case t = ∞,
hence the graph is an r-tree of s-cliques for some r, s ≥ 2, thus extending
Macpherson’s theorem to distance-regular graphs.

Returning to finite graphs, it is shown in Brouwer et al. (1989, p. 220) via Weiss’
proof, that the diameter of a distance-transitive graph of degree k is d ≤ (k^6)! 4^k.
In reality, d ≤ 8 for k = 3, and d ≤ 2k − 1 in all known cases for k ≥ 4.

We mention two more parameter bounds. Godsil (1988) proves that if a distance-regular
graph X has an eigenvalue of multiplicity f ≥ 3 then either X is complete
multipartite or X has diameter d ≤ 3f − 4 and degree k ≤ (f − 1)(f + 2)/2. The
dodecahedron attains the diameter bound; the icosahedron attains the valency
bound.

Using CFSG through the list of doubly transitive groups, Weiss (1985a) classifies
the s-arc-transitive graphs of girth g ≤ 2s + 2 (s ≥ 4). (Note that g ≥ 2s − 2 always;
and g ≤ 2s + 2 holds for all distance-transitive graphs.) As a corollary, he finds all
distance-transitive graphs of degree k ≥ 3 and girth g ≥ 9. In addition to the two
largest trivalent distance-transitive graphs (the Biggs–Smith graph on 102 vertices
(g = 9) and the Foster graph on 90 vertices (g = 10)), he finds an infinite sequence of
graphs with g = 12, the incidence graphs of the generalized hexagons associated
with the Chevalley groups G_2(q), q a power of 3.


Distance-transitive digraphs are considered by Bannai, Cameron and Kahn 
(1981). 


5.3. Homogeneity 


In this section we consider a very strong symmetry constraint, the study of which 
has led to powerful applications of group theory to model theory. A deeper survey 
is Lachlan (1986); Kantor, Liebeck and Macpherson (1989) is accessible to the 
reader less versed in model theory. 

We shall consider finite and countably infinite graphs, digraphs, and other struc- 
tures. 

A graph X is homogeneous if every isomorphism between finite induced sub- 
graphs extends to an automorphism of X. Homogeneous digraphs, hypergraphs, 
etc. are defined analogously. Clearly, the complement of a homogeneous graph is 
again homogeneous. 

Gardiner (1976) showed that the only finite homogeneous graphs are m · K_n
(the disjoint union of cliques of equal size), their complements, L(K_{3,3}), and the
pentagon. The finite homogeneous tournaments are just the single point and C_3
(the directed 3-cycle) (Woodrow 1979). The list of finite homogeneous oriented
graphs (digraphs with no 2-cycles) is the following: the single point, C_3, C_3[K̄_n]
(lexicographic product, section 2), m · C_3 (m copies of C_3), C_4, and finally the
Cayley digraph of the quaternion group Q_8 with respect to the generating set
{i, j, k} in the usual notation (Lachlan).

Cameron (1980a) (cf. Cameron, Goethals and Seidel 1978) and Gel’fand (unpublished)
strengthened Gardiner’s result considerably by relaxing the homogeneity
condition. We call the graph X k-homogeneous if isomorphisms between induced
subgraphs on ≤ k vertices extend to automorphisms.


Theorem 5.7 (Cameron, Gel’fand). If X is a 5-homogeneous finite graph then X 
appears on Gardiner’s list; and therefore X is homogeneous. 


Actually, the result of Cameron and Gel’fand is even more general in that they 
replace the symmetry condition by a regularity condition: X is k-regular if any 
two isomorphic induced subgraphs of ≤ k vertices have the same number of common
neighbors. Observe that “1-regularity” means X is regular; and “2-regularity”
means X is strongly regular. These conditions do not imply the presence of any
automorphisms and allow a great variety of examples. This fact is in contrast with the
situation for k ≥ 5: if the finite graph X is 5-regular then it appears on Gardiner’s
list (Cameron, Gel’fand).

The following generalization allows us to bring graphs of diameter greater than 
2 into the picture. Let us call a graph X metrically k-transitive if any X-distance 
preserving map between ordered k-tuples of vertices of X extends to an auto- 
morphism of X. Note that for k = 1 this is vertex-transitivity, and for k = 2 it is 
distance-transitivity. We also note that the neighborhood of a vertex in a met- 
rically k-transitive graph is (k — 1)-homogeneous. Building on this fact and on 
Theorem 5.7, Cameron classifies all finite metrically 6-transitive graphs. The connected
ones are the complements of m · K_n, K_{n,n} with a perfect matching deleted,
the cycles, L(K_{3,3}), the icosahedron, and the graph J(6, 3) on 20 vertices identified
with the set of 3-subsets of a 6-set; two vertices are adjacent if the corresponding 
3-sets share two elements. It follows that these graphs are automatically metrically 
k-transitive for every k. 

Now we turn to the countably infinite (countable for short) case. The best known 
example is the Rado graph, or “generic countable graph”, characterized by the fol- 
lowing property: given any two disjoint finite subsets A and B of the vertex set, 
there exists a vertex adjacent to all vertices in A but none in B. This property 
determines a unique countable graph. The Rado graph contains all finite graphs as 
induced subgraphs. A countable random graph (each pair is adjacent with probability
1/2 independently) has probability 1 to be isomorphic to the Rado graph
(Erdős and Rényi 1963).

In addition, for every m there exists a unique “generic countable graph without
K_m subgraphs”, R_m. In this classification, the Rado graph is R_∞. Lachlan and
Woodrow (1980) show that the R_m (3 ≤ m ≤ ∞) and their complements exhaust
all nontrivial examples of countable homogeneous graphs; the trivial ones are
disjoint unions of cliques of equal size and their complements.

The Rado graph has an obvious tournament analogue, the “generic tournament”.
Lachlan showed that there are only two other countable homogeneous
tournaments: the dense linear order (the order-type of the rationals), and the 
dense circular order. The latter is defined by a countable dense set on the unit 
circle with no pairs of antipodal points; edges correspond to clockwise walks along 
the shorter of the two arcs joining a pair of points. 

Homogeneous partial orders were classified by Schmerl (1979); a countable number
of them was found. In contrast to these results, Henson (1972) found continuum
many nonisomorphic countable homogeneous oriented graphs. Notwithstanding, 
Cherlin (1987) classified all the homogeneous oriented graphs. 

Model theorists’ interest in homogeneous structures dates back to a paper
of Fraïssé (1954) linking homogeneity, categoricity, and quantifier elimination.

Let us consider a locally finite “language”, i.e. a set L of relation symbols, each 
associated with a positive integer called the arity such that each arity occurs a 
finite number of times. An L-structure 𝔐 is a set M endowed with a relation of
appropriate arity for each symbol in L. (A k-ary relation is a subset of M^k. We
allow the case M = ∅.) Graphs, digraphs correspond to the language of a single
binary relation. Every subset of M induces a substructure. (We use the term “substructure”
to mean induced substructure.) 𝔐 is homogeneous if all isomorphisms
of finite substructures extend to automorphisms of 𝔐. The theory Th(𝔐) consists
of all first-order sentences which are true in 𝔐. The theories of homogeneous
L-structures are precisely those which permit quantifier elimination (first-order
statements of the form φ(u_1, ..., u_k) depend only on the substructure induced by
u_1, ..., u_k).

Let ℱ(𝔐) be the class of structures isomorphic to finite substructures of 𝔐.
A class 𝒞 of L-structures is hereditary if it is closed under taking substructures.
𝒞 has the amalgamation property if, whenever F_0, F_1, F_2 ∈ 𝒞 and g_i: F_0 → F_i are
embeddings (isomorphisms onto substructures) (i = 1, 2), there exist F_3 ∈ 𝒞 and
embeddings f_i: F_i → F_3 (i = 1, 2) such that g_1 f_1 = g_2 f_2. (Note especially that we
allowed the case F_0 = ∅, thus taking care of what logicians call the “joint embedding
property”.) An isomorphism-closed class 𝒞 of finite L-structures is called an
amalgamation class if it is hereditary and has the amalgamation property.


Theorem 5.8 (Fraïssé). If 𝔐 is a countable homogeneous L-structure, then ℱ(𝔐) is
an amalgamation class. Conversely, every amalgamation class of finite L-structures
is ℱ(𝔐) for a countable homogeneous L-structure 𝔐, unique up to isomorphism.


The construction of 𝔐 in the second statement is a direct limit argument. Since
the class of finite graphs without K_m is clearly an amalgamation class, the generic
graphs R_m of the Lachlan–Woodrow theorem are uniquely determined.

A countable structure 𝔐 is ℵ_0-categorical if (up to isomorphism) 𝔐 is the only
countable model of its theory. Every countable homogeneous structure is ℵ_0-categorical.
(The converse is false.) ℵ_0-categoricity depends solely on Aut(𝔐). The
k-types of a structure 𝔐 are the orbits of Aut(𝔐) on M^k.


Theorem 5.9 (Ryll-Nardzewski, Engeler, Svenonius). A countable structure 𝔐 is
ℵ_0-categorical if and only if it has a finite number of k-types for every finite k.


This result, in a sense, reduces the study of ℵ_0-categorical structures to the study
of oligomorphic permutation groups (groups which have a finite number of orbits
on k-sets for every k; see chapter 12, section 9.5, cf. Cameron 1990). Oligomorphic
groups are precisely the dense subgroups (w.r. to pointwise convergence) in the
automorphism groups of ℵ_0-categorical structures over locally finite languages.

𝔑 is a smooth substructure of 𝔐 if 𝔑 is a substructure and (i) all automorphisms
of 𝔑 extend to 𝔐; and (ii) for each k, two k-tuples u, v ∈ N^k belong to the same
k-type of 𝔑 if and only if they belong to the same k-type of 𝔐.

An ℵ_0-categorical L-structure is smoothly approximable if it is the union of
a chain of finite smooth substructures. The “trivial examples” in the Lachlan–Woodrow
theorem, i.e., the disjoint unions of complete graphs and their complements,
are smoothly approximable. By Theorem 5.9, the approximating finite
structures must have a bounded number of k-types for every fixed k.

In one of the most exciting developments in model theory recently, combined
work of Cherlin, Lachlan, Harrington, Kantor, Liebeck, Macpherson, and
Hrushovski (Cherlin and Lachlan 1986, Cherlin et al. 1985, Kantor et al. 1989, 
Cherlin and Hrushovski 1994), heavily relying on CFSG, has led to the classifica- 
tion of all finite L-structures with a bounded number of 5-types. (The same magic 
number 5 as in Theorem 5.7.) 

Let 𝒞(L, k) denote the class of L-structures with at most k 5-types. The final result
is that 𝒞(L, k) can be decomposed into finitely many classes and each class has
a simple dimension theory: a finite number of dimensions is identified, and each
first-order statement is equivalent to a Boolean combination of finiteness and exact
value statements of each dimension. The dimensions can be varied essentially
independently. Dimensions correspond to classes of Lie geometries; the classical
examples of the latter are linear and projective spaces over finite fields, possibly
with forms (symplectic, orthogonal, unitary), and Grassmannians over disjoint
unions of these. Pure sets occur as degenerate examples. The ℓth Grassmannian
over the geometry 𝒢 is the orbit of an ℓ-dimensional subgeometry of 𝒢 under
Aut(𝒢). (When 𝒢 is the disjoint union of t pure sets each of size m, the ℓth Grassmannian
is the association scheme defined by the natural action of S_m ≀ S_t on a set
of size n = (m choose ℓ)^t.)

The proof uses full force of CFSG through the structure theory of primitive 
permutation groups (O’Nan-Scott theorem, cf. chapter 12), including recent work 
of Aschbacher and Liebeck on maximal subgroups of classical groups.

A corollary of this theory is that for every finite language L and every fixed k,
membership in 𝒞(L, k) can be tested in polynomial time.

Another curious corollary is the following. Let us say that the graph X has the 
m-extension property if for any two disjoint subsets A, B of the vertex set there 
exists a vertex adjacent to all vertices in A but none in B, assuming |A| + |B| ≤ m.
(The Rado graph has this property for all m. Almost all graphs on n vertices
have the m-extension property for m = (1 − ε) log_2 n; and the Paley graph P(q, 2)
(section 1.1) for m = (1/2 − ε) log_2 q (Bollobás 1985, chapter 13.2).)
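
The m-extension property can be checked by exhaustive search on small graphs. The sketch below is our own illustration, under two assumptions that should be stated loudly: the witness vertex is required to lie outside A ∪ B (the usual convention), and the Paley graph is built only for a prime q ≡ 1 (mod 4) rather than for general prime powers. The function names are hypothetical, and the search is exponential in m.

```python
from itertools import combinations


def paley_graph(q):
    """Paley graph on Z_q, q prime, q = 1 (mod 4): x ~ y iff x - y is a nonzero square."""
    squares = {(x * x) % q for x in range(1, q)}
    return {x: {y for y in range(q) if (x - y) % q in squares} for x in range(q)}


def has_extension_property(adj, m):
    """Check that every disjoint pair (A, B) with |A| + |B| <= m has a witness.

    A witness is a vertex outside A and B (an assumption of this sketch)
    adjacent to all of A and to none of B.
    """
    vertices = list(adj)
    for size in range(m + 1):
        for union in combinations(vertices, size):
            outside = [z for z in vertices if z not in union]
            # Split the chosen vertices into A and B in every possible way.
            for a_size in range(size + 1):
                for a_part in combinations(union, a_size):
                    A = set(a_part)
                    B = set(union) - A
                    if not any(A <= adj[z] and not (B & adj[z]) for z in outside):
                        return False
    return True
```

The Paley graph on 5 points is the pentagon, which has the 1-extension property but not the 2-extension property (adjacent vertices have no common neighbor); the Paley graph on 13 points has the 2-extension property.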


Corollary 5.10 (Cherlin and Hrushovski 1994). If for every m, X_m is a finite graph
with the m-extension property then the number of orbits of Aut(X_m) on 5-tuples of
vertices is unbounded (as m → ∞).


It would be desirable to see a proof of this result which does not require CFSG. 

A final note on higher cardinals: Kierstead and Nyikos (1989) characterize those
n-uniform hypergraphs of cardinality κ which have a finite number of isomorphism
types of induced subhypergraphs of cardinality λ for some infinite λ < κ.


6. Graph isomorphism 


Deciding whether or not two explicitly given finite algebraic or combinatorial struc- 
tures are isomorphic has been a long-standing unsolved question in the theory of 
computing. Since all such structures can be canonically encoded in polynomial
time by graphs (Hedrlín and Pultr 1966, Miller 1979), it would suffice to solve it
for graphs. 

From a practical point of view, backtrack algorithms perform quite well. The 
leader in the trade is McKay’s (1987) program “Nauty”. However, in spite of 
considerable effort, the theoretical complexity status of graph isomorphism is still 
unresolved. 


6.1. Complexity theoretic remarks 


For basic concepts of computational complexity theory we refer to chapter 29; see 
also Garey and Johnson (1979). 

While “graph isomorphism” (the set of pairs of isomorphic graphs) clearly be- 
longs to NP, it is not known to belong to coNP. In other words, it is not known 
whether or not for all pairs of nonisomorphic graphs, a short (polynomial length) 
proof of nonisomorphism exists. It is known, however, that nonisomorphism has 
bounded round interactive proofs (Goldreich et al. 1986), a fact that puts “noni- 
somorphism” in the class AM, a randomized extension of NP. This is considered 
strong theoretical evidence against NP-completeness of “graph isomorphism”; if 
it were NP-complete, the “polynomial time hierarchy”, a hierarchy of complexity 
classes between P and PSPACE, would collapse. For further references, see Babai 
and Moran (1988) (cf. chapter 29). 


6.2. Algorithmic results: summary of worst case bounds 


The best current worst-case bound for a general graph isomorphism algorithm is
exp(√(cn log n)) for n-vertex graphs (Luks and Zemlyachenko, cf. Babai and Luks
1983, and Babai, Kantor and Luks 1983). For some special classes of graphs,
substantially better results are available. For groups given by their multiplication
tables, and for Steiner triple systems, n^{O(log n)} isomorphism tests easily follow
from the observation that these structures have generating sets of size ≤ log n. For
planar graphs, ingenious use of stacks has resulted in a linear time isomorphism
test (Hopcroft and Tarjan 1972, Hopcroft and Wong 1974). Combinatorial methods
in a similar spirit yielded polynomial time isomorphism tests for graphs of
bounded genus (n^{O(g)} time for genus g ≥ 1) (Filotti and Mayer 1980). Group theoretic
methods led to polynomial time algorithms for graphs with colored vertices
and bounded color-classes (isomorphisms preserve colors by definition) (Babai
1979c), for graphs with bounded multiplicity of eigenvalues (Babai, Grigoryev and
Mount 1982), and, with considerably deeper use of group theory, for graphs of
bounded degree (Luks 1982) (n^{O(d)} time for graphs of degree ≤ d (Babai and
Luks 1983); O(n^3 log n) time for trivalent graphs, Galil et al. 1987). As a consequence
of Luks’ methods, isomorphism of block designs (BIBDs) with bounded k
and λ can be tested in time n^{O(log n)} (Babai and Luks 1983) (k is the block size and
there are λ blocks common to each pair of vertices); isomorphism of tournaments
can be tested in time n^{O(log n)} (Babai and Luks 1983); and isomorphism of λ-planes
(symmetric designs) with bounded λ in n^{O(log log n)} time (Babai and Luks 1983).
A common generalization of the polynomial time results for bounded degree and
bounded genus was obtained by Miller (1983a,b).

Luks’ beautiful paper (Luks 1982) is the single most fundamental reading in the
area. It introduces the profound links to group theory to be discussed in section
6.6.


6.3. Canonical forms 


An algorithmic problem closely related to graph isomorphism is the problem of
complete invariants and in particular of canonical forms of graphs. Let 𝒦 denote
a class of objects with an equivalence relation to be called “isomorphism”. An
invariant on 𝒦 is a mapping f from 𝒦 to some class ℒ of objects such that
whenever X, Y ∈ 𝒦 are isomorphic, f(X) = f(Y). We call f a complete invariant
if the converse also holds: f(X) = f(Y) implies X ≅ Y. If, in addition, ℒ = 𝒦 and
f(X) ≅ X for every X ∈ 𝒦 then the complete invariant f is called a canonical
form over 𝒦; and f(X) the canonical form of X. For graphs, a canonical form
f assigns a labeling to the vertices, and this assignment is uniquely defined by f
up to automorphisms of X. We call such a labeling canonical, although strictly
speaking it is the coset of the automorphism group consisting of all the labelings
corresponding to f which is canonical.

Clearly, if a canonical form for a class of objects is available, then isomorphism 
testing is accomplished by simply comparing the canonical forms. The converse 
is not known to be true, but in all classes listed above, canonical forms can be 
obtained within the same time bound as guaranteed for isomorphism testing (cf. 
Babai and Luks 1983). 

An important invariant of graphs is the characteristic polynomial of their adja- 
cency matrix. This invariant fails to be complete (quite badly, cf. Corollary 1.14), 
as do all other known polynomial time computable invariants. 

An example of a canonical form of a graph is the one which produces the 
lexicographically first adjacency matrix. While this is clearly a complete invariant, 
unfortunately it is NP-hard to compute (reduction from maximum clique).
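
To make the last example concrete, here is a brute-force sketch of the "lexicographically first adjacency matrix" canonical form. It is our own illustration: the function name and the input convention (vertices 0, ..., n − 1 and a list of edges) are assumptions, and the search runs over all n! relabelings, so it is usable only for very small graphs.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Lexicographically first adjacency matrix over all relabelings.

    The matrix is returned as a tuple of rows; simple undirected graphs
    on the vertex set 0..n-1 are assumed.
    """
    edge_set = {frozenset(e) for e in edges}
    best = None
    for perm in permutations(range(n)):
        # perm[i] is the original vertex placed in position i.
        matrix = tuple(
            tuple(1 if frozenset((perm[i], perm[j])) in edge_set else 0
                  for j in range(n))
            for i in range(n)
        )
        if best is None or matrix < best:
            best = matrix
    return best
```

Two graphs on the same number of vertices are isomorphic exactly when this function returns the same matrix for both, which is the defining property of a complete invariant.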


6.4. Combinatorial heuristics: success and failure 


Testing graph isomorphism is easily seen to be equivalent to determining the orbits 
of the automorphism group of a graph. It is therefore natural to try to find invariant 
colorings of the vertex set V(X) (i.e. each color class should be a union of orbits 
of Aut(X)), and refine the color partition in the hope that eventually we obtain 
the orbit partition. An ordered partition (C_1, ..., C_m) of V(X) into invariant color
classes C_i can be refined in a simple way: with each vertex v ∈ C_i, we associate the
list (i, β_1, ..., β_m), where β_j denotes the number of neighbors of v in C_j. Now order
these lists lexicographically; vertices with the same list receive the same color in
the new coloring. (The first round colors the vertices by their degree.) Eventually
the process stops at a stable coloring, characterized by the fact that for every i, j
all vertices in C_i have the same number of neighbors in C_j.
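
The refinement step just described can be sketched in a few lines of Python. This is a minimal illustration under our own conventions (an adjacency dictionary as input, colors numbered from 0, a hypothetical function name); it makes no attempt at the linear-time implementations discussed below.

```python
def refine_colors(adj):
    """Naive vertex refinement; returns a stable coloring {vertex: color index}."""
    # First round: color the vertices by their degree.
    ranks = {d: r for r, d in enumerate(sorted({len(adj[v]) for v in adj}))}
    color = {v: ranks[len(adj[v])] for v in adj}
    while True:
        m = len(set(color.values()))
        # Signature of v: (old color i, beta_0, ..., beta_{m-1}), where beta_j
        # counts the neighbors of v in color class j.
        signature = {}
        for v in adj:
            beta = [0] * m
            for w in adj[v]:
                beta[color[w]] += 1
            signature[v] = (color[v], *beta)
        # Order the lists lexicographically; equal lists get equal new colors.
        ranks = {s: r for r, s in enumerate(sorted(set(signature.values())))}
        new_color = {v: ranks[signature[v]] for v in adj}
        if len(ranks) == m:  # no class was split: the coloring is stable
            return new_color
        color = new_color
```

On a regular graph such as a cycle the procedure stops at once with a single color class, which is the difficulty addressed by individualization below.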

Let F denote the class of graphs which are partitioned by this process into
singletons. Clearly, these graphs have no automorphisms other than the identity,
and the refinement process results in a unique canonical labeling of the graphs
belonging to F.

This naive method is highly successful on average: all but an exponentially small 
fraction of the graphs on n vertices are partitioned into singletons in the third 
round (and thus in linear time) (Babai and Kučera 1979). This is a constructive
version of the Erdős–Rényi theorem that all but an exponentially small fraction of
the graphs are asymmetric (section 1.6). 

Perhaps even more surprising is the result of Kučera (1987) that a modified
procedure yields a unique canonical labeling of almost all trivalent graphs (and of
graphs of bounded degree) in linear time. One of the difficulties in handling regular
graphs in linear time is how to achieve an initial coloring at all. Kučera achieves
this by considering the shortest cycles.

If we allow more time, a simpler way would be to individualize a vertex, i.e. 
to assign a unique color to it, thereby creating a nontrivial initial coloring. Even 
if subsequent refinements lead to complete partitioning into singletons, we still 
have to repeat the procedure for every vertex, thereby losing a factor of n in time.
One can also individualize a set of k vertices at once (giving each of them distinct
colors), thereby increasing the running time by a factor of n^k.

This combination is shown in Babai (1980b, 1981c) to succeed for strongly regular
graphs as well as for primitive coherent configurations with k ≤ 4√n log n (see
chapter 41, section 4).

A stronger refinement procedure was proposed in 1968 by Weisfeiler and
Lehman (see Weisfeiler 1976): they suggested coloring the set of ordered pairs
of vertices. Given an ordered partition V × V = C_1 ∪ ··· ∪ C_m into color classes
C_i, we associate with each pair (u, v) of vertices the list (i, β_{jk} : 1 ≤ j, k ≤ m), where
(u, v) ∈ C_i and β_{jk} counts those vertices w with (u, w) ∈ C_j and (w, v) ∈ C_k. Now
again order these lists lexicographically to obtain a refined coloring of V × V. The
initial coloring of V × V uses 3 colors: edges, non-edges, and the diagonal.
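
The pair refinement can be sketched along the same lines. The fragment below is our own illustration with a hypothetical function name; instead of the full list (β_{jk}) it stores only the nonzero counts, sorted by (j, k), which yields the same partition although the numbering of the colors may differ from a literal lexicographic ordering of the full lists.

```python
def weisfeiler_lehman_pairs(n, edges):
    """Stable coloring of V x V by pair refinement (vertices 0..n-1)."""
    edge_set = {frozenset(e) for e in edges}
    # Initial colors: 0 = diagonal, 1 = edge, 2 = non-edge.
    color = {
        (u, v): 0 if u == v else (1 if frozenset((u, v)) in edge_set else 2)
        for u in range(n) for v in range(n)
    }
    while True:
        m = len(set(color.values()))
        signature = {}
        for (u, v), i in color.items():
            # beta[(j, k)] counts the w with (u, w) in C_j and (w, v) in C_k.
            beta = {}
            for w in range(n):
                key = (color[(u, w)], color[(w, v)])
                beta[key] = beta.get(key, 0) + 1
            signature[(u, v)] = (i, tuple(sorted(beta.items())))
        ranks = {s: r for r, s in enumerate(sorted(set(signature.values())))}
        color = {pair: ranks[signature[pair]] for pair in color}
        if len(ranks) == m:  # no class was split: the coloring is stable
            return color
```

For the pentagon, which is strongly regular, the three initial classes are already stable; for a path on three vertices the diagonal entries separate the middle vertex from the two endpoints.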

The class of graphs for which no refinement is obtained is the strongly regular
graphs. In general, the stable partitions for the Weisfeiler–Lehman procedure are
precisely the coherent configurations (chapter 15, section 3).

One can generalize the Weisfeiler–Lehman procedure to partitioning the set
V^d of ordered d-tuples in an analogous way. The stable configuration obtained is
canonical and the question is, for what d is the resulting partition of the diagonal
necessarily the orbit partition of the vertex set. Such a d would yield a canonical
form computable in O(n^{d+1}) time.

The Cameron–Gel’fand theorem (Theorem 5.7) implies that for d ≥ 5, at least
one nontrivial partition occurs in all cases except for the unions of complete
graphs of equal size and the complements thereof. The result of Babai (1980b) mentioned
above implies that d = O(√n log n) completely succeeds for strongly regular
graphs.

Yet a surprising negative result of Cai, Fürer and Immerman (1992) dashed
the hopes for a purely combinatorial isomorphism test in moderately exponential
(exp(n^{1−c})) time. They construct a pair of nonisomorphic graphs which force d =
Ω(n) in order for the Weisfeiler–Lehman procedure for d-tuples to distinguish them.

Their counterexample still leaves ample room for a combination of combinato- 
rial and group theoretic methods to work. Their graphs are partitioned into vertex 
classes of size 4, and, as mentioned before, the simplest group theoretic method, 
based on Babai (1979c), yields canonical forms for graphs with bounded color 
classes in polynomial time. 

We should mention that the current best timing for isomorphism testing and
canonical forms for general graphs, exp(O(√(n log n))), is obtained by combining
Luks’ group theoretic method with a combinatorial trick of Zemlyachenko (see
Zemlyachenko et al. 1985) (cf. Babai 1981a). Since Zemlyachenko’s method does
not apply for instance to 3-uniform hypergraphs, the best bound for isomorphism
testing within this class is C^n (Luks, cf. Babai and Luks 1983).


6.5. Reductions, isomorphism complete problems, Luks equivalence class 


The graph isomorphism problem (ISO for short) is polynomial time equivalent to 
the isomorphism problem for directed, vertex and edge-colored graphs (isomor- 
phisms preserve colors by definition), and more generally to explicit structures 
with a set of relations of arbitrary arities. This can be proven by the method of 
encoding colors into gadgets as in Frucht’s theorem (cf. Hedrlín and Pultr 1966,
Miller 1979). A number of restricted classes 𝒞 are known to be isomorphism complete,
i.e. ISO can be reduced to isomorphism within 𝒞. These include commutative
semigroups, k-connected regular bipartite graphs with or without Hamilton cycles,
graphs with large girth and chromatic number, etc. Exceptions are those classes
which are known to have subexponential (exp(n^{o(1)})) isomorphism tests (groups,
latin squares, tournaments, polynomial time testable classes), as well as strongly
regular graphs.

The following problems are also known to be equivalent to ISO (see 
Mathon 1979). Given a graph, determine (i) the orbits of Aut(X); (ii) generators 
of Aut(X); (iii) (Babai and Mathon) the order of Aut(X). 

Observe that (ii), if applied to the union of a pair of connected
graphs, yields an isomorphism.

Luks found another, related equivalence class of group theoretic problems. Let
G, H ≤ Sym(Ω) be permutation groups given by a list of generators. The following
problems are polynomial time equivalent: (a) find (generators for) G ∩ H (group
intersection); (b) given an element σ ∈ Sym(Ω), decide whether or not G ∩ Hσ = ∅
(coset intersection); (c) given a subset A ⊆ Ω, find the set-stabilizer of A in G; (d)
given A ⊆ Ω and σ ∈ Sym(Ω), decide whether the set-stabilizer of A intersects the
coset Gσ; (e) given σ, τ ∈ Sym(Ω), decide whether or not σ belongs to the double
coset GτH; (f) given τ ∈ G, find the centralizer of τ in G; (g) given σ, τ ∈ G, decide
whether or not the centralizer of τ in Sym(Ω) intersects Gσ.

(Note that if “set-stabilizer” is replaced by “pointwise set stabilizer” in problems 
(c) and (d), they become polynomial time solvable.) 


Proposition 6.1 (Luks). ISO reduces to coset intersection in polynomial time.

For simplicity we show instead how to reduce the determination of Aut(X) to
group intersection. Let X = (V, E) be a graph and let Ω be the set of unordered
pairs from V. Let G ≤ Sym(Ω) denote the induced action of Sym(V) on pairs; and
let H = Sym(E) × Sym(Ω \ E) ≤ Sym(Ω) be the set stabilizer of E in Sym(Ω).
Then obviously, the induced action of Aut(X) on Ω is G ∩ H. □
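
The reduction can be checked by brute force on a small example. In the sketch below (our own illustration, with hypothetical function names) the group G is listed explicitly as the action of all of Sym(V) on pairs and is then filtered by membership in H; this is of course exponential and only verifies the identity Aut(X) = G ∩ H, it is not an intersection algorithm. We assume |V| ≥ 3, so that the action on pairs is faithful.

```python
from itertools import combinations, permutations


def automorphisms_via_intersection(n, edges):
    """Aut(X) obtained as G ∩ H in the action on unordered pairs (n >= 3)."""
    omega = list(combinations(range(n), 2))      # unordered pairs, sorted tuples
    E = {tuple(sorted(e)) for e in edges}

    def induced(pi):
        """Image of every pair in omega under the vertex permutation pi."""
        return tuple(tuple(sorted((pi[u], pi[v]))) for (u, v) in omega)

    # G: the action of Sym(V) on pairs, remembering the vertex permutation.
    G = {induced(pi): pi for pi in permutations(range(n))}

    def in_H(sigma):
        """sigma lies in Sym(E) x Sym(omega minus E) iff it maps E onto E."""
        return all((p in E) == (q in E) for p, q in zip(omega, sigma))

    return sorted(pi for sigma, pi in G.items() if in_H(sigma))


def automorphisms_direct(n, edges):
    """Aut(X) straight from the definition, for comparison."""
    E = {frozenset(e) for e in edges}
    return sorted(
        pi for pi in permutations(range(n))
        if all(frozenset((pi[u], pi[v])) in E for (u, v) in edges)
    )
```

For the 4-cycle both functions return the same 8 permutations, the dihedral group of order 8.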


It is significant that there is strong theoretical evidence suggesting that the deci- 
sion problems in the Luks equivalence class ((b), (d), (e), (g)) are not NP-complete 
(Goldreich et al. 1986, Babai and Moran 1988). If any of these problems (and 
therefore each of them) were NP-complete, this would imply the collapse of the 
“polynomial time hierarchy” in complexity theory, just as NP-completeness of ISO 
would (cf. section 6.1). 

Even more significantly, subcases of ISO can be reduced to polynomial time 
solvable subcases of coset intersection, and thereby they become polynomial time 
solvable themselves. This is one of the fundamental observations in Luks’ (1982) 
seminal paper. 


6.6. Groups with restricted composition factors 


In this section, we sketch the proof of the main result of Luks (1982). 


Theorem 6.2 (Luks). Isomorphism of graphs of bounded degree can be tested in
polynomial time.


Recall that we used Γ_d to denote the class of groups with a chain of subgroups
G = G_0 ≥ ··· ≥ G_m = 1 such that |G_{i−1} : G_i| ≤ d. This is the class of groups which
occurs as edge-stabilizers in connected graphs of degree ≤ d + 1 (Theorem 4.4
(c)).

Using the trivial direction of this characterization, Luks reduced isomorphism
of graphs of degree ≤ d + 1 to set stabilizers within a coset Gσ (G ≤ Sym(Ω),
σ ∈ Sym(Ω)), where G ∈ Γ_d and G is given by a list of generators. Next, he solved
the latter problem in polynomial time, inventing a permutation group version
of the classical algorithmic technique of “divide and conquer”. The idea is to
solve the problem one orbit at a time, reducing to a sub-coset in each round.
For transitive G, we break Ω into blocks of imprimitivity; let N be the stabilizer
of a system of maximal blocks. Now G/N acts as a primitive group on
the blocks. Gσ is the union of |G/N| cosets of N, and we solve the problem
separately inside each coset. Formally, fix A ⊆ Ω, and for any G-invariant set
B ⊆ Ω let 𝒞(B, Gσ) = {τ ∈ Gσ : (A ∩ B)^τ = A ∩ B}. This set is either empty or
a coset of a subgroup of G. The identity 𝒞(B_1 ∪ B_2, Gσ) = 𝒞(B_1, 𝒞(B_2, Gσ)) is
used to reduce to the transitive case. For H ≤ G we have G = ⋃_i Hτ_i and thus
𝒞(B, Gσ) = ⋃_i 𝒞(B, Hτ_iσ); setting H = N, this can be used to reduce the imprimitive
case to the intransitive case.

The algorithm runs in polynomial time because of the following result. It is easy
to see that Γ_d can be characterized as the class of groups of which each composition
factor is a subgroup of the symmetric group S_d.

Theorem 6.3 (Babai, Cameron and Pálfy 1982). Let G ≤ S_n be a primitive group
of degree n and assume G ∈ Γ_d. Then |G| ≤ n^{cd} where c is an absolute constant.
More generally, if all alternating composition factors of G have bounded orders and 
all classical groups among the composition factors of G have bounded dimensions 
then |G| ≤ n^C for some constant C depending only on the bounds in the condition.


Note that in particular, primitive solvable groups have order < n^c, where c =
3.24399... (Pálfy 1982, Wolf 1982).

Turning back to Luks’ algorithm, Theorem 6.3 guarantees that |G/N| is polynomially
bounded, allowing a recurrence in timing with polynomially bounded
solution, completing the proof of Theorem 6.2. (We note that Theorem 6.3 was
not available to Luks at the time; instead, in the difficult affine case, he used the
second reduction step above with H a Sylow p-subgroup which he showed had
polynomially bounded index.) □


Isomorphism of tournaments can be decided in n^{O(log n)} time (Babai and Luks
1983). This algorithm uses the Pálfy–Wolf bound on primitive solvable groups
(above) (and the Feit–Thompson theorem through the solvability of the automorphism
groups of tournaments).


6.7. Basic permutation group algorithms 


We assume in this section that a permutation group G ≤ Sym(Ω) is given by a set S
of s generators; |Ω| = n. Some of the basic algorithmic problems to solve are testing
membership in G of a given σ ∈ Sym(Ω); determining the order of G; constructing
the normal closure of a subgroup (also given by a list of generators). Once these are 
solved, solvability and nilpotence of G are easily decided. In his pioneering work 
in computational group theory, Sims (1970, 1971, 1978) constructed algorithms for 
these problems which ran fast in practice and were later asymptotically analysed 
to run in polynomial time in the worst case (see below). 

Theory and practice diverge in the areas of more advanced problems, including 
determining the center, the composition factors, the Sylow subgroups. All these 
problems are now solvable in polynomial time. The elegant construction of a com- 
position chain and the composition factors (Luks 1987) uses the O’Nan-Scott The- 
orem (chapter 12) and requires CFSG (the classification of finite simple groups) 
through Schreier’s hypothesis (the outer automorphism group of a simple group is 
solvable). Beals (1993b) has recently found an elementary algorithm for compo- 
sition factors. Kantor’s (1985a,b) construction of the Sylow subgroups starts with 
finding a composition chain via Luks (1987) and rests on detailed knowledge of 
CFSG and a case-by-case study of the classical groups. Luks’ (1987) algorithm to 
find the center is elementary. 

Many other important problems are not known to be solvable in polynomial 
time, and in fact often they are at least as hard in general as graph isomorphism 
(centralizers, intersections, cf. section 6.5). Particularly efficient backtrack procedures
have recently been found and implemented by Leon (1991), using partitioning
heuristics (cf. section 6.4). Such procedures are often used even for problems
solvable in polynomial time (e.g., finding the center by repeated application of a
backtrack routine for centralizers), showing a discrepancy between theoretical and
practical measures of efficiency.

For the rest of this section we return to the complexity analysis of the basic 
problems. Given a chain $G = G_0 \ge G_1 \ge \cdots \ge G_m = 1$ of subgroups, a strong generating
set (SGS) with respect to this chain is a set $T \subseteq G$ such that $\langle T \cap G_i \rangle = G_i$
for every i. This concept was introduced by Sims (1970) (with respect to the stabi- 
lizer chain) as the fundamental data structure for permutation group algorithms. 
(Recent algorithms often operate on different chains of subgroups; however, it is 
possible to switch efficiently from any SGS to one in Sims’ sense, Cooperman et 
al. 1990.) Given an SGS, the problems of membership and order can be solved 
easily, a presentation (in terms of generators and relations) can be deduced, and 
slight variations of the SGS methods yield normal closures as well. Variants of 
Sims’ method have been shown to run in polynomial time ($O(n^6 + sn^2)$ (Furst,
Hopcroft and Luks 1980) and $O(n^5 + sn^2)$ (Knuth 1991, Jerrum 1986)). These elementary
algorithms require $\Omega(n^5)$ even on average on large classes of examples
(Knuth 1991).
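
As a rough illustration of how an SGS along the point-stabilizer chain supports membership and order computations, the following Python sketch fills a table of coset representatives by sifting, in the spirit of the elementary variants cited above. It is not taken from any of the cited papers and makes no attempt to match the stated running times. The function names (`sift`, `build_table`, `group_order`, `is_member`) and the encoding of a permutation of {0, ..., n-1} as a tuple of images are assumptions made for this example only.

```python
def compose(p, q):
    """Return the permutation 'first p, then q'; p[x] is the image of x."""
    return tuple(q[p[x]] for x in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


def sift(table, g):
    """Strip g level by level using the stored coset representatives.

    Returns (residue, level). The level is None if g reduces to the
    identity, otherwise it is the first level with no usable representative.
    """
    for i in range(len(g)):
        j = g[i]
        if j == i:
            continue
        if j not in table[i]:
            return g, i
        g = compose(g, inverse(table[i][j]))  # g now fixes 0..i pointwise
    return g, None


def build_table(gens, n):
    """table[i][j] is a group element fixing 0..i-1 and mapping i to j."""
    table = [dict() for _ in range(n)]
    queue = list(gens)
    while queue:
        g, level = sift(table, queue.pop())
        if level is None:
            continue
        table[level][g[level]] = g
        # Every product of two stored entries must eventually sift through.
        for row in table:
            for t in list(row.values()):
                queue.append(compose(g, t))
                queue.append(compose(t, g))
    return table


def group_order(table):
    """Product of the orbit sizes along the stabilizer chain."""
    order = 1
    for row in table:
        order *= len(row) + 1  # +1 for the implicit identity representative
    return order


def is_member(table, g):
    return sift(table, g)[1] is None


s4 = build_table([(1, 2, 3, 0), (1, 0, 2, 3)], 4)      # a 4-cycle and a transposition
klein = build_table([(1, 0, 3, 2), (2, 3, 0, 1)], 4)   # two commuting involutions
print(group_order(s4), group_order(klein))             # 24 4
```

The stored entries, taken together, form a strong generating set for the chain of pointwise stabilizers of 0, 1, 2, ...; the closure loop is deliberately naive and is only meant for very small degrees.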

Better asymptotic bounds have been obtained using heavy guns. For two functions
f, g let us write f(n) = O~(g(n)) if for sufficiently large n, $f(n) \le g(n) \log^c n$
for some constant c. With this notation, the best current deterministic asymptotic 
worst case bound is O~(sn*) (Babai, Luks and Seress 1993). This bound depends on 
CFSG primarily through estimates of the orders of primitive permutation groups 
(chapter 12, Theorem 5.8, cf. Cameron 1981). With randomization we can do considerably
better and have an entirely elementary O~($n^3 + sn$) Monte Carlo algorithm
to construct an SGS (Babai, Cooperman, Finkelstein, Luks and Seress 1991).
(Being Monte Carlo, the algorithm does not guarantee to construct an SGS but
it does so with arbitrarily large probability.) The algorithm includes a particularly
efficient normal closure routine, running in O~($n^2 + sn$). The basic technique of
the algorithm generalizes the following observation: Let $g_1, \ldots, g_k \in G$ generate G
and let H be a proper subgroup of G. Then the probability that $h \notin H$ for a random
subproduct $h = g_1^{\varepsilon_1} \cdots g_k^{\varepsilon_k}$ is $\ge 1/2$. (The $\varepsilon_i \in \{0, 1\}$ are selected by independent
unbiased coin-flips.)
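
The random subproduct observation can be checked exhaustively on a tiny example. The sketch below enumerates all coin-flip outcomes instead of sampling, so the probability it reports is exact; the helper names and the choice of example (three adjacent transpositions generating the symmetric group of degree 4, with the even permutations as proper subgroup) are assumptions made for illustration.

```python
from itertools import product


def compose(p, q):
    """Return the permutation 'first p, then q'; p[x] is the image of x."""
    return tuple(q[p[x]] for x in range(len(p)))


def subproduct(gens, eps):
    """Return g_1^{e_1} ... g_k^{e_k} for a 0/1 vector eps."""
    h = tuple(range(len(gens[0])))
    for g, e in zip(gens, eps):
        if e:
            h = compose(h, g)
    return h


def escape_probability(gens, in_subgroup):
    """Exact probability that a random subproduct lies outside the subgroup."""
    outcomes = list(product((0, 1), repeat=len(gens)))
    outside = sum(1 for eps in outcomes if not in_subgroup(subproduct(gens, eps)))
    return outside / len(outcomes)


def is_even(p):
    """Parity of a permutation, read off from its cycle lengths."""
    seen, even = set(), True
    for start in range(len(p)):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = p[x]
            length += 1
        if length and length % 2 == 0:  # an even-length cycle is an odd permutation
            even = not even
    return even


# Adjacent transpositions (0 1), (1 2), (2 3).
GENS = [(1, 0, 2, 3), (0, 2, 1, 3), (0, 1, 3, 2)]
print(escape_probability(GENS, is_even))  # 0.5
```

Here the bound is attained: a subproduct is odd exactly when an odd number of generators is kept, which happens for half of the outcomes.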

A base of G is a set $B \subseteq \Omega$ such that the pointwise stabilizer of B in G is
the identity. Let $\mu(G)$ be the minimum size of a base. The case of small $\mu(G)$
is of particular interest. For instance, if G is simple non-alternating then $\mu(G) = O(\log n)$.
It is easy to see that $2^{\mu(G)} \le |G| \le n^{\mu(G)}$. Let us say that a class $\mathcal{G}$ of
groups has small bases if $\mu(G) = (\log n)^{O(1)}$ for $G \in \mathcal{G}$. Sims style algorithms run
in O~($sn^2$) on groups with a small base. Using new combinatorial techniques,
elementary Monte Carlo algorithms have been found which construct an SGS in
nearly linear, O~(sn) time for small base groups (Babai, Cooperman, Finkelstein
and Seress 1991). The speedup relies on methods capable of handling chains of
certain subsets of G which are not subgroups; the subgroup structure of small base
groups tends to be too coarse to allow nearly linear time. The key new ingredients
are an efficient implementation of Sims’ Schreier vector data structure to store
coset representatives in a shallow tree (depth guaranteed to be $\le \log |G|$) via an
algorithmic version of the Reachability theorem (Theorem 6.4); and the use of the
Local expansion property (Theorem 3.41) to rapidly locate new elements if the
current partial SGS misses a substantial portion of G.

Finding domains of imprimitivity seems indispensable when delving deeper into 
the group structure. Atkinson’s (1975) algorithm finds them in quadratic time. For 
small base groups, Beals (1993a) improved this to nearly linear time, and in a tour 
de force, used this with Seress to find a composition series in nearly linear time 
(Beals and Seress 1992). 

A final note on parallelization. An NC algorithm uses $n^{O(1)}$ parallel processors
and extremely short, $(\log n)^{O(1)}$ time, where n is the length of the input. (So none
of the processors has time to read any substantial portion of the input; cf. chapter
29.) Radical departure from the classical methods has allowed the design of an 
NC algorithm to construct an SGS and solve some of the basic problems in NC, 
including membership, order, normal closures, solvability, center, composition fac- 
tors (Babai et al. 1987). Again, the algorithm uses CFSG mainly through Theorem 
5.8 of chapter 12, and also requires Luks’ composition factors algorithm. The al- 
gorithm digs deeply into the normal structure of G. Even the rudimentary task of 
membership testing requires determining the composition series first. 


6.8. Complexity of related problems 


Problems related to graph isomorphism (“ISO” for short) and permutation group 
membership fall into a variety of complexity classes. Groups, semigroups will be 
given by a list of generators, unless otherwise stated. 

A surprising result of Lubiw (1981) asserts that the following problem is NP- 
complete: Does a given permutation group have a fixed-point-free element? Even 
the case when G is an elementary abelian 2-group is NP-complete. Lalonde (1981)
used this to show the following problem NP-complete: Does a given bipartite graph
have an automorphism of order 2 interchanging the two color classes? In contrast,
if we omit the “order 2” restriction, the problem becomes isomorphism complete
(equivalent to ISO). The original (equivalent) statement of Lalonde’s theorem is
this: The star system problem is NP-complete. The “star system problem” has a
family $\mathcal{F}$ of n subsets of an n-set V for input and asks if there exists a graph
X = (V, E) such that $\mathcal{F} = \{X(v) : v \in V\}$ is the family of vertex neighborhoods in
X.

Isomorphism of groups of order n, given by their Cayley tables, can be decided 
in time $n^{\log_2 n + O(1)}$ because the groups are generated by $\le \log_2 n$ elements and any
mapping of the generators can be extended to a homomorphism in at most one 
way. This argument generalizes to quasigroups which in turn include Steiner triple 
systems. 

To decide isomorphism of permutation groups is at least as hard as ISO 
(Babai, Kannan and Luks 1994). On the other hand this problem is in NP for 
the following simple reason: Let $G = \langle S \rangle \le \mathrm{Sym}(A)$ and $H = \langle T \rangle \le \mathrm{Sym}(B)$ be
permutation groups and $f : S \to \mathrm{Sym}(B)$ a map. Then f extends to an isomorphism
of G onto H if and only if the following two polynomial time testable conditions
hold: (i) H is generated by the f-image of S; (ii) the orders of G, H, and the
group $\langle (s, f(s)) : s \in S \rangle$ agree. On the other hand, isomorphism of permutation
groups also belongs to the class coAM (Babai, Kannan and Luks 1994) (cf. section 
6.1) and is therefore unlikely to be NP-complete. 

If $G, H, K \le \mathrm{Sym}(A)$ and $\sigma \in \mathrm{Sym}(A)$ then the double coset membership problem
“$\sigma \in GH$?” belongs to the Luks equivalence class (is equivalent to coset intersection)
(section 6.5). On the other hand, the question “$\sigma \in GHK$?” is NP-complete
(Luks).

The membership problem for semigroups of transformations of a finite set is 
PSPACE-complete (Kozen 1977). 

The membership problem for d x d integral matrices is undecidable already for 
d = 4. This is immediate from the following result of Mihailova (1958): The membership
problem is undecidable for subgroups of $F_2 \times F_2$, where $F_2$ is the free group
of rank 2. However, finiteness of an integral matrix group (or a matrix group over 
an algebraic number field) can be decided in polynomial time (Babai et al. 1993), 
and if the group is finite, the usual basic questions (order, center, composition 
chain, Sylow subgroups) can be answered in Las Vegas polynomial time (Beals 
and Babai 1993). (A Las Vegas algorithm uses randomization but never outputs a 
wrong answer.) 

For finite groups, the membership problem is in NP under quite general con- 
ditions. A black box group is, informally, a group whose elements are encoded 
by (0,1)-strings of uniform length, and the group operations are performed by a 
“black box”. (As all our groups, a black box group is given by a list of generators.) 
Then membership is in NP, relative to the black box. In particular, membership in 
matrix groups over finite fields is in NP. This is immediate from the following combinatorial
result. A straight line program reaching a group element $g \in G$ from a
set S of generators of G is a sequence $g_1, \ldots, g_m$ of elements of G such that $g_m = g$,
and for each i, either $g_i \in S$, or $g_i = g_j^{-1}$ or $g_i = g_j g_k$ for some j, k < i. The cost of
such a program is the number of inversions and multiplications (the calls to S are 
free). The straight line cost of g € G (relative to S) is the minimum cost of straight 
line programs reaching g from S. 


Theorem 6.4 (Reachability theorem, Babai and Szemerédi 1984). Given any set S
of generators of a group G of order n, the straight line cost of any $g \in G$ is less than
$(1 + \log_2 n)^2$.
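
To make the notion of straight line cost concrete, here is a small Python sketch that evaluates a straight line program and counts its inversions and multiplications. The instruction encoding (`"gen"`, `"inv"`, `"mul"`), the helper `power_program`, and the choice of the multiplicative group modulo 101 as the ambient group are assumptions made for this example; the program shown is ordinary square-and-multiply, not the construction used in the proof of Theorem 6.4. The modular inverse uses `pow(a, -1, m)`, which requires Python 3.8 or later.

```python
from math import log2

MODULUS = 101  # prime, so the nonzero residues form a group of order 100


def mul_mod(a, b):
    return a * b % MODULUS


def inv_mod(a):
    return pow(a, -1, MODULUS)


def run_slp(program, mul, inv):
    """Evaluate a straight line program; return (last value, cost).

    Steps are ("gen", s), ("inv", j) or ("mul", j, k), where j and k
    index earlier steps. Calls to generators are free.
    """
    values, cost = [], 0
    for step in program:
        if step[0] == "gen":
            values.append(step[1])
        elif step[0] == "inv":
            values.append(inv(values[step[1]]))
            cost += 1
        elif step[0] == "mul":
            values.append(mul(values[step[1]], values[step[2]]))
            cost += 1
        else:
            raise ValueError(f"unknown step {step!r}")
    return values[-1], cost


def power_program(gen, exponent):
    """Square-and-multiply program reaching gen**exponent (exponent >= 1)."""
    program = [("gen", gen)]
    current = 0  # index of the step holding the running power
    for bit in bin(exponent)[3:]:  # binary digits after the leading 1
        program.append(("mul", current, current))
        current = len(program) - 1
        if bit == "1":
            program.append(("mul", current, 0))
            current = len(program) - 1
    return program


value, cost = run_slp(power_program(2, 37), mul_mod, inv_mod)
print(value == pow(2, 37, MODULUS), cost, (1 + log2(100)) ** 2)
```

For this element the program costs 7 operations, comfortably below the bound of the theorem evaluated at n = 100; the theorem itself is far stronger, since it applies to every element and every generating set of an arbitrary finite group.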


We conjecture that membership in matrix groups also belongs to coNP. The proof 
of this statement and the stronger statement that the order of a matrix group over 
a finite field belongs to NP (i.e. the correct order has polynomial time verifiable 
certificates) depends, in essence, on the following conjecture. 


Short presentation conjecture. Every group of order n has a presentation (in
terms of generators and relations) of length $(\log n)^{O(1)}$.


(The length of a presentation is the total number of characters required to write
down the presentation.) It follows from Theorem 6.4 that it suffices to prove this
conjecture for simple groups. All cases have been confirmed with the exception
of the rank 1 simple groups of twisted Lie type (unitary, Suzuki, Ree) (Babai,
Goodman, Kantor, Luks and Pálfy 1994).

None of the problems mentioned in this section, with the possible exception of 
isomorphism of groups given by a Cayley table, is expected to have polynomial 
time solution. In particular, the membership problem for 1 x 1 matrix groups is a 
close relative of the discrete logarithm problem (given $a, b \in GF(q)$, find an integer
x such that $a^x = b$ or decide that no such x exists) which is not expected to be
solvable in polynomial time (cf. Adleman and DeMarrais 1993).
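
For very small fields the discrete logarithm problem can of course be settled by exhaustive search, which also decides membership of b in the cyclic group generated by a. The sketch below assumes a prime field (so q = p is prime and arithmetic is modulo p) and a nonzero a; the function name is hypothetical. Its running time is proportional to p, that is, exponential in the number of bits of p, which is exactly why it says nothing about polynomial time solvability.

```python
def discrete_log(a, b, p):
    """Smallest x >= 0 with a**x = b (mod p), or None if b is not a power of a.

    Brute force over the exponents 0..p-2; assumes p is prime and a is not
    divisible by p, so the order of a divides p - 1.
    """
    target = b % p
    value = 1
    for x in range(p - 1):
        if value == target:
            return x
        value = value * a % p
    return None


print(discrete_log(2, 9, 11))  # 6
print(discrete_log(3, 2, 11))  # None
```

The second call fails because 3 has order 5 modulo 11 and its powers 1, 3, 9, 5, 4 do not include 2.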

Modulo this obstacle, however, a great deal of structure can be found in matrix
groups and even in black box groups (Beals and Babai 1993). 


7. The reconstruction problem 


All graphs in this section are finite unless otherwise stated. 

In the Introduction to this chapter we gave a general definition of reconstructibility
and discussed a number of instances. Examples include Whitney’s theorems on
the reconstructibility of graphs from their line graphs (with known exceptions) (sec- 
tion 1.2), of 3-connected graphs from their cycle matroids (cf. chapter 11, section 
7), and from many other functions of graphs (the area of graph equations comes 
under this heading, see Cvetković 1979). The unsettled status of the Graph isomorphism
problem is related to the non-reconstructibility from any of the known
polynomial time computable invariants. 

While reconstruction problems (solved and unsolved) seem to pop up in nearly 
every topic considered, the term “Reconstruction problem” has been reserved for 
a single notorious member of this species in graph theory: the Kelly-Ulam recon- 
struction conjecture. It is this problem to which this brief last section is devoted. 
For more information and references we refer to the surveys mentioned in the 
preface to this chapter. 


7.1. Vertex reconstruction 


With every graph X = (V, E) we associate the multiset $D^v(X)$ of isomorphism
types of its one-vertex-deleted subgraphs, i.e. the isomorphism type of $X \setminus v$ for
each $v \in V$. We call $D^v(X)$ the deck of 1-vertex-deleted subgraphs. Analogously
one can define the multiset $D^e(X)$, the deck of 1-edge-deleted subgraphs, and
more generally, $D^v_k(X)$ and $D^e_k(X)$, the decks of k-vertex-deleted (k-edge-deleted,
resp.) subgraphs.
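
For very small graphs the deck of 1-vertex-deleted subgraphs can be computed directly. The following Python sketch represents an isomorphism type by a brute-force canonical form (the lexicographically smallest relabelled edge list), which takes time factorial in the number of vertices and is meant purely as an illustration; the function names and the representation of a graph as a vertex count plus an edge list are assumptions of this example.

```python
from collections import Counter
from itertools import permutations


def canonical_form(n, edges):
    """Isomorphism type of a graph on vertices 0..n-1, by brute force."""
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted(
            tuple(sorted((perm[u], perm[v]))) for u, v in edges
        ))
        if best is None or relabeled < best:
            best = relabeled
    return (n, best)


def vertex_deck(n, edges):
    """Multiset of isomorphism types of the one-vertex-deleted subgraphs."""
    deck = Counter()
    for v in range(n):
        kept = [u for u in range(n) if u != v]
        index = {u: i for i, u in enumerate(kept)}  # renumber to 0..n-2
        sub = [(index[a], index[b]) for a, b in edges if v not in (a, b)]
        deck[canonical_form(n - 1, sub)] += 1
    return deck


PATH = [(0, 1), (1, 2), (2, 3)]          # path on 4 vertices
PATH_RELABELED = [(2, 0), (0, 3), (3, 1)]
STAR = [(0, 1), (0, 2), (0, 3)]          # star with 3 leaves

print(vertex_deck(4, PATH) == vertex_deck(4, PATH_RELABELED))
print(vertex_deck(4, PATH) == vertex_deck(4, STAR))
```

Isomorphic graphs always have equal decks; the question discussed in this section is whether equal decks force isomorphism.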

The graph X is vertex-reconstructible (or simply reconstructible) if it is determined
(up to isomorphism) by $D^v(X)$. Edge-reconstructibility is defined analogously.
More generally we say that the graph invariant f(X) (cf. section 6.3) is
vertex-reconstructible if f(X) is determined by $D^v(X)$. The Reconstruction conjecture
says that all finite graphs with $\ge 3$ vertices are reconstructible (Kelly and Ulam
in 1942).

The answer to the analogous question for directed graphs is negative: an infinite
family of pairs of non-isomorphic tournaments with identical decks has been found 
by Stockmeyer (1977). 

It is known that almost every graph is vertex-reconstructible (Erdős). Indeed, this
is an immediate consequence of the fact that almost every graph X has the following
property: no pair of two-vertex-deleted subgraphs of X are isomorphic. This
argument generalizes to smaller subgraphs: almost all graphs are reconstructible
from their k-vertex-deleted subgraphs for all $k < c \log n$ for some constant c > 0.

Some concrete classes of graphs are also known to be reconstructible. These 
include disconnected graphs, trees (Kelly in 1957), and some families of tree-like 
graphs. In particular, all graphs with < n edges are reconstructible. 

Among the reconstructible invariants, one should mention the degree sequence 
and a refinement of this: the sequence of degree sequences of the neighborhoods 
of the vertices (Nash-Williams 1978). Applying powerful counting techniques to 
reconstruction theory, Tutte (1979) has shown important polynomials associated 
with graphs to be reconstructible: the characteristic polynomial, the chromatic 
polynomial, and generalizations of these. 

The Reconstruction conjecture is false for infinite graphs (even for forests) but 
no counterexamples are known to the following variant, Halin’s conjecture: If two 
(finite or infinite) graphs with at least 3 vertices have the same deck of vertex- 
deleted subgraphs, then each is isomorphic to a subgraph of the other. 


7.2. Edge reconstruction 


It is known that a vertex-reconstructible graph with at least 4 edges is also edge- 
reconstructible (Greenwell 1971). In addition, however, large classes of graphs are 
known to be edge-reconstructible for which vertex-reconstructibility is open. The 
first result in this direction was Lovász’s (1972b) who proved that if a graph has
more edges than its complement then it is edge-reconstructible. Lovász’s proof
used a clever inclusion-exclusion argument which was the basis of rapid further
improvements. Müller (1977) showed that graphs with m edges and n vertices are
edge-reconstructible unless $2^{m-1} \le n!$, which means $m < n \log_2 n$. Nash-Williams
(1978) modified Müller’s proof and obtained the following lemma, from which
Müller’s bound is immediate.


Lemma 7.1 (Nash-Williams). Suppose that the graph X = (V, E) is not edge-
reconstructible. Then for every subset $A \subseteq E$ such that $|E \setminus A|$ is even, there exists
a permutation $\sigma \in \mathrm{Sym}(V)$ such that $E \cap E^\sigma = A$.


Lovász observed that this lemma has the following immediate consequence.

Corollary 7.2. If X = (V, E) is not edge-reconstructible then for every $T \subseteq E$,
$$|\{\sigma \in \mathrm{Sym}(V) : T^\sigma \subseteq E\}| \ge 2^{|E \setminus T| - 1}.$$


Pyber (1990) used this to derive that all Hamiltonian graphs are edge-
reconstructible, with possibly a finite set of exceptions. Indeed, by Corollary 7.2, a
nonreconstructible Hamiltonian graph with n vertices and m edges would have at
least $2^{m-n-2}/n$ Hamilton cycles. But this is too much: Pyber proves that no graph
has more than $c^{m-n}$ Hamilton cycles, where c = 1.977. □


The arguments used in the proofs of Lovász, Müller, Nash-Williams lend themselves
to a much more general treatment. The following framework was introduced
by Mnukhin (1987).

Let $G \le \mathrm{Sym}(\Omega)$ be a permutation group acting on the set $\Omega$. We say that two
subsets $\Delta_1, \Delta_2 \subseteq \Omega$ are G-isomorphic if $\Delta_1^\sigma = \Delta_2$ for some $\sigma \in G$. For any subset
$\Gamma \subseteq \Omega$ let $\Gamma^G$ be the G-orbit of $\Gamma$, i.e. the set of subsets of $\Omega$ that are G-isomorphic to $\Gamma$.

For $\Delta \subseteq \Omega$ let the k-deleted deck $D_k(\Delta)$ be the multiset of G-isomorphism
classes of the $(|\Delta| - k)$-element subsets of $\Delta$. The set $\Delta$ is k-reconstructible
if it is determined (up to G-isomorphism) by its k-deleted deck $D_k(\Delta)$.

In particular, taking $\Omega$ to be the set of $\binom{n}{2}$ pairs of elements of V and G to
be the induced action of Sym(V) on $\Omega$, the concept of G-isomorphism of subsets
of $\Omega$ becomes the ordinary isomorphism of graphs on the vertex set V; and k-
reconstructibility turns into the concept of reconstructibility from the deck of k-
edge-deleted subgraphs.

Generalizing Müller’s theorem, Mnukhin proves that if $\Delta \subseteq \Omega$ is not 1-
reconstructible then $2^{|\Delta| - 1} \le |G|$.

Below we indicate a linear algebra approach introduced by Godsil et al. (1987)
to extend Müller’s result to k-reconstructibility for $k \ge 2$. Their technique is easily
adapted to Mnukhin’s situation.

Recall that a hypergraph $\mathcal{F} \subseteq 2^\Omega$ is m-uniform if |E| = m for each $E \in \mathcal{F}$.


Definition. The Vapnik-Chervonenkis dimension or VC dimension of a hypergraph
$\mathcal{F} \subseteq 2^\Omega$ is the greatest integer t for which there exists a subset $A \subseteq \Omega$ with |A| = t
such that every subset $B \subseteq A$ occurs as $B = A \cap E$ for some $E \in \mathcal{F}$.


For $0 \le s \le n$ the s-inclusion matrix $I(\mathcal{F}, s)$ of a hypergraph $\mathcal{F} \subseteq 2^\Omega$ has rows
indexed by the members $F \in \mathcal{F}$, columns indexed by subsets $A \subseteq \Omega$ with |A| = s,
and entry 1 if $A \subseteq F$ and 0 otherwise. The s*-inclusion matrix $I^*(\mathcal{F}, s)$ has all the
columns of the t-inclusion matrices for t = 0, 1, 2, ..., s.

We say that $\mathcal{F}$ is s-independent if the rows of the s-inclusion matrix are linearly
independent (i.e. $I(\mathcal{F}, s)$ has full row-rank), and it is s*-independent if
the rows of $I^*(\mathcal{F}, s)$ are linearly independent. Clearly s-independence implies s*-
independence, and for uniform hypergraphs, the converse also holds (Frankl and
Wilson 1981).


Theorem 7.3 (Frankl and Pach 1983). If $\mathcal{F}$ is s*-dependent, then its VC dimension
is at least s + 1.
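
The definitions above are easy to experiment with on small set systems. The Python sketch below builds the s*-inclusion matrix, computes its rank over the rationals, and computes the VC dimension by exhaustive search, so that Theorem 7.3 can be observed on an example. All function names are assumptions made for this illustration, the family is assumed to be nonempty, and the search is exponential in the size of the ground set.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_matrix_star(family, ground, s):
    """Rows: members of the family; columns: all subsets of size 0..s."""
    columns = [frozenset(c) for t in range(s + 1) for c in combinations(ground, t)]
    return [[1 if a <= f else 0 for a in columns] for f in family]


def rank(matrix):
    """Rank over the rationals by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in r] for r in matrix]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                factor = rows[r][c] / rows[rk][c]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def vc_dimension(family, ground):
    """Largest t such that some t-subset of the ground set is shattered."""
    best = 0
    for t in range(1, len(ground) + 1):
        shattered = any(
            len({frozenset(a) & f for f in family}) == 2 ** t
            for a in combinations(ground, t)
        )
        if not shattered:
            break  # subsets of shattered sets are shattered, so we can stop
        best = t
    return best


GROUND = [0, 1]
POWER_SET = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]

m = inclusion_matrix_star(POWER_SET, GROUND, 1)   # 4 rows, 3 columns
print(rank(m), len(POWER_SET), vc_dimension(POWER_SET, GROUND))  # 3 4 2
```

With s = 1 the four rows are dependent (rank 3), and the VC dimension is 2 = s + 1, in agreement with the theorem.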


The proof follows from the proof of Corollary 4.2 in chapter 31. For a theory 
of the inclusion matrices, including this result, see Babai and Frankl 1992. 
The main lemma of Godsil et al. (1987) follows.

Lemma 7.4. If $\Delta_1$ and $\Delta_2$ have the same k-deleted deck $D_k(\Delta_i)$ but are not G-
isomorphic, then the m-uniform set-system $\mathcal{F} = \Delta_1^G \cup \Delta_2^G$ is (m − k)-dependent
(where $m = |\Delta_i|$).


Proof. We prove the dependence of the rows of $I(\mathcal{F}, m - k)$ by explicitly giving
coefficients c(E) ($E \in \mathcal{F}$) for a linear relation among them. For i = 1, 2 let $a_i =
|G_{\Delta_i}|$ (the size of the set-stabilizer of $\Delta_i$). If $E \in \Delta_1^G$ let $c(E) = a_1$, and if $E \in \Delta_2^G$
let $c(E) = -a_2$. To check that this linear combination of the rows is a zero row,
consider a column indexed by a set $T \subseteq \Omega$ with |T| = m − k. The column has
zeros except in the rows E with $T \subseteq E$. So the entry for this column in the indicated linear
combination of the rows will be $a_1$ times the number of $E \in \Delta_1^G$ with $T \subseteq E$, minus
$a_2$ times the number of $E \in \Delta_2^G$ with $T \subseteq E$. This is the number of $\sigma \in G$ for which
$T \subseteq \Delta_1^\sigma$ minus the number of $\sigma \in G$ for which $T \subseteq \Delta_2^\sigma$. But this difference is zero
because for every set T of size m − k, the number of $\sigma \in G$ for which $T^\sigma \subseteq \Delta_i$ is
independent of i. □


Using this lemma and Theorem 7.3 we infer the following generalization of
Müller’s inequality.


Theorem 7.5 (Godsil, Krasikov and Roditty 1987). If $\Delta \subseteq \Omega$ is not k-reconstructible,
then $2^{|\Delta| - k} \le |G|$.


Proof. Combining the foregoing results we obtain that for $\mathcal{F}$ as before, the VC-
dimension of $\mathcal{F}$ is $\ge m - k + 1$. Hence $|\mathcal{F}| \ge 2^{m-k+1}$, while clearly $|\mathcal{F}| \le 2|G|$. □


In particular we obtain that if a graph with n vertices and m edges is not k-
reconstructible then $2^{m-k} \le n!$, or $m < k + n \log_2 n$.

For k = 1 we also recover Lovász’s corollary to the Nash-Williams lemma
(slightly improved).


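Theorem 7.6. If $\Delta \subseteq \Omega$ is not 1-reconstructible, then for every $\Gamma \subseteq \Delta$,

$$|\{\sigma \in G : \Gamma^\sigma \subseteq \Delta\}| \ge 2^{|\Delta \setminus \Gamma|} - 1.$$


Proof. Let $\mathcal{F}$ be as before (now k = 1), with $\Delta_1 = \Delta$. Since its VC dimension is $\ge m - k + 1 =
m$, there is a set $A \subseteq \Omega$ with |A| = m of which every subset is its intersection
with some $E \in \mathcal{F}$. In particular, $A \in \mathcal{F}$. Now take any proper subset $\Gamma$ of $\Delta_1$.
Since $\Delta_1$ and $\Delta_2$ have the same 1-deleted deck, we also have $\Gamma^\sigma \subseteq \Delta_2$ for some
$\sigma \in G$, hence we have $\Gamma^\tau \subseteq A$ for some $\tau \in G$ (since $A \in \mathcal{F} = \Delta_1^G \cup \Delta_2^G$). And
$|\{\sigma \in G : \Gamma^\sigma \subseteq \Delta_1\}| = |\{\sigma \in G : \Gamma^\tau \subseteq A^\sigma\}|$. But this latter is at least the number
of proper subsets of A which contain $\Gamma^\tau$, because each of those is $A \cap E$ for some
$E \in \mathcal{F}$ (hence for some $E = A^\sigma$ since the proper subsets are in the 1-deleted deck
which $\Delta_1$ and $\Delta_2$ share). The latter number is $2^{m - |\Gamma|} - 1$.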


1. Introduction


Optimizing means finding the maximum or minimum of a certain function, called 
the objective function, defined on some domain. Classical theories of optimization 
(differential calculus, variational calculus, optimal control theory) deal with the 
case when this domain is infinite. From this angle, the subject of combinatorial 
optimization, where the domain is typically finite, might seem trivial: it is easy to 
say that “we choose the best from this finite number of possibilities”. But, of 
course, these possibilities may include all trees on n nodes, or all Hamilton
circuits of a complete graph, and then listing all possibilities to find the best 
among them is practically hopeless even for instances of very moderate size. In 
the framework of complexity theory (chapter 29), we want to find the optimum in 
polynomial time. For this (or indeed, to do better than listing all solutions) the 
special structure of the domain must be exploited. 

Often, when the objective function is too wild, the constraints too complicated, 
or the problem size too large, it is impossible to find an optimum solution. This is 
quite frequently not just a practical experience; mathematics and computer 
science have developed theories to make intuitive assertions about the difficulty of 
certain problems precise. Foremost of these is the theory of NP-completeness (see
chapter 29). 

In cases when optimum solutions are too hard to find, algorithms (so-called 
heuristics) can often be designed that produce approximately optimal solutions. It
is important that these suboptimal solutions have a guaranteed quality; e.g., for a 
given maximization problem, the value of the heuristic solution is at least 90% of 
the optimum value for every input. 

While not so apparent on the surface, of equal importance are algorithms, 
called dual heuristics, that provide (say for a maximization problem again) upper 
bounds on the optimum value. Dual heuristics typically solve so-called relaxations 
of the original problem, i.e., optimization problems that are obtained by dropping 
or relaxing the constraints so as to make the problem easier to solve. Bounds 
computed this way are important in the analysis of heuristics, since (both 
theoretically and in practice) one compares the value obtained by a heuristic with 
the value obtained by the dual heuristic (instead of the true optimum, which is 
unknown). Relaxations also play an important role in general solution schemes 
for hard problems like branch-and-bound. 

The historical roots of combinatorial optimization lie in problems in economics: 
the planning and management of operations and the efficient use of resources. 
Soon more technical applications came into focus and were modelled as com- 
binatorial optimization problems, such as sequencing of machines, scheduling of 
production, design and layout of production facilities. Today we see that discrete 
optimization problems abound everywhere. We encounter them in areas such as
portfolio selection, capital budgeting, design of marketing campaigns, investment
planning and facility location, political districting, gene sequencing, classification
of plants and animals, the design of new molecules, the determination of ground
states, layout of survivable and cost-efficient communication networks, positioning
of satellites, design and production of VLSI circuits and printed circuit
boards, sizing of truck fleets and transportation planning, layout of mass 
transportation systems and scheduling of buses and trains, assignment of workers 
to jobs such as drivers to buses and airline crew scheduling, design of unbreakable 
codes, etc. The list of applications seems endless; even in areas like sports, 
archeology or psychology, combinatorial optimization is used to answer important 
questions. We refer to chapter 35 for a detailed description of several real-world 
examples. 

There are basically two ways of presenting “combinatorial optimization”: by
problems or by methods. Since combinatorial optimization problems abound in 
this Handbook and many chapters deal with particular problems, discuss their 
practical applications and algorithmic solvability, we organize our material 
according to the second approach. We will describe the fundamental algorithmic 
techniques in detail and illustrate them on problems to which these methods have 
been applied successfully. 

Some important aspects of combinatorial optimization algorithms we can only 
touch in this chapter. One such aspect is parallelism. There is no doubt that the 
computers of the future will be parallel machines. A systematic treatment of 
parallel algorithms is difficult since there are many computer architectures, based 
on different principles, and each architecture leads to a different model of parallel 
computational complexity. One very general model of parallel computation is 
described in chapter 29. 

Another important aspect is the on-line solution of combinatorial problems. We 
treat here “static” problems, where all data are known before the optimization
algorithm is called. In many practical situations, however, data come in one by 
one, and decisions must be made before the next piece of data arrives. The 
theoretical modelling of such situations is difficult, and we refrain from discussing 
the many possibilities. 

A third disclaimer of this type is that we focus on the worst-case analysis of 
algorithms. From a practical point of view, average-case analysis, i.e., the analysis 
of algorithms on random input, would be more important; but for a mathematical 
treatment of this, one has to make assumptions about the input distribution, and 
except for very simple cases, such assumptions are extremely difficult to justify 
and lead to unresolvable controversies. 

Finally, we should mention the increasing significance of randomization. It is
pointed out in chapter 29 that randomization should not be confused with the 
average case analysis of algorithms. Randomization is often used in conjunction 
with determinant methods, and it is an important tool in avoiding “traps” 
(degeneracies), in reaching tricky “corners” of the domain, and in many other 
situations. We do discuss several of these methods; other issues like derandomiza- 
tion (transforming randomized algorithms into deterministic ones), or the re- 
liability of random number generators will, however, not be treated here. 

Combinatorial optimization problems typically have as inputs both
combinatorial structures and numbers (e.g., a graph
with weights on the edges). In the Turing machine model, both are encoded as
0-1 strings; but in the RAM machine model it is natural to consider the input as a
sequence of integers. If the input also involves a combinatorial structure, then the
combinatorial structure can be considered as a set of 0s and 1s. We denote by $\langle a \rangle$
the number of bits in the binary representation of the integer a; for a matrix
$A = (a_{ij})$ of integers, we define $\langle A \rangle = \sum_{i,j} \langle a_{ij} \rangle$.

An algorithm (with an input sequence of integers $a_1, \ldots, a_n$) runs in polynomial
time (short: is polynomial) if it can be implemented on the RAM machine so
that the number of bit-operations performed is bounded by a polynomial in the
number of input bits $\langle a_1 \rangle + \cdots + \langle a_n \rangle$. Considering the input as a set of numbers
allows two important versions of this notion.

An algorithm (with an input sequence of integers $a_1, \ldots, a_n$) is pseudopolynomial
if it can be implemented on the RAM machine so that the number of
bit-operations performed is bounded by a polynomial in $|a_1| + \cdots + |a_n|$. (This
can also be defined on the Turing machine model: it corresponds to polynomial 
running time where the encoding of an integer a by a string of a 1s is used. Thus a 
pseudopolynomial algorithm is also called polynomial in the unary encoding.) 
Clearly, every polynomial algorithm is pseudopolynomial, but not the other way 
around: testing primality in the trivial way by searching through all smaller 
integers is pseudopolynomial but not polynomial. 
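
The gap between the two notions is easy to see numerically. The sketch below counts the divisions performed by the trivial primality test and compares this with the encoding length of the input; the function names are hypothetical, and the convention used for the encoding length (the bit length of |a|, with no sign bit) is one possible reading of the definition above.

```python
def encoding_length(a):
    """Number of bits in the binary representation of |a| (assumed convention)."""
    return abs(a).bit_length()


def trial_division(a):
    """Test primality by trying every smaller integer d >= 2 as a divisor.

    Returns (is_prime, number_of_divisions_performed).
    """
    steps = 0
    for d in range(2, a):
        steps += 1
        if a % d == 0:
            return False, steps
    return a >= 2, steps


for a in (97, 1009, 10007):
    print(a, encoding_length(a), trial_division(a))
```

For a prime a the test performs a − 2 divisions, a quantity that is polynomial in a itself but roughly doubles with every additional input bit.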

An algorithm (with input integers $a_1, \ldots, a_n$) is strongly polynomial if it can be
implemented on the RAM machine in $O(n^c)$ steps with numbers of $O((\langle a_1 \rangle + \cdots + \langle a_n \rangle)^c)$
digits for some c > 0. Clearly every strongly polynomial algorithm is
polynomial, but not the other way around: e.g., the Euclidean algorithm is 
polynomial but not strongly polynomial. On the other hand, Kruskal’s algorithm 
for shortest spanning trees is strongly polynomial. 


Further reading. Bachem et al. (1983), Ford and Fulkerson (1962), Gondran and 
Minoux (1979), Grötschel et al. (1988), Lawler et al. (1985), Nemhauser et al.
(1989), Nemhauser and Wolsey (1988), Schrijver (1986). 


2. The greedy algorithm 


Kruskal’s algorithm and matroids. The most natural principle we can try to build 
an optimization algorithm on is greediness: building up the solution by making the 
best choice locally. As the most important example of an optimization problem
where this simple idea works, let us recall Kruskal’s algorithm for a shortest
spanning tree from chapters 2, 9 and 40.

Given a connected graph G with n nodes, and a length function $c: E(G) \to \mathbb{Z}$,
we want to find a shortest spanning tree (where the length of a subgraph of G is
defined as the sum of the lengths of its edges). The algorithm constructs the tree
by finding its edges $e_1, e_2, \ldots, e_{n-1}$ one by one:

• $e_1$ is an edge of minimum length;
• $e_k$ is an edge such that $e_k \notin \{e_1, \ldots, e_{k-1}\}$, $\{e_1, \ldots, e_k\}$ is a forest and

$$c(e_k) = \min\{c(e) \mid e \notin \{e_1, \ldots, e_{k-1}\} \text{ and } \{e_1, \ldots, e_{k-1}, e\} \text{ is a forest}\}.$$

Each step of the algorithm makes locally the best choice: this is why it is called
a greedy algorithm. Kruskal’s theorem (going back actually to Borůvka in 1926;
see Graham and Hell (1985) for an account of its history) asserts that
$\{e_i : 1 \le i \le n-1\}$ is a shortest spanning tree of G, i.e., the spanning tree constructed by the
greedy algorithm is optimal.
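
The construction can be sketched in a few lines of Python. This is an illustrative sketch under our own assumptions: nodes are numbered 0, ..., n-1, edges are given as `(length, u, v)` triples, and the test "is still a forest" is done with a simple union-find structure, a standard device that the text does not prescribe. The name `kruskal` is ours.

```python
def kruskal(n, edges):
    """Return the edges of a shortest spanning tree of a connected graph.

    n     -- number of nodes, labelled 0 .. n-1
    edges -- list of (length, u, v) triples
    """
    parent = list(range(n))  # union-find forest over the nodes

    def find(x):
        # Follow parent pointers to the representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for length, u, v in sorted(edges):      # scan edges by increasing length
        ru, rv = find(u), find(v)
        if ru != rv:                        # adding uv keeps the edge set a forest
            parent[ru] = rv
            tree.append((length, u, v))
    return tree


if __name__ == "__main__":
    example = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
    print(kruskal(4, example))
```

Scanning the edges once in sorted order gives the same result as repeatedly picking a shortest admissible edge, because an edge that closes a circuit at some stage closes one at every later stage as well.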

The graph structure plays little role in the algorithm; the only information
about the graph used is that “$\{e_1, \ldots, e_{k-1}, e\}$ is a forest”. In fact, this
observation is one of the possible routes to the notion of matroids.

Let S be a finite set, $c: S \to \mathbb{Z}_+$ a cost function on S, and $\mathcal{F} \subseteq 2^S$ a hereditary
family (independence system) of subsets of S, i.e., a set of subsets such that
$X \in \mathcal{F}$ and $Y \subseteq X$ implies $Y \in \mathcal{F}$. The goal is to find
$\max\{c(X) = \sum_{e \in X} c(e) \mid X \in \mathcal{F}\}$. In this more general setting, the greedy algorithm can still be easily
formulated. It constructs a maximal set X by finding its elements $e_1, e_2, \ldots$ one
by one, as follows:

• $e_1$ is an element of maximum cost in $\bigcup_{X \in \mathcal{F}} X$;
• $e_k$ is defined such that $e_k \notin \{e_1, \ldots, e_{k-1}\}$, $\{e_1, \ldots, e_k\} \in \mathcal{F}$ and

$$c(e_k) = \max\{c(e) \mid e \notin \{e_1, \ldots, e_{k-1}\} \text{ and } \{e_1, \ldots, e_{k-1}, e\} \in \mathcal{F}\}.$$


The algorithm terminates when no such element exists. We call a set
$X_{\mathrm{gr}} := \{e_1, \ldots, e_k\}$ obtained by this algorithm a greedy solution, and let $X_{\mathrm{opt}}$ denote an
optimum solution to our problem. In chapter 9 it is shown that the greedy
solution is optimal for every cost function if and only if $(S, \mathcal{F})$ is a matroid.
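
As an illustration, the general greedy algorithm can be sketched in Python when the hereditary family is given by an independence oracle. The names `greedy`, `cost` and `is_independent` are our own assumptions. Because the family is hereditary, an element that cannot be added at some stage can never be added later, so one pass over the elements in order of decreasing cost is equivalent to the step-by-step rule above.

```python
def greedy(elements, cost, is_independent):
    """Greedy solution for a hereditary family given by an oracle.

    elements       -- iterable of ground-set elements
    cost           -- function element -> non-negative number
    is_independent -- function list_of_elements -> bool (membership in the family)
    """
    chosen = []
    for e in sorted(elements, key=cost, reverse=True):
        if is_independent(chosen + [e]):   # the set stays in the family
            chosen.append(e)
    return chosen


if __name__ == "__main__":
    cost = {"a": 3, "b": 2, "c": 2}.get

    # A matroid: all sets with at most two elements. Greedy is optimal here.
    print(greedy("abc", cost, lambda X: len(X) <= 2))

    # Not a matroid: subsets of {a} and subsets of {b, c}.
    # Greedy takes a (cost 3), but {b, c} has cost 4.
    def not_matroid(X):
        return set(X) <= {"a"} or set(X) <= {"b", "c"}

    print(greedy("abc", cost, not_matroid))
```

The second family in the usage example is hereditary but not a matroid, and it shows the greedy solution falling short of the optimum.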

There are other problems where the greedy algorithm (with an appropriate 
interpretation) gives an optimum solution. The notion of greedoids (see chapter 9, 
or Korte et al. 1991) is an attempt to describe a general class of such problems. 
Further examples are polymatroids (see chapter 11), coloring of various classes of 
perfect graphs (see chapter 4) etc. In fact, an optimization model where the 
greedy solution is optimal was described by Monge in 1781! 


Greedy heuristic. But in most optimization problems, greed does not pay: the 
greedy solution is not optimal in general. We may still use, however, a greedy 
algorithm as a heuristic, to obtain a “reasonable” solution (and in practice it is
very often used indeed). 

We measure the quality of the heuristic by comparing the value of the objective 
function at the heuristic solution with its value at the optimum solution. To be 
precise, consider, say, a minimization problem. For convenience, let us assume 
that every instance has a positive optimum objective function value, $v_{\mathrm{opt}}$ say. For
a given instance, let $v_{\mathrm{heur}}$ denote the objective value achieved by a given heuristic
(if the heuristic includes free choices at certain points, we define $v_{\mathrm{heur}}$ as the value
achieved by the worst possible choices). We define the performance ratio of a
heuristic as the supremum of $v_{\mathrm{heur}}/v_{\mathrm{opt}}$ over all problem instances. The asymptotic
performance ratio is the limsup of this ratio, assuming that $v_{\mathrm{opt}} \to \infty$. For a
maximization problem, we replace this quotient by $v_{\mathrm{opt}}/v_{\mathrm{heur}}$.


Rank quotients. Consider a hereditary family $\mathcal{F} \subseteq 2^E$ and a weight function
$c: E \to \mathbb{Z}_+$ on E again. We want to find a maximum weight member of $\mathcal{F}$. Let the
greedy algorithm, just as above, give a set $X_{\mathrm{gr}}$, and let $X_{\mathrm{opt}}$ denote an optimum
solution.

Since we are maximizing, trivially

$$c(X_{\mathrm{gr}}) \le c(X_{\mathrm{opt}}).$$

Define, for $X \subseteq E$, the upper rank of X by

$$r(X) = \max\{|Y| : Y \subseteq X,\ Y \in \mathcal{F}\}.$$

We also define the lower rank of X by

$$\rho(X) = \min\{|Y| : Y \subseteq X,\ Y \in \mathcal{F},\ \nexists\, U \in \mathcal{F} \text{ with } Y \subset U \subseteq X\}.$$

Note that matroids are just those hereditary families with $r = \rho$. In general, define
the rank quotient of $(E, \mathcal{F})$ by

$$\gamma_{\mathcal{F}} = \min\left\{\frac{\rho(X)}{r(X)} : X \subseteq E,\ r(X) > 0\right\}.$$

Note that $\gamma_{\mathcal{F}}$ is just the worst ratio between $c(X_{\mathrm{gr}})$ and $c(X_{\mathrm{opt}})$ for 0-1 valued
weight functions c. The following theorem of Jenkyns (1976) and Korte and
Hausmann (1978) gives a performance guarantee for the greedy algorithm by
showing that 0-1 weightings are the worst case.


Theorem 2.1. For every hereditary family $(E, \mathcal{F})$ and every weight function
$c: E \to \mathbb{Z}_+$, we have

$$c(X_{\mathrm{gr}}) \ge \gamma_{\mathcal{F}}\, c(X_{\mathrm{opt}}).$$


As an application, consider the greedy heuristic for the matching problem.
Here G is a graph, E = E(G), and $\mathcal{F}$ consists of all matchings, i.e., sets of edges
with no common endnode. We claim that $\gamma_{\mathcal{F}} \ge \frac{1}{2}$. In fact, if $X \subseteq E$ and M is a
smallest non-extendible matching in X then the 2|M| endpoints of the edges in M
cover all edges in X, and hence a maximum matching in X cannot have more than
2|M| edges. Thus we obtain that for every weighting, the greedy algorithm for the
matching problem has performance ratio at most 2.
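
For very small ground sets the rank quotient can be computed directly from the definitions. The following brute-force Python sketch is meant only as an illustration (it enumerates all subsets, so it is exponential); the names `subsets`, `rank_quotient` and `is_matching` are ours, and we assume the empty set belongs to the family. For the matchings of a path with three edges it returns 1/2, the value attained when the middle edge alone blocks both outer edges.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield all subsets of items as frozensets."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def rank_quotient(E, is_independent):
    """Minimum of (lower rank)/(upper rank) over all X with upper rank > 0."""
    best = None
    for X in subsets(E):
        indep = [Y for Y in subsets(X) if is_independent(Y)]
        # Members of the family inside X that cannot be extended within X.
        maximal = [Y for Y in indep if not any(Y < U for U in indep)]
        r = max(len(Y) for Y in maximal)      # upper rank of X
        rho = min(len(Y) for Y in maximal)    # lower rank of X
        if r > 0:
            q = Fraction(rho, r)
            best = q if best is None else min(best, q)
    return best


def is_matching(edges):
    """True if no two edges share an endnode."""
    nodes = [v for e in edges for v in e]
    return len(nodes) == len(set(nodes))


if __name__ == "__main__":
    path = [("a", "b"), ("b", "c"), ("c", "d")]
    print(rank_quotient(path, is_matching))
```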


Greedy blocking sets. We discuss a greedy heuristic with a somewhat more
involved analysis. Let $(V, \mathcal{H})$ be a hypergraph. A blocking set or cover of the
hypergraph is a subset $S \subseteq V$ that meets (blocks) every member of $\mathcal{H}$. Let $S_{\mathrm{opt}}$
denote a blocking set with a minimum number of elements, and define the
covering (or blocking) number by $\tau(\mathcal{H}) := |S_{\mathrm{opt}}|$. (To compute $\tau(\mathcal{H})$ is NP-hard in
general; cf. also chapters 7 and 24.)

A greedy blocking set $S_{\mathrm{gr}}$ is constructed as follows. Choose a vertex $v_1$ of
maximum degree. If $v_1, \ldots, v_i$ have been selected, choose $v_{i+1}$ to block as many
members of $\mathcal{H}$ not blocked by $v_1, \ldots, v_i$ as possible. We stop when all members
of $\mathcal{H}$ are blocked.

Concerning the performance of this algorithm, we have the following bound
(Johnson 1974, Stein 1974, Lovász 1975).


Theorem 2.2. Let $\mathcal{H}$ be a hypergraph with maximum degree $\Delta$. Then for the size of
any greedy blocking set $S_{\mathrm{gr}}$,

$$|S_{\mathrm{gr}}| \le \left(1 + \frac{1}{2} + \cdots + \frac{1}{\Delta}\right) \tau(\mathcal{H}).$$

(The “error factor” on the right-hand side is less than $1 + \ln \Delta$. It is easy to see
that the ratio $1 + 1/2 + \cdots + 1/\Delta$ cannot be improved.)
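
A direct Python sketch of the greedy blocking set construction follows. It is an illustration under our own assumptions: the hypergraph is given as a list of non-empty sets of vertices, ties are broken arbitrarily, and the name `greedy_blocking_set` is ours.

```python
def greedy_blocking_set(edges):
    """Greedily choose vertices until every hyperedge is blocked.

    edges -- iterable of non-empty sets of vertices
    Returns the chosen vertices in the order of selection.
    """
    unblocked = [set(e) for e in edges]
    chosen = []
    while unblocked:
        # Count, for every vertex, how many still unblocked edges contain it.
        degree = {}
        for e in unblocked:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        best = max(degree, key=degree.get)   # blocks the most new edges
        chosen.append(best)
        unblocked = [e for e in unblocked if best not in e]
    return chosen


if __name__ == "__main__":
    H = [{1, 2}, {1, 3}, {1, 4}, {5, 6}]
    print(greedy_blocking_set(H))
```

In the usage example vertex 1 blocks three edges at once and is chosen first; one further vertex then blocks the remaining edge.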


Proof. Let $k_i$ denote the number of vertices in $S_{\mathrm{gr}}$ selected in the phase when we
were able to block exactly i new edges at a time. Let $\mathcal{H}_i$ denote the set of edges
first blocked in this phase. So $|\mathcal{H}_i| = i k_i$, and

$$\sum_{i=1}^{\Delta} k_i = |S_{\mathrm{gr}}|.$$

Now consider the optimum blocking set $S_{\mathrm{opt}}$. Every vertex in $S_{\mathrm{opt}}$ (in fact, every
vertex of $\mathcal{H}$) blocks at most i edges from $\mathcal{H}_i \cup \mathcal{H}_{i-1} \cup \cdots \cup \mathcal{H}_1$, since a vertex
blocking more from this set should have been included in the greedy blocking set
before phase i. Hence

$$|S_{\mathrm{opt}}| \ge \frac{1}{i}\bigl(i k_i + (i-1) k_{i-1} + \cdots + k_1\bigr). \qquad (1)$$

Multiplying this inequality by $1/(i+1)$ for $i = 1, \ldots, \Delta - 1$, and by 1 for $i = \Delta$,
and then adding up the resulting inequalities, we obtain that

$$\left(1 + \frac{1}{2} + \cdots + \frac{1}{\Delta}\right) |S_{\mathrm{opt}}| \ge k_\Delta + \cdots + k_1 = |S_{\mathrm{gr}}|. \qquad \Box$$

Note that in this proof, we do not directly compare $|S_{\mathrm{gr}}|$ with $|S_{\mathrm{opt}}|$; the latter is
not available. Rather, we use a family (1) of lower bounds on $|S_{\mathrm{opt}}|$. For each
i, $(1/i)(\mathcal{H}_i \cup \cdots \cup \mathcal{H}_1)$ can be viewed as a fractional matching (see chapters 7, 24),
and so the theorem can be sharpened by using the fractional matching (or cover)
number $\tau^*$ on the right-hand side. In fact, the fractional cover number $\tau^*$ is a
relaxation of the cover number $\tau$, obtained by formulating $\tau$ as the optimum value
of an integer linear program and dropping the integrality constraints.


Greedy travelling salesman tours. Recall the travelling salesman problem (TSP):
given a graph G = (V, E) on $n \ge 3$ nodes, and “distances” $c: E \to \mathbb{Z}_+$, find a
Hamilton circuit with minimum length. Usually it is not an essential restriction of
generality to assume that G is a complete graph. Moreover, in many important
applications, the distance function c satisfies the triangle inequality: $c_{ij} + c_{jk} \ge c_{ik}$
for any three distinct nodes i, j, and k. We shall always restrict ourselves to the
special case of the complete graph with lengths satisfying the triangle inequality
(called the metric case).

The travelling salesman problem is NP-hard. Linear programming provides a
practically quite efficient method to solve it (cf. section 8). Here we discuss two
greedy heuristics.

The first one, called NEAREST NEIGHBOR heuristic, is an obvious idea: 
Choose some arbitrary node and visit it; from the last node visited go to the 
closest not yet visited node; if all nodes have been visited, return to the first node. 
This heuristic does make locally good choices, but it may run into traps. Figure 
2.1 shows the result of a NEAREST NEIGHBOR run for a TSP consisting of 52 
points of interest in Berlin, starting at point 1, the Konrad-Zuse-Zentrum. It is 
clear that this is far from being optimal. In fact, a series of metric n-city TSP
instances can be constructed where the tour built by the NEAREST NEIGHBOR
heuristic is $\Omega(\log n)$ times as long as the optimum tour (Rosenkrantz et al. 1977).
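
The NEAREST NEIGHBOR rule is easy to sketch in Python for points in the plane with Euclidean distances (one metric case among many; the point representation and the names `nearest_neighbor_tour` and `tour_length` are our own assumptions). The small usage example shows a trap already on four points on a line.

```python
import math


def nearest_neighbor_tour(points, start=0):
    """Visit the closest not yet visited point until all points are visited."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour  # the tour closes by returning from tour[-1] to tour[0]


def tour_length(points, tour):
    """Length of the closed tour visiting the points in the given order."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
               for i in range(n))


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (-2, 0), (5, 0)]
    t = nearest_neighbor_tour(pts)
    # The heuristic zigzags 0 -> 1 -> -2 -> 5 -> 0, while sweeping once to the
    # right and once back to the left gives a shorter tour.
    print(t, tour_length(pts, t), tour_length(pts, [2, 0, 1, 3]))
```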

Figure 2.1. A nearest neighbor tour of a 52-city problem.

The second greedy heuristic, called NEAREST INSERTION, will never show
such a poor performance. It works as follows. We build up a circuit going through
more and more nodes.

• Start with any node $v_1$ (viewed as a circuit with one node).

• Let $T_k$ be a circuit of length k already constructed. Choose a node $v_{k+1}$ not on $T_k$
and a node $u_k$ on $T_k$ such that the distance $c_{u_k v_{k+1}}$ is minimal. Delete one of the
edges of $T_k$ incident with $u_k$, and connect its endpoints to $v_{k+1}$, to get a circuit
$T_{k+1}$.

• After n steps we get a Hamiltonian circuit $T_{\mathrm{nins}}$.

Another way to describe this heuristic is the following. The tree F formed by
the edges $u_k v_{k+1}$ is constructed by Prim’s algorithm (see, e.g., Cormen et al. 1990),
and so it is a shortest spanning tree of G. We double each edge of F to obtain an
Eulerian graph. An Euler tour of this visits every vertex at least once; making
shortcuts, we obtain a tour that visits all nodes exactly once.
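
The insertion steps above can be sketched as follows for points in the plane with Euclidean distances. This is a sketch under our own assumptions: the text leaves open which of the two circuit edges at the chosen circuit node is deleted, and here we take the one giving the smaller increase; the name `nearest_insertion` is ours.

```python
import math


def nearest_insertion(points, start=0):
    """Build a tour by repeatedly inserting the node closest to the circuit."""
    def d(i, j):
        return math.dist(points[i], points[j])

    tour = [start]
    outside = set(range(len(points))) - {start}
    while outside:
        # Closest pair (u on the circuit, v not on it).
        u, v = min(((u, v) for u in tour for v in outside),
                   key=lambda pair: d(*pair))
        i = tour.index(u)
        prev, nxt = tour[i - 1], tour[(i + 1) % len(tour)]
        # Replace edge (u, nxt) by u-v-nxt, or edge (prev, u) by prev-v-u,
        # whichever lengthens the circuit less.
        if d(v, nxt) - d(u, nxt) <= d(prev, v) - d(prev, u):
            tour.insert(i + 1, v)
        else:
            tour.insert(i, v)
        outside.remove(v)
    return tour


if __name__ == "__main__":
    pts = [(0, 0), (3, 0), (3, 1), (0, 2)]
    print(nearest_insertion(pts))
```

Either choice of the deleted edge increases the length by at most twice the distance between the circuit node and the inserted node, by the triangle inequality, so the variant chosen here stays within the scope of the analysis.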


Theorem 2.3. $T_{\mathrm{nins}}$ is at most twice as long as the optimum tour.


Proof. Let $T_{\mathrm{opt}}$ denote an optimum tour. We make two observations. First,
deleting any edge from $T_{\mathrm{opt}}$, we get a spanning tree. Hence

$$c(T_{\mathrm{opt}}) \ge c(F). \qquad (2)$$

Second, inserting $v_{k+1}$ into $T_k$ we increase the length of $T_k$ by at most $2c_{u_k v_{k+1}}$; adding up
these increments, we get

$$c(T_{\mathrm{nins}}) \le 2c(F). \qquad \Box$$


Note that this argument (implicitly) uses a dual heuristic also: inequality (2) 
makes use of the fact that the length of the shortest spanning tree is a lower 
bound on the length of the shortest tour. The value of this dual heuristic is easily 
found by the greedy algorithm. In fact, a somewhat better lower bound could be 
obtained by considering unicyclic subgraphs, i.e., those subgraphs containing at 
most one circuit. The edge-sets of such subgraphs also form a matroid. The bases 
of this matroid are connected spanning subgraphs containing exactly one circuit; 
for this chapter, we call such subgraphs 1-trees. A 1-tree with minimum length
can be found easily by the greedy algorithm, and since every tour is a 1-tree, this 
gives a lower bound on the minimum tour. We will see in section 9 how to further 
improve this lower bound. 

There are several other greedy-like heuristics for the travelling salesman 
problem, such as farthest insertion, cheapest insertion, sweep, savings etc. Many 
of them do not have, however, a proven constant performance ratio, although 
some of them work better in practice than the nearest insertion heuristic (see 
Reinelt 1994). 

The heuristic for the travelling salesman problem with the best known
performance ratio is due to Christofides (1976). It uses the shortest spanning tree F,
but instead of doubling its edges to make it Eulerian, it adds a matching M with
minimum length on the set of nodes with odd degree in F. (Such a matching can
be found in polynomial time, see chapter 3.) It is easy to see that $2c(M) \le c(T_{\mathrm{opt}})$,
and hence the Christofides heuristic has performance ratio of at most 3/2.


Bin packing. Let $a_1, \ldots, a_n \le 1$ be positive real numbers (“weights”). We would
like to partition them into classes (“bins”) $B_1, \ldots, B_k$ so that the total weight of
every bin is at most 1. Our aim is to minimize the number k of bins. Let $k_{\mathrm{opt}}$ be
this minimum. To compute $k_{\mathrm{opt}}$ is NP-hard.

A trivial lower bound on how well we can do is the roundup of the total weight
$w := \sum_i a_i$. The following simple (greedy) heuristic already gets asymptotically
within a factor of 2 of this lower bound.

NEXT-FIT HEURISTIC. We process $a_1, a_2, \ldots$ one by one. We put them into one
bin until putting the next $a_i$ into the bin would increase its weight over 1. Then we
close the bin and start a new one. We denote by $k_{\mathrm{nf}}$ the number of bins used by
this heuristic.
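
A minimal Python sketch of NEXT-FIT follows. The `capacity` parameter is our own addition (the text fixes the bin size at 1); it lets the example use integer weights and so avoids floating-point rounding in the comparisons.

```python
def next_fit(weights, capacity=1):
    """Pack weights in the given order; only the last bin is ever open."""
    bins = []
    for a in weights:
        if not bins or sum(bins[-1]) + a > capacity:
            bins.append([])        # close the current bin and start a new one
        bins[-1].append(a)
    return bins


if __name__ == "__main__":
    # Weights 0.5, 0.6, 0.5, 0.6 scaled by 10: NEXT-FIT needs four bins,
    # although the two weights 5 fit together and three bins suffice.
    print(next_fit([5, 6, 5, 6], capacity=10))
```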


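Theorem 2.4. $k_{\mathrm{nf}} < 2w + 1$; $k_{\mathrm{nf}} < 2k_{\mathrm{opt}}$.

Proof. Let $k := k_{\mathrm{nf}}$ and denote by $w_i$ the weight put in bin $B_i$ by the NEXT-FIT
heuristic ($1 \le i \le k$). Then clearly $w_i + w_{i+1} > 1$ (since otherwise the first weight in
$B_{i+1}$ should have been put in $B_i$). If k is even, this implies

$$w = (w_1 + w_2) + (w_3 + w_4) + \cdots + (w_{k-1} + w_k) > k/2,$$

and hence

$$k < 2w \le 2k_{\mathrm{opt}}.$$

If k is odd, we obtain

$$w = \tfrac{1}{2}\bigl(w_1 + (w_1 + w_2) + (w_2 + w_3) + \cdots + (w_{k-1} + w_k) + w_k\bigr) > \tfrac{1}{2}(k - 1 + w_1 + w_k) > \frac{k-1}{2},$$

whence

$$k < 2w + 1 \le 2k_{\mathrm{opt}} + 1,$$

and hence, k being odd, $k \le 2k_{\mathrm{opt}} - 1 < 2k_{\mathrm{opt}}$. $\Box$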


The following better heuristic is still very “greedy”.


FIRST-FIT HEURISTIC. We process $a_1, a_2, \ldots$ one by one, putting each $a_i$ into
the first bin into which it fits. We denote by $k_{\mathrm{ff}}$ the number of bins used by this
heuristic.


The following bound on the performance of FIRST FIT, due to Garey et al. 
(1976), is substantially more difficult to prove than the previous one. 


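Theorem 2.5. $k_{\mathrm{ff}} \le \lceil \frac{17}{10} k_{\mathrm{opt}} \rceil$. There exist lists with arbitrarily large total weight for
which $k_{\mathrm{ff}} \ge \lceil \frac{17}{10} k_{\mathrm{opt}} \rceil - 1$.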


A natural (still “greedy”) improvement on the FIRST-FIT heuristic is to
preprocess the weights by ordering them decreasingly, and then apply FIRST-FIT.
We call this the FIRST-FIT DECREASING heuristic, and denote the
number of bins it uses by $k_{\mathrm{ffd}}$. This preprocessing does improve the performance,
as shown by the following theorem of Johnson (1973).


Theorem 2.6. $k_{\mathrm{ffd}} \le \lceil \frac{11}{9} k_{\mathrm{opt}} \rceil + 4$. There exist lists with arbitrarily large total weight
for which $k_{\mathrm{ffd}} \ge \lceil \frac{11}{9} k_{\mathrm{opt}} \rceil$.
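
FIRST-FIT and FIRST-FIT DECREASING differ only in the preprocessing step, as the following Python sketch shows. The function names and the `capacity` parameter are our own assumptions (the text uses bins of size 1); integer weights are used in the example to keep the comparisons exact.

```python
def first_fit(weights, capacity=1):
    """Put each weight into the first bin into which it fits."""
    bins = []
    for a in weights:
        for b in bins:
            if sum(b) + a <= capacity:
                b.append(a)
                break
        else:                      # no existing bin can take a
            bins.append([a])
    return bins


def first_fit_decreasing(weights, capacity=1):
    """Sort the weights decreasingly, then apply FIRST-FIT."""
    return first_fit(sorted(weights, reverse=True), capacity)


if __name__ == "__main__":
    w = [4, 4, 4, 6, 6, 6]         # weights 0.4 and 0.6 scaled by 10
    print(len(first_fit(w, 10)), len(first_fit_decreasing(w, 10)))
```

On this list FIRST-FIT pairs two small weights early and ends up with four bins, while the decreasing order lets every large weight be completed by a small one, giving three bins.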

There are other greedy-like heuristics for bin packing, e.g., best fit, whose 
performance ratio is similar; the heuristic called harmonic fit is slightly better. 
(See Coffman et al. 1984 for a survey.) More involved heuristic algorithms for the 
bin packing problem use linear programming and achieve an asymptotic per- 
formance ratio arbitrarily close to 1; see section 8. 

The first two heuristics above are special in the sense that they are on-line: each 
weight is placed in a bin without knowing those that follow, and once a weight is 
placed in a bin, it is never touched again. On-line heuristics cannot achieve as 
good a performance ratio as general heuristics; it is easy to show that for any 
on-line heuristic for the bin packing problem there exists a sequence of weights 
(with arbitrarily large total weight) for which it uses at least $\frac{3}{2} k_{\mathrm{opt}}$ bins. This lower
bound can be improved to 1.54 (van Vliet 1992).


The knapsack problem. Given a knapsack with total capacity b, and n objects
with weights $a_1, \ldots, a_n$ and values $c_1, \ldots, c_n$, we want to pack as much value into
the knapsack as possible. In other words, given positive integers
$b, a_1, \ldots, a_n, c_1, \ldots, c_n$, we want to maximize $\sum_i c_i x_i$ subject to the constraints
$\sum_i a_i x_i \le b$ and $x_i \in \{0, 1\}$, $i = 1, \ldots, n$. To exclude trivial cases, we may assume
that $a_1, \ldots, a_n \le b$. We denote by $C_{\mathrm{opt}}$ the optimum value of the objective
function.

The knapsack problem is NP-hard; we will see in section 9, however, that one 
can get arbitrarily close to the optimum in polynomial time. Here we describe a 
greedy algorithm that has performance ratio at most 2. 

It is clear that we want to select objects with small weight and large value; so it
is natural to put in the knapsack an object with largest “weight density” $c_i/a_i$.
Assume that the objects are labelled so that $c_1/a_1 \ge c_2/a_2 \ge \cdots \ge c_n/a_n$. Then the
greedy algorithm consists of selecting $x_1, x_2, \ldots, x_n \in \{0, 1\}$ recursively as follows:

$$x_i = \begin{cases} 1, & \text{if } a_i \le b - \sum_{j<i} a_j x_j, \\ 0, & \text{otherwise.} \end{cases}$$


Theorem 2.7. For the solution $x_1, \ldots, x_n$ obtained by the (weight density) greedy
algorithm, we have

$$\sum_{i=1}^{n} c_i x_i \ge C_{\mathrm{opt}} - \max_i c_i.$$

It follows that comparing the greedy solution with the best of the trivial
solutions (select one object), we get a heuristic with performance ratio at most 2.
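
Both the weight density greedy algorithm and its combination with the best single object can be sketched in Python as follows. The function names are our own; as in the text, we assume positive integer data and that every single object fits into the knapsack.

```python
def greedy_knapsack(weights, values, b):
    """Weight density greedy: scan objects by decreasing value/weight."""
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    x = [0] * len(weights)
    remaining = b
    for i in order:
        if weights[i] <= remaining:    # object i still fits
            x[i] = 1
            remaining -= weights[i]
    return x


def greedy_or_best_single(weights, values, b):
    """Return the better of the greedy solution and the most valuable object."""
    x = greedy_knapsack(weights, values, b)
    greedy_value = sum(c for c, xi in zip(values, x) if xi)
    best = max(range(len(values)), key=lambda i: values[i])
    if values[best] > greedy_value:
        x = [int(i == best) for i in range(len(values))]
    return x


if __name__ == "__main__":
    # A light object with high density blocks the valuable heavy one.
    print(greedy_knapsack([1, 10], [2, 10], 10))
    print(greedy_or_best_single([1, 10], [2, 10], 10))
```

The usage example shows why the plain greedy algorithm alone has no constant performance ratio: it packs the object of value 2 and then cannot fit the object of value 10.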


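Proof. Let x be the solution found by the greedy algorithm, and y be an optimum
solution. If x is not optimal, there must be an index j such that $x_j = 0$ and $y_j = 1$;
consider the least such j. Then we have for the greedy solution,

$$\begin{aligned}
C_{\mathrm{gr}} := \sum_{i=1}^{n} c_i x_i &\ge \sum_{i=1}^{j} c_i x_i = \sum_{i=1}^{j} c_i y_i + \sum_{i=1}^{j} c_i (x_i - y_i) \\
&\ge \sum_{i=1}^{j} c_i y_i + \frac{c_j}{a_j} \sum_{i=1}^{j} a_i (x_i - y_i) = \sum_{i=1}^{j} c_i y_i + \frac{c_j}{a_j} \Bigl( \sum_{i=1}^{j} a_i x_i - \sum_{i=1}^{j} a_i y_i \Bigr). \qquad (3)
\end{aligned}$$

Since j was not chosen by the greedy algorithm, we have here

$$\sum_{i=1}^{j} a_i x_i \ge b - a_j.$$

Furthermore, the feasibility of y implies that

$$\sum_{i=1}^{j} a_i y_i \le b - \sum_{i=j+1}^{n} a_i y_i.$$

Substituting in (3), and using $c_i \le (c_j/a_j) a_i$ for $i > j$, we obtain

$$C_{\mathrm{gr}} \ge \sum_{i=1}^{j} c_i y_i + \frac{c_j}{a_j} \Bigl( \sum_{i=j+1}^{n} a_i y_i - a_j \Bigr) \ge \sum_{i=1}^{n} c_i y_i - c_j = C_{\mathrm{opt}} - c_j \ge C_{\mathrm{opt}} - \max_i c_i. \qquad \Box$$
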

3. Local improvement 


The greedy algorithm and its problem-specific versions belong to a class of 
algorithms sometimes called (one-pass) construction heuristics. A myopic rule is 
applied and former decisions are not reconsidered. The purpose of these 
heuristics is to find a “reasonable” feasible solution very fast. But, of course, a 
locally good choice may lead to a globally poor solution. 


Exchange heuristics for the travelling salesman problem. We have seen such 
unpleasant behavior in fig. 2.1: initially, short connections are chosen but, in the 
end, very long steps have to be made to connect the “forgotten” nodes. This 
picture obviously calls for a “repair” of the solution. For instance, replacing the 
edges from 12 to 28 and from 11 to 29 by the edges from 11 to 12 and 28 to 29 
results in considerable saving. Obviously, further improvements of this kind are 
possible. 

The 2-OPT heuristic formalizes this idea. It starts with some tour T, for 
instance a random tour or a tour obtained by a construction heuristic. Then it 
checks, for all pairs of non-adjacent edges uv, xy of T, whether the unique tour S
formed by deleting these two edges and adding two edges, say $S := (T \setminus \{uv, xy\}) \cup \{ux, vy\}$,
is shorter than T (this is the case, e.g., when the segments uv and xy
cross in the plane). If so, T is replaced by S and the exchange tests are repeated. 
Otherwise the heuristic stops. Figure 3.1 shows the result of 2-OPT started with 
the tour in fig. 2.1. 
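
A straightforward (and slow) Python sketch of 2-OPT for points in the plane is given below. The representation of a tour as a list of point indices, the Euclidean distances, the small tolerance against rounding errors and the name `two_opt` are our own assumptions. Deleting the tour edges uv and xy and adding ux and vy amounts to reversing the segment of the tour between v and x.

```python
import math


def two_opt(points, tour):
    """Apply improving 2-exchanges until none is left."""
    def d(i, j):
        return math.dist(points[i], points[j])

    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue           # these two edges share the node tour[0]
                u, v = tour[i], tour[i + 1]
                x, y = tour[j], tour[(j + 1) % n]
                if d(u, x) + d(v, y) < d(u, v) + d(x, y) - 1e-12:
                    # Replace uv, xy by ux, vy: reverse the segment v .. x.
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    print(two_opt(square, [0, 2, 1, 3]))   # the start tour uses both diagonals
```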

Figure 3.1. A 2-OPT tour of a 52-city problem.

There is an obvious generalization of this method: instead of removing two
edges, we delete r non-adjacent edges from T. Then we enumerate all possible
ways of adding r other edges such that these r edges together with the remaining r
paths form a tour. If the shortest of these tours is shorter than the present tour T
we replace T by this shorter tour and repeat. This method is called the r-OPT
heuristic.

These heuristics are prototypical local improvement techniques that, in general, 
work as follows. We have some feasible solution of the given combinatorial 
optimization problem. Then we do some little operations on this solution, such as 
removing some elements and adding other elements, to obtain one or several new 
solutions. If one of the new solutions is better than the present one, we replace 
the present one by the new best solution and repeat. 

The basic ingredient of such improvement or exchange heuristics is a rule that 
describes the possible manipulations that are allowed. This rule implicitly defines, 
for every feasible solution, a set of other feasible solutions that can be obtained 
by such a manipulation. Using this interpretation, we can define a digraph D 
whose vertex set is the set $\Omega$ of feasible solutions of our combinatorial
optimization problem (this is typically exponentially large) and where an arc
(S, T) is present if T can be obtained from S by the manipulation rule.

Now, local improvement heuristics can be viewed as algorithms that start at 
some node of this digraph and search, for a current node, the successors of the 
node. If they find a better successor they go to this successor and repeat. The 
term local search algorithms that is also used for these techniques derives from 
this interpretation. 

There are many important algorithms that fit in this scheme: the simplex
method (see chapter 30), basis reduction algorithms (chapter 20), etc. The
simplex method is somewhat special in the sense that it only gets stuck when we
have reached the optimum. Usually, local improvement algorithms may run into 
local optima, i.e., solutions from which the given local manipulations do not lead 
to any better solution. 


Maximum cuts. Let us look at another example. Suppose we are given a graph
G = (V, E) and we want to find a cut $\delta(W) = \{ij \in E : i \in W,\ j \notin W\}$, $W \subseteq V$, of
maximum size. (This is the cardinality max-cut problem for G.) The problem has a
natural weighted version, where each edge has a (say, rational) weight, and we
want to find a cut with maximum weight.

We start with an arbitrary subset $W \subseteq V$. We check whether W (or V\W)
contains a node w such that fewer than half of its neighbors are in V\W (or W, respectively).
If such a node exists we move it from W to V\W (or from V\W to W). Otherwise
we stop.

Termination of this single exchange heuristic in $O(n^2)$ time is guaranteed, since
the size of the cut increases in every step. The cut produced by this procedure 
obviously has a size that is at least one half of the maximum cardinality of a cut in 
G, and in fact, it can be as bad as this, as the complete bipartite graph shows. 
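
The single exchange heuristic for the cardinality version can be sketched in Python as follows. The adjacency-dictionary representation and the names `single_exchange_cut` and `cut_size` are our own assumptions. A node is moved exactly when more than half of its neighbors are on its own side, so every move enlarges the cut.

```python
def single_exchange_cut(adj, W=()):
    """Local search for a large cut.

    adj -- dict mapping each node to the set of its neighbors
    W   -- starting side of the cut (any subset of the nodes)
    Returns a side W for which no single node move enlarges the cut.
    """
    W = set(W)
    improved = True
    while improved:
        improved = False
        for v in adj:
            same = sum((u in W) == (v in W) for u in adj[v])
            if 2 * same > len(adj[v]):   # fewer than half the neighbors across
                W ^= {v}                 # move v to the other side
                improved = True
    return W


def cut_size(adj, W):
    """Number of edges with exactly one endnode in W."""
    return sum(1 for v in W for u in adj[v] if u not in W)


if __name__ == "__main__":
    cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    W = single_exchange_cut(cycle)
    print(W, cut_size(cycle, W))
```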

The single exchange heuristic has an obvious weighted version: we push a node 
w to the other side if the sum of the weights of edges linking w to nodes on the 
other side is smaller than the sum of the weights of the edges linking w to nodes 
on the side of w. This version, of course, also terminates in finite time, but the 
number of steps may be exponential (Haken and Luby 1988) even for 4-regular 
graphs. On the other hand Poljak (1993) proved that the single exchange heuristic 
for the weighted max-cut problem terminates in polynomial time for cubic graphs 
(see section 9). 

For a typical combinatorial optimization problem, it is easy to find many local
manipulation techniques. For instance, for the max-cut problem we could try to 
move several nodes from one side to the other or exchange nodes between sides; 
for the TSP we could perform node exchanges instead of edge exchanges, we 
could exchange whole sections of a tour, and we could combine these techniques, 
or we could vary the number of edges and/or nodes that we exchange based on 
some criteria. In fact, most people working on the TSP view the well-known 
heuristic of Lin and Kernighan (1973) and its variants as the best local 
improvement heuristics for the TSP known to date (based on the practical 
performance of heuristics on large numbers of TSP test instances). This heuristic 
is a dynamic version of the r-OPT heuristic with varying r. 

The last statement may suggest that improvement heuristics are the way to 
solve (large scale) TSP instances approximately. However, there are some 
practical and theoretical difficulties with respect to running times, traps and 
worst-case behavior. 


Running time of exchange heuristics. A straightforward implementation of the
r-OPT heuristic, for example, is hopelessly slow. It takes $\binom{n}{r}$ tests to check
whether a tour can be improved by an r-exchange. Even for r = 3, the heuristic
runs almost forever on medium size instances of a few thousand nodes. To make
this approach practical, a number of modifications limiting the exchanges 
considered are necessary. They are usually based on insights about the probability 
of success of certain exchanges, or on knowledge about special structures (an 
instance might be geometrical, e.g., given by points in the plane and a distance 
function). Well-designed fast data structures play an important role. The issue of 
speeding up TSP heuristics is treated in depth, e.g., in Johnson (1990), Bentley 
(1992), and Reinelt (1994). With these techniques TSP instances of up to a 
million cities have been solved approximately. Such observations apply to many 
other combinatorial optimization problems analogously. 

It may sound strange, but for many exchange heuristics there is no proof that 
these heuristics terminate in polynomial time. This is the case even with such a 
basic and classical algorithm as the simplex method! For certain natural pivoting 
rules we know that they may lead to exponentially many iterations, while for 
others, it is not known whether or not they terminate in polynomial time; no 
pivoting rule is known to terminate in polynomial time. As another, simpler 
example, we mention that although we could prove an $O(n^2)$ running time for the
cardinality version of the max-cut heuristic, its weighted version is not
polynomial, as mentioned above.

A single pass through the loop of the r-OPT heuristic for the TSP takes $O(n^r)$
time but it is not clear how to bound the number of tours that have to be
processed before the algorithm terminates with an r-OPT tour, i.e., a tour that
cannot be improved by an r-exchange. Computational experience, however,
shows that exchange heuristics usually do not have to inspect too many tours until
a “local optimum” is found.


Quality of the approximation. Although the solution quality of local improvement 
heuristics is often quite good, these heuristics may run into traps, for instance 
r-OPT tours, whose value is not even close to the optimum. It is, in general, 
rather difficult to prove worst-case bounds on the quality of exchange heuristics. 
For an example where a performance ratio is established, see the basis reduction 
algorithm in chapter 19. 

It is probably fair to say that, for the solution of combinatorial optimization 
problems appearing in practice, fast construction heuristics combined with local 
improvement techniques particularly designed for the special structures of the 
application are the real workhorses of combinatorial optimization. That is why
this machinery receives so much attention in the literature and why new little 
tricks or clever combinations of old tricks are discussed intensively. Better 
solution qualities or faster solution times may result in significant cost savings, in 
particular for complicated problems of large scale. 


Aiming at local optima. There are several examples when we are only interested
in finding a local optimum: it is the structure of the local optimum, and not the
value of the objective function, that concerns us. A theoretical framework for
such “polynomial local search problems” was developed by Johnson et al. (1988).
Analogously to NP, this class also has complete (hardest) problems; the weighted
local max-cut problem is one of these (Schaffer and Yannakakis 1991).

Consider an optimum solution W of the max-cut problem for a graph G =
(V, E). Clearly, every node is connected to at least as many points on the
opposite side of the cut as on its own side. If we only want a cut with this
property, any locally optimal cut (with respect to the single exchange heuristic)
would do.

Assume now that we want to solve the following more general problem: given
two functions $f, g: V \to \mathbb{Z}_+$ such that $f(v) + g(v) = d_G(v) - 1$ for all $v \in V$, find a
subset $W \subseteq V$ such that for each node v, the number of nodes adjacent to v on its
own side of the cut is at most

$$\begin{cases} f(v), & \text{if } v \in W, \\ g(v), & \text{if } v \in V \setminus W. \end{cases}$$

It is not difficult to guess an objective function (over cuts) for which such cuts are
exactly local optima:

$$\varphi(W) := |\delta(W)| + f(W) + g(V \setminus W).$$


Once this is found, it follows that a cut with the desired property exists, and also 
that it can be found in polynomial time by local improvement. A much more 
difficult, but in principle similar, application of this idea is the proof of 
Szemerédi’s regularity lemma (see chapter 23). Here again a tricky (quadratic)
objective function is set up, which is locally improved until a locally almost
optimal solution is found; the structure of such a solution is what is needed in the
numerous applications of the regularity lemma (see also Alon et al. 1994 for the
algorithmic aspects of this procedure).
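
For the cut problem with the bounds f and g above, it may help to write out how the objective function changes under a single move. The following short computation is our own sketch: $d_W(v)$ (the number of neighbors of v inside W) and $\Delta\varphi$ are notation introduced here for illustration, f(W) is read as the sum of f over W, and the condition $f(v) + g(v) = d_G(v) - 1$ is used as stated.

```latex
% Move a node v \in W to the other side, i.e. replace W by W \setminus \{v\}:
\begin{align*}
\Delta\varphi
  &= \underbrace{d_W(v) - \bigl(d_G(v) - d_W(v)\bigr)}_{\text{change of } |\delta(W)|}
     \;-\; f(v) \;+\; g(v) \\
  &= 2\,d_W(v) - d_G(v) - f(v) + \bigl(d_G(v) - 1 - f(v)\bigr) \\
  &= 2\bigl(d_W(v) - f(v)\bigr) - 1 .
\end{align*}
% All quantities are integers, so \Delta\varphi \le 0 iff d_W(v) \le f(v).
% Symmetrically, moving a node v \in V \setminus W into W changes \varphi by
%   2\bigl(d_{V \setminus W}(v) - g(v)\bigr) - 1 .
```

So no single move increases the objective precisely when every node has at most f(v), respectively g(v), neighbors on its own side. Since the objective is a non-negative integer bounded by $|E| + \sum_v (f(v) + g(v)) < 3|E|$ and each improving move raises it by at least 1, local improvement stops after fewer than 3|E| moves.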

A beautiful example of turning a structural question into an optimization 
problem is Tutte’s (1963) proof of the fact that every 3-connected planar graph 
has a planar embedding with straight edges and convex faces. He considers the 
edges as rubber bands, fixes the vertices of one face at the vertices of a convex 
polygon, and lets the remaining vertices find their equilibrium. This means 
minimizing a certain quadratic objective function (the energy), and the optimality 
criteria can be used to prove that this equilibrium state defines a planar 
embedding with the right properties. The algorithm can be used to actually
compute nice embeddings of planar graphs. A similar method for connectivity 
testing was given by Linial et al. (1988). 


Randomized exchange. A very helpful idea to overcome the problem of falling 
into a trap is to randomize. In a randomized version of local search, a random 
neighbor of the current feasible solution is selected. If this improves the objective 
function, the current feasible solution is replaced by this neighbor. If not, it may 
still be replaced, but only with some probability less than 1, depending on how
much worse the objective value at the neighbor is. This relaxation of the strict
descent rule may help to jump out of traps and eventually reach a significantly
better solution.

A more general way of looking at this method is to consider it as generating a 
random element in the set $\Omega$ of feasible solutions, from some given probability
distribution Q. Let $f: \Omega \to \mathbb{R}_+$ be the objective function; then maximizing f over
$\Omega$ is just the extreme case when we want to generate a random element from a
distribution concentrated on the set of optimum solutions. If, instead, we 
generate a random point w from the distribution Q in which Q(v) is proportional 
to exp(f(v)/T), where T is a very small positive number, then with large 
probability w will maximize f. In fact, a randomized algorithm that finds a 
solution that is (nearly) optimal with large probability is equivalent to a procedure 
of generating a random element from a distribution that is heavily concentrated 
on the (nearly) optimal solutions. 

To generate a random element from a distribution over a (large and
complicated) set is of course a much more general question, and is a major ingredient in
various algorithms for enumeration, integration, volume computation, simulation,
statistical sampling, etc. (see Jerrum et al. 1986, Dyer and Frieze 1992, Sinclair
and Jerrum 1988, Dyer et al. 1991, Lovász and Simonovits 1992 for some of the
applications with combinatorial flavor). An efficient general technique here is
random walks or Markov chains. Let $G = (\Omega, E)$ be a connected graph on $\Omega$, and
assume, for simplicity of presentation, that G is non-bipartite and d-regular. If we
start a random walk on G and follow it long enough, then the current point will
be almost uniformly distributed over $\Omega$. How many steps “long enough”
means depends on the spectrum, or in combinatorial terms, on global connectivity
properties called expansion rate or conductance, of the graph (see also chapter
31).


The Metropolis filter. In optimization, we are interested in very non-uniform, 
rather than uniform, distributions. Fortunately, there is an elegant way, called the 
Metropolis filter (Metropolis et al. 1953), to modify the random walk, so that it
gives any arbitrary prescribed probability distribution. Let F: Ω → ℝ_+. Assume
that we are at node v. We choose a random neighbor u. If F(u) ≥ F(v) then we
move to u; else, we flip a biased coin and move to u only with probability 
F(u)/F(v), and stay at v with probability 1 — F(u)/F(v). 

Let Q_F denote the probability distribution on Ω defined by the property that
Q_F(v) is proportional to F(v). The miraculous property of the Metropolis filter is
the following.


Theorem 3.1. The stationary distribution of the Metropolis-filtered random walk is
Q_F.
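
The theorem can be checked exactly on a small instance. The sketch below is only an illustration under our own assumptions: the graph is the 5-cycle (connected, non-bipartite, 2-regular), the weights F are arbitrary positive integers, and the function names are hypothetical. It builds the transition matrix of the filtered walk with exact rational arithmetic and tests whether the distribution proportional to F is left unchanged by one step.

```python
from fractions import Fraction

def metropolis_transition_matrix(neighbors, F):
    """Transition probabilities of the Metropolis-filtered walk on a d-regular graph.

    neighbors[v] lists the d neighbors of v; F[v] is a positive integer weight.
    """
    n = len(neighbors)
    P = [[Fraction(0)] * n for _ in range(n)]
    for v in range(n):
        d = len(neighbors[v])
        for u in neighbors[v]:
            # Propose u with probability 1/d, accept with probability min(1, F(u)/F(v)).
            accept = min(Fraction(1), Fraction(F[u], F[v]))
            P[v][u] += Fraction(1, d) * accept
        # Rejected proposals keep the walk at v.
        P[v][v] += 1 - sum(P[v])
    return P

def is_stationary(P, F):
    """Check exactly that q(v) = F(v) / sum(F) satisfies q P = q."""
    total = sum(F)
    q = [Fraction(x, total) for x in F]
    n = len(F)
    return all(sum(q[v] * P[v][u] for v in range(n)) == q[u] for u in range(n))

# Illustrative data (assumed): the 5-cycle with arbitrary positive weights.
cycle5 = [[(v - 1) % 5, (v + 1) % 5] for v in range(5)]
weights = [1, 4, 2, 8, 3]
P = metropolis_transition_matrix(cycle5, weights)
print(is_stationary(P, weights))
```

The reason behind the check is that F(v) times the probability of moving from v to a neighbor u equals min(F(u), F(v))/d, which is symmetric in u and v.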


So choosing F(v) = exp(f(v)/T), we get a randomized optimization algorithm. 
Unfortunately, the issue of how long one has to walk gets rather messy. The 
techniques to estimate the conductance of a Metropolis-filtered walk are not
general enough, although Applegate and Kannan (1990) have been able to apply
this technique to volume computation. 


Simulated annealing. Coming back to optimization, let us follow Kirkpatrick et
al. (1983), and call the elements of Ω states (of some physical system), 1/F(v) the
energy of the state, and T the temperature. In this language, we want to find a
state with minimum (or almost minimum) energy. A Metropolis-filtered random 
walk means letting the system get into a stationary state at the given temperature. 

The main, and not quite understood, issue is the choice of the temperature. If 
we choose T large, the quality of the solution is poor, i.e., the probability that it 
is close to being optimal is small. If we choose T small, then there will be barriers 
of very small probability (or, equivalently, with very large energy) between local 
optima, and it will take extremely long to get away from a local, but not global 
optimum. 

The technique of simulated annealing suggests starting with the temperature T
sufficiently large, so that the random walk with this parameter mixes fast. Then
we decrease T gradually. In each phase, the random walk starts from a
distribution which is already close to the limiting distribution, so there is hope
that the walk will mix fast. (A similar trick works quite well in integration and
volume computation; see Lovász and Simonovits 1992.) Theoretical and practical
experiments have revealed that it matters a lot how long we walk in a given phase 
(cooling schedule). 
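
The phase structure just described can be sketched as follows. This is a minimal illustration, not a recommended implementation: the geometric cooling schedule, all parameter values, the toy objective and the name `simulated_annealing` are our assumptions. The acceptance rule is the Metropolis filter for F(v) = exp(f(v)/T).

```python
import math
import random

def simulated_annealing(f, neighbors, start, t_start=10.0, t_end=0.05,
                        cooling=0.9, steps_per_phase=200, seed=0):
    """Maximize f by a Metropolis-filtered walk whose temperature T is lowered in phases."""
    rng = random.Random(seed)
    current = best = start
    T = t_start
    while T > t_end:
        for _ in range(steps_per_phase):        # one phase at fixed temperature
            u = rng.choice(neighbors(current))
            delta = f(u) - f(current)
            # Accept with probability min(1, F(u)/F(v)) where F = exp(f/T).
            if delta >= 0 or rng.random() < math.exp(delta / T):
                current = u
                if f(current) > f(best):
                    best = current
        T *= cooling                            # geometric cooling (an assumed schedule)
    return best

# Toy instance (assumed): states 0..59 on a cycle, objective with several local maxima.
n = 60
f = lambda v: math.sin(v / 3.0) * 5 + (v % 7)
neighbors = lambda v: [(v - 1) % n, (v + 1) % n]
best = simulated_annealing(f, neighbors, start=0)
print(best, f(best))
```

The parameters `steps_per_phase` and `cooling` together play the role of the cooling schedule; the result can be sensitive to both.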

There are many empirical studies with this method; see Johnson et al.
(1989, 1991) or Johnson (1990). There are also some general estimates on its 
performance (Holley and Stroock 1988, Holley et al. 1989). Examples of 
problems, in particular of the matching problem, are known where simulated 
annealing performs badly (Sasaki and Hajek 1988, Sasaki 1991, Jerrum 1992), 
and some positive results in the case of the matching problem are also known 
(Jerrum and Sinclair 1989). The conclusion that can be drawn at the moment is 
that simulated annealing is a potentially valuable tool (if one can find good 
cooling schedules and other parameters), but it is in no way a panacea as was 
claimed in some papers pioneering this topic.

Approaches named taboo search, threshold accept, evolution or genetic 
algorithms and others are further variants and enhancements of randomized local 
search. The above judgement of simulated annealing applies to them as well; see
Johnson (1990). 


4. Enumeration and branch-and-bound 


There are a number of interesting combinatorial optimization problems for which
beautiful polynomial time algorithms exist. We will explain some of them in
subsequent sections. We now address the issue of finding an optimum solution for
an NP-hard problem. In the previous two sections we have outlined heuristics that
produce some feasible and hopefully good solutions. Such a solution may even be
optimal right away. But how does one verify that? 

The basic trouble with integer programming and combinatorial optimization is 
the non-existence of a sensible duality theory. The duality theorem of linear 
programming (see chapter 30), for instance, can be used to prove that some given 
feasible solution is optimal. In the (rare) cases where duality theorems in integral 
solutions like the max-flow min-cut theorem (see chapter 2 or 30) or the 
Lucchesi-Younger theorem (see chapter 2) exist, one can usually derive a 
polynomial time solution algorithm. For NP-hard problems one should not expect 
to find such theorems. Unfortunately, nothing better is known than replacing such 
a theory by brute force. 

Trivial running time estimates reveal that the obvious idea of simply enumerat- 
ing the finitely many solutions of a combinatorial optimization problem is 
completely impractical in general. For instance, computing the length of all ½ · 15!
(≈ 0.6 · 10^12) tours of a 16-city TSP instance takes about 90 hours on a 30 MIPS
workstation. Even a teraflop computer will be unable to enumerate all solutions 
of a ridiculously small 30 city TSP instance within its lifetime. 
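
These figures can be rechecked with a few lines of arithmetic. The sketch assumes, generously, that a teraflop machine evaluates 10^12 tours per second; the function name `tsp_tour_count` is ours.

```python
import math

def tsp_tour_count(n):
    """Number of distinct tours of a symmetric TSP on n cities: (n-1)!/2."""
    return math.factorial(n - 1) // 2

tours_16 = tsp_tour_count(16)                   # 653837184000, about 0.65 * 10**12
rate = tours_16 / (90 * 3600)                   # tours per second implied by "90 hours"
tours_30 = tsp_tour_count(30)
years_at_1e12 = tours_30 / 1e12 / (3600 * 24 * 365)
print(tours_16, round(rate), f"{years_at_1e12:.2e}")
```

Under this assumption the 30-city enumeration would take on the order of 10^11 years.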

Unless P = NP, there is no hope that we will be able to design algorithms for 
NP-hard problems that are asymptotically much better than enumeration. 
However, we can try to bring problem instances of reasonable sizes (appearing in 
practice, say) into the realm of practical computability by enhancing enumeration 
with a few helpful ideas. 

The idea of the branch-and-bound approach is to compute tight upper and 
lower bounds on the optimum value in order to significantly reduce the number of 
enumerative steps. To be more specific, let us assume that we have an instance of 
a minimization problem. Let Ω be its set of feasible solutions.

To implement branch-and-bound, we need a dual heuristic (relaxation), i.e., an
efficiently computable lower bound on the optimum value. This dual heuristic will
also be called for certain subproblems. 

We first run some construction and improvement heuristics to obtain a good 
feasible solution, say T, with value c(T), which is an upper bound for the
optimum value c_opt.

Now we resort to enumeration. We split the problem into two (or more)
subproblems. Recursively solving these subproblems would mean straightforward
enumeration. We can gain by maintaining the best solution found so far and
computing, whenever we have a subproblem, a lower bound for the optimum
value of this subproblem, using the dual heuristic. If this value is larger than the
value of the best solution found so far, we do not have to solve this subproblem. 

To be more specific, let us discuss a bit the two main ingredients, branching and 
bounding. 

We assume that the splitting into subproblems (the branching) is such that the
set Ω of feasible solutions is partitioned into the sets Ω′ and Ω″ of feasible
solutions of the subproblems. We also assume that the subproblems are of the
same type (e.g., the dual heuristic applies to them). It is also important that the
branching step requires little bookkeeping and is computationally cheap. If Ω
consists of subsets of a set S, then a typical split is to choose an element e ∈ S and
to set

$$
\Omega' := \{ I \in \Omega \mid e \in I \}, \qquad \Omega'' := \{ I \in \Omega \mid e \notin I \}.
$$


The bounding is usually provided by a relaxation: by problem-specific
investigations we introduce a new problem, whose set of feasible solutions is Γ, say,
such that Ω ⊆ Γ and the objective function for Ω extends to Γ. Suppose X
minimizes the objective function over Γ; then the value c(X) provides a lower
bound for c_opt, since all elements of Ω participated in the minimization process. If
X is, in fact, an element of Ω we clearly have found an optimum solution of Ω.
(If X ∉ Ω, we may still be able to make use of it by applying a construction
and/or improvement heuristic that starts with X and ends with a solution Y, say,
in Ω. If c(Y) < c(T) we set T := Y to keep track of our current best solution.)

To give some examples, useful relaxations for the symmetric TSP are perfect 
2-matchings (unions of disjoint circuits that cover all nodes) or 1-trees (cf. sections 
2 and 9). For the asymmetric TSP a standard relaxation is obtained by considering 
unions of directed circuits that cover all nodes (which can be easily reduced to a
bipartite perfect matching problem), or the r-arborescence problem. 

A particularly powerful method is based on LP-relaxations. This is covered in
depth in section 8. There are some general methods to improve relaxations; one 
technique is called Lagrangian relaxation and will be discussed in section 9. 

Returning to the algorithm, we maintain a list of unsolved subproblems, and a
solution T that is the current best. In the general step, we choose an unsolved
subproblem, say Ω′, from the list and remove it. We optimize the objective
function over the relaxation Γ′ of Ω′. Let X be an optimum solution. There are
several possibilities.
• First, X may be feasible for Ω′. In this case we have found an optimum over Ω′
  and can completely eliminate all elements of Ω′ from the enumeration process.
  One often says that this branch is fathomed. If c(X) < c(T), we reset T := X
  also.
• Second, if c(X) ≥ c(T) then no solution in Γ′ and hence no solution in Ω′ has a
  value that is smaller than the current champion. Hence this branch is also
  fathomed and we can eliminate all solutions in Ω′.
• Third, if c(X) < c(T) (and X is not feasible for Ω′), we have done the
  computation in vain. (We may still try to make use of X to obtain a solution
  better than T as above.) We split Ω′ into two or more pieces, and put these on
  our list of unsolved subproblems.

The branch-and-bound method terminates when the list of subproblems is 
empty. The iteratively updated solution T is the optimum solution. Termination
is, of course, guaranteed if the set Ω of feasible solutions is finite.
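
To make the general step concrete, here is a minimal sketch on a problem of our own choosing (it is not one discussed in the text): select a minimum-cost set of items whose sizes add up to at least a given demand. The relaxation allows fractional items and is solved greedily by cost/size ratio; branching fixes the single fractional item in or out, as in the typical split described above. All names and data are illustrative assumptions; sizes are assumed positive and costs non-negative.

```python
import math

def lp_relaxation(costs, sizes, demand, fixed_in, fixed_out):
    """Greedy optimum of the LP relaxation of a subproblem.

    Returns (bound, chosen, frac): the lower bound, the items taken fully,
    and the single fractional item (None if the LP optimum is integral).
    """
    value = sum(costs[i] for i in fixed_in)
    need = demand - sum(sizes[i] for i in fixed_in)
    chosen = set(fixed_in)
    free = [i for i in range(len(costs)) if i not in fixed_in and i not in fixed_out]
    for i in sorted(free, key=lambda i: costs[i] / sizes[i]):
        if need <= 0:
            break
        if sizes[i] <= need:                    # take item i completely
            chosen.add(i)
            value += costs[i]
            need -= sizes[i]
        else:                                   # take only the fraction still needed
            return value + costs[i] * need / sizes[i], chosen, i
    if need > 0:
        return math.inf, None, None             # subproblem is infeasible
    return value, chosen, None

def branch_and_bound(costs, sizes, demand):
    """Minimize the total cost of a set of items whose sizes sum to at least demand."""
    if sum(sizes) < demand:
        return None
    best_val, best_set = sum(costs), set(range(len(costs)))   # trivial primal heuristic
    todo = [(frozenset(), frozenset())]         # list of unsolved subproblems
    while todo:
        fixed_in, fixed_out = todo.pop()
        bound, chosen, frac = lp_relaxation(costs, sizes, demand, fixed_in, fixed_out)
        if bound >= best_val:
            continue                            # fathomed: cannot beat the champion
        if frac is None:
            best_val, best_set = bound, chosen  # relaxation optimum is feasible
            continue
        todo.append((fixed_in | {frac}, fixed_out))   # branch: item frac is in I
        todo.append((fixed_in, fixed_out | {frac}))   # branch: item frac is not in I
    return best_val, best_set

costs, sizes, demand = [4, 3, 6, 5, 2], [3, 2, 5, 4, 1], 7
value, chosen = branch_and_bound(costs, sizes, demand)
print(value, sorted(chosen))
```

The three tests inside the loop correspond to the three possibilities listed above, checked in a slightly different order: the bound test comes first, so a feasible relaxation optimum that does not improve the champion is discarded together with the second case.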

Although the global procedure is mathematically trivial, it is a considerable 
piece of work to make it computationally effective. The efficiency mainly depends 
on the quality of the lower bound used. Most of the mathematics that is 
developed for the solution of hard problems is concerned with the invention of
better and better relaxations, with their structural properties and with fast
algorithms for their solution. 


5. Dynamic programming 


Dynamic programming is a general technique for optimum decision making. It 
was originally developed (Bellman 1957) for the solution of discrete-time 
sequential decision processes. The process starts at a given initial state. At any 
time of the process we are in some state and there is a set of states that are 
reachable from the present state. We have to choose one of these. Every state has 
a value and our objective is to maximize the value of the terminating state. Such 
an optimization problem is called a dynamic program. 

Virtually any optimization problem can be modeled by a dynamic program. 
There is a recursive solution for dynamic programs which, however, is not 
efficient in general. But the dynamic programming model and this recursion can 
be used to design fast algorithms in cases where the number of states can be 
controlled. We illustrate this by means of a few examples.


The subset-sum problem. Given positive integers a_1, a_2, ..., a_n, b, decide
whether there exist indices 1 ≤ i_1 < i_2 < ... < i_k ≤ n for some k such that
a_{i_1} + ... + a_{i_k} = b. This problem, called the subset-sum problem, is
NP-complete; however, there is a pseudopolynomial algorithm to solve it.

First, consider an obvious algorithm using enumeration. Clearly the subset-sum 
problem has a solution for a given input (a_1, ..., a_n, b), if and only if it has a
solution either for (a_1, ..., a_{n-1}, b − a_n) or for (a_1, ..., a_{n-1}, b). So an instance
of the subset-sum problem for n numbers can be reduced to two subproblems
with n − 1 numbers each. Building up a search tree based on this observation
yields an O(2^n) algorithm (which basically enumerates all subsets of the a_i).

But looking at this tree more carefully, we see that, at least if b is small
compared with 2^n, it has a crucial property: the same subproblem occurs on many
branches! In fact, there are only nb distinct subproblems altogether: for each
c ≤ b and each m ≤ n, the subset-sum problem with input a_1, ..., a_m, c. Imagine
that the branches of the search tree “grow together” if the same subproblem
occurs: we get a “search digraph” D. The nodes of this acyclic digraph are
labelled with pairs (c, m), and there is an edge from (c, m) to (c′, m − 1) if either
c′ = c or c′ = c − a_m. The subset-sum problem is solvable if and only if there is a
dipath from (b, n) to (0, 0). Such a dipath can be found (if it exists) in O(bn) time
by searching D either from (0, 0) or from (b, n).
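
The search in D can be written as a table computation. The following sketch searches from (0, 0) and then walks back from (b, n) to recover one solution; the function name and the sample numbers are our own illustrative choices.

```python
def subset_sum(a, b):
    """Return 1-based indices of numbers in a that sum to b, or None if impossible."""
    n = len(a)
    # reachable[m][c] is True iff some subset of a_1..a_m sums to c,
    # i.e. iff node (c, m) of the search digraph has a dipath to (0, 0).
    reachable = [[False] * (b + 1) for _ in range(n + 1)]
    reachable[0][0] = True
    for m in range(1, n + 1):
        for c in range(b + 1):
            reachable[m][c] = reachable[m - 1][c] or (
                c >= a[m - 1] and reachable[m - 1][c - a[m - 1]])
    if not reachable[n][b]:
        return None
    # Walk the dipath from (b, n) back to (0, 0) to recover the indices.
    chosen, c = [], b
    for m in range(n, 0, -1):
        if not reachable[m - 1][c]:       # the edge with c' = c - a_m must be used
            chosen.append(m)
            c -= a[m - 1]
    return chosen[::-1]

numbers = [3, 34, 4, 12, 5, 2]
print(subset_sum(numbers, 9))
```

The table has (n + 1)(b + 1) entries and each is filled in constant time, which gives the O(bn) bound.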

Along the same lines, one can devise an algorithm for the knapsack problem 
with running time polynomial in b + Σ_i ⟨c_i⟩.


Minimal triangulation of a convex polygon. Given a convex polygon P with n
vertices in the plane, we want to find a triangulation with minimal total edge
length. (The length c_ij of each edge ij is known.)

If the vertices of P are numbered consecutively 1 through n, take edge 1n and
consider the vertex i with which it forms a triangle in the triangulation. For a
given i, it suffices to find optimal triangulations of the two polygons with vertices
1, ..., i and i, ..., n, respectively, which can be done independently; see fig. 5.1.
So we have produced 2(n − 2) subproblems.

If we are not careful, repeating this process could lead to exponentially many 
distinct subproblems. But note that if we choose the triangle containing the edge 
1i to cut the polygon 1, ..., i, then we get two subproblems corresponding to
convex polygons having only one edge that is not an edge of P. 

In general, given two vertices i and j with i ≤ j − 2, let f(i, j) denote the
minimum total length of diagonals triangulating the polygon with vertices (i, i +
1, ..., j). Then clearly f(i, i + 2) = 0 and

$$
f(i,j) = \min\Bigl\{ \min_{i+2 \le k \le j-2} \bigl( f(i,k) + f(k,j) + c_{ik} + c_{kj} \bigr),\;
f(i+1,j) + c_{i+1,j},\; f(i,j-1) + c_{i,j-1} \Bigr\}. \tag{4}
$$

The answer to the original question is f(1, n).

We can represent the computation by a “search digraph” whose nodes
correspond to all the polygons with vertex set (i, i + 1, ..., j), where 1 ≤ i, j ≤ n,
i ≤ j − 2. We set f(i, j) = 0 if j = i + 2, and can use (4) recursively if j > i + 2.
There are O(n^2) subproblems to solve, and each recursive step takes O(n) time.
So we get an O(n^3) algorithm.
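
A memoized version of this recursion might look as follows. This is a sketch under our own assumptions: the polygon is given by coordinates in consecutive order, the edge lengths c_ij are Euclidean distances, and the function name is hypothetical. The third case of the minimum (a vertex k with i + 2 ≤ k ≤ j − 2, both ik and kj being diagonals) is handled by the list comprehension.

```python
import math
from functools import lru_cache

def min_triangulation(points):
    """Minimum total diagonal length of a triangulation of a convex polygon.

    points: vertices in consecutive order; vertex i of the text is points[i - 1].
    """
    n = len(points)

    def c(i, j):
        return math.dist(points[i - 1], points[j - 1])

    @lru_cache(maxsize=None)
    def f(i, j):
        if j - i <= 2:                     # a triangle needs no diagonal
            return 0.0
        # The triangle on edge ij uses vertex i+1, vertex j-1, or an inner vertex k.
        candidates = [f(i + 1, j) + c(i + 1, j), f(i, j - 1) + c(i, j - 1)]
        candidates += [f(i, k) + f(k, j) + c(i, k) + c(k, j)
                       for k in range(i + 2, j - 1)]
        return min(candidates)

    return f(1, n)

# Illustrative polygon (assumed): a rhombus whose two diagonals have lengths 2 and 4.
print(min_triangulation([(0, 0), (1, -2), (2, 0), (1, 2)]))
```

The cache holds one value per pair (i, j), and each value is computed from O(n) candidates.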


Figure 5.1. Optimum triangulation of a convex polygon.


Steiner trees in planar graphs. Let G = (V, E) be a graph with edge lengths c_e > 0,
and let T ⊆ V be a set of “terminals”. A Steiner tree in G is a subtree of G that
contains all nodes of T. The Steiner tree problem is the task of finding a shortest
Steiner tree. This problem is NP-hard in general, even for planar graphs. But in
the case of a planar graph when all the terminal nodes are on one face, say on the
outer face C, a shortest Steiner tree can be found by dynamic programming as
follows.

Let us first look at a minimum Steiner tree B. Pick any node v of B and let B′
be the union of some branches of B that are rooted at v and that are, in addition,
consecutive in the natural cyclic order of the edges leaving v. Let T′ := T ∩
V(B′). We observe the following (see fig. 5.2, where v is represented by a black
circle):
• There is a path P ⊆ C whose endnodes are terminals such that T′ = V(P) ∩ T.
• B′ is a minimum length Steiner tree with respect to the terminal set T′ ∪ {v}.
• v is on the outer face of the subgraph B′ ∪ P.

These observations motivate the following dynamic program for the solution of 
our Steiner tree problem. For every path P ⊆ C whose end nodes are terminals
and every node v ∈ V, we determine a shortest Steiner tree B′ with respect to the
set of terminal nodes (V(P) ∩ T) ∪ {v} with the additional requirement that v is
on the outer face of B′ ∪ P.

If P consists of just one terminal then such a Steiner tree can be found by a 
shortest path calculation. 

Suppose that we have solved this subproblem for all nodes v ∈ V and all paths
P ⊆ C containing at most k terminal nodes. To solve the subproblem for some
node v ∈ V and a path P ⊆ C containing k + 1 terminal nodes, we do the
following. Let t_1, ..., t_{k+1} be the terminals contained in P in the natural order.
For every node w ∈ V and every two subpaths P_1, P_2 of P, where P_1 connects t_1 to
t_j and P_2 connects t_{j+1} to t_{k+1}, 1 ≤ j ≤ k, we solve the subproblems for w and P_1
and for w and P_2 to get two trees B_1 and B_2. We also compute a shortest path Q
from w to v. Among all the sets B_1 ∪ B_2 ∪ Q computed this way we choose the
one with minimum length. This is an optimum solution of our subproblem for v
and P.


Figure 5.2. Steiner tree in planar graphs.

To get a minimum length Steiner tree, consider a path P ⊆ C that contains all
terminal nodes and choose the shortest among all solutions of subproblems for v
and P with v ∈ V.

This algorithm is due to Erickson et al. (1987) and is based on ideas of Dreyfus 
and Wagner. It can be extended to the case when all terminals are on a fixed
number of faces. 

There are many other non-trivial applications of the idea of dynamic programming;
for example, Chvátal and Klincsek (1980) use it to design a polynomial
time algorithm that finds a maximum cardinality subset of a set of n points in the
plane that forms the vertices of a convex polygon (cf. chapter 17, section 7.2). 


Optimization on tree-like graphs. There are many NP-hard problems that are easy 
if the underlying graph is a tree. Consider the stable set problem in a tree T. We 
fix a root r and, for every node x, we consider the subtree T_x consisting of x and
its descendants. Starting with the leaves, we compute, for each node x, two
numbers: the maximum number of independent nodes in T_x and in T_x − x. If
these numbers are available for every son of x, then it takes only O(d(x)) time to
find them for x. Once we know them for T_r, we are done. So a maximum stable
set can be found in linear time.
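
A minimal sketch of this computation, assuming the rooted tree is given as a dictionary mapping each node to the list of its sons (the representation and the function name are our choices):

```python
def max_stable_set_size(children, root):
    """Size of a maximum stable set in a rooted tree.

    children[x] lists the sons of x. For every node x two numbers are kept:
    best[x]    = maximum number of independent nodes in T_x,
    without[x] = maximum number of independent nodes in T_x - x.
    """
    # List the nodes so that every node appears after its father.
    order, stack = [], [root]
    while stack:
        x = stack.pop()
        order.append(x)
        stack.extend(children[x])
    best, without = {}, {}
    for x in reversed(order):               # sons are processed before their father
        without[x] = sum(best[s] for s in children[x])
        take_x = 1 + sum(without[s] for s in children[x])
        best[x] = max(without[x], take_x)
    return best[root]

# Illustrative trees (assumed): a path on 5 nodes and a star with 4 leaves.
path = {0: [1], 1: [2], 2: [3], 3: [4], 4: []}
star = {0: [1, 2, 3, 4], 1: [], 2: [], 3: [], 4: []}
print(max_stable_set_size(path, 0), max_stable_set_size(star, 0))
```

Each node is handled once and the work at x is proportional to its number of sons, so the total running time is linear in the size of the tree.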

Similar algorithms can be designed for more general “tree-like” graphs, e.g.,
series-parallel graphs. A general framework for “tree-like” decompositions,
developed by Robertson and Seymour in their “Graph minors” theory, leads to 
very general dynamic programming algorithms on graphs with bounded tree- 
width. See chapter 5 for the definition of tree-width and for examples of such 
algorithms. 


6. Augmenting paths 


In local search algorithms, we try to find very simple “local” improvements on the
current solution. There are more sophisticated improvement techniques that 
change the current solution globally, usually along paths or systems of paths. 
These methods are often called augmenting path techniques. Since they occur 
throughout this Handbook, we refrain from describing any of them here; let it 
suffice to quote the most important applications of the method of alternating 
paths: maximum flows and packing of paths, chapter 2; maximum matchings 
(weighted and unweighted), maximum stable sets in claw-free graphs, chapter 3;
edge-coloring, chapter 4; matroid intersection, matroid matching, and submodular
flows, chapter 11.


7. Uncrossing 


Uncrossing is a technique to simplify complicated set-systems, while maintaining 
certain key properties. One can find many applications of the uncrossing 
procedure in this Handbook as a proof technique; it is applied in the theory of 
graph connectivity and flows (chapter 2), matchings (chapter 3, or Lovász and
Plummer 1986), and matroids (chapters 11, 30). It is worth pointing out,
however, that uncrossing can be viewed as an algorithmic tool that constructs,
from a complicated dual solution, a dual solution with a tree-like structure. This 
way it is sometimes possible to derive an optimum integral dual solution from an 
optimum fractional dual solution. 

As an illustration, consider the problem of finding a maximum family of rooted 
cuts in a digraph G = (V, A) with root r such that every arc a occurs in at most l_a
of these cuts, where the l_a ≥ 0 are given integer values (“lengths”). (A rooted cut,
or r-cut, is the set of arcs entering S, i.e., with tail in V\S and head in S, for some
non-empty S ⊆ V, r ∉ S; cf. chapter 30.) Assume that we have a fractional
packing, i.e., a family ℱ of r-cuts and a weight w_D ≥ 0 for every D ∈ ℱ such that
Σ_{D ∋ a} w_D ≤ l_a for every arc a (ellipsoidal or interior point methods, as well as
averaging procedures, may yield such “fractional solutions”). As a consequence
of Fulkerson’s optimum arborescence theorem (chapter 30, Theorem 5.7), we
know that there exists an integer solution with the same value, i.e., a family of at
least Σ_D w_D r-cuts with the prescribed property. But how to find this?

For each r-cut D in the digraph G, we denote by S(D) a set S ⊆ V\{r} such
that D is the set of edges entering S. Let ℋ = {S(D) : D ∈ ℱ}. Call two r-cuts D_1
and D_2 intersecting if all three sets S(D_1) ∩ S(D_2), S(D_1)\S(D_2) and S(D_2)\S(D_1)
are non-empty. Assume that ℱ contains two intersecting cuts D_1 and D_2, and let
D′ and D″ denote the r-cuts defined by S(D_1) ∩ S(D_2) and S(D_1) ∪ S(D_2),
respectively.

Decrease w_{D_1} and w_{D_2} by ε and increase w_{D′} and w_{D″} by ε, where ε :=
min{w_{D_1}, w_{D_2}} (if, say, D′ does not belong to ℱ, then we add it to ℱ with
w_{D′} = 0). It is easy to check that this yields a new fractional packing with the
same total weight. The family ℱ lost one member (one of D_1 and D_2), and gained
at most two new members (D′ and D″). If the new family contains two
intersecting cuts, then we “uncross” them as above. It can be shown that the 
procedure terminates in a polynomial number of steps (see Hurkens et al. 1988 
for a discussion of this). 

When the uncrossing procedure terminates, the family ℋ is nested, i.e., there
are no intersecting pairs of cuts in ℱ. Such a family has a tree structure; ℋ can be
obtained by selecting disjoint subsets of V\{r}, then disjoint subsets in these
subsets, etc. It is not difficult to see that the number of members of ℋ is at most
2|V| − 3.

Choose D ∈ ℱ such that w_D is not an integer and S(D) is minimal. There is a
unique cut D′ ∈ ℱ such that S(D′) ⊃ S(D) and S(D′) is minimal. Add ε to w_D
and subtract ε from w_{D′}, where ε := min{⌈w_D⌉ − w_D, w_{D′}}. It is easy to check
using the integrality of l that this results in a fractional packing with the same
value, and now either w_D is an integer or w_{D′} = 0. After at most 2|V| − 3
repetitions of this shift, we get a fractional packing with all weights integral, which
trivially gives the family as required.


8. Linear programming methods 


A very successful way to solve combinatorial optimization problems is to translate 
them into optimization problems for polyhedra and utilize linear programming 
techniques. The theoretical background of this approach is surveyed in chapter 30 
where also many examples of the application of this method are provided. We will 
concentrate here on the implementation of the linear programming approach to 
practical problem solving, and on the use of linear programming in heuristics. 

We will assume that we have a combinatorial optimization problem with linear 
objective function like the travelling salesman, the max-cut, the stable set, or the 
matching problem. Let us also assume that we want to find a feasible solution of 
maximum weight. Typically, an instance of such a problem is given by a ground 
set E, an objective function c: E → ℝ and a set ℐ ⊆ 2^E of feasible solutions such
as the set of tours, of cuts, of stable sets, or matchings of a graph. We transform ℐ
into a set of geometric objects by defining, for each I ∈ ℐ, a vector χ^I ∈ ℝ^E with
χ^I_e = 1 if e ∈ I and χ^I_e = 0 if e ∉ I. The vector χ^I is called the incidence (or
characteristic) vector of I. Now we set

$$
P(\mathcal{I}) := \operatorname{conv}\{ \chi^I \in \mathbb{R}^E \mid I \in \mathcal{I} \},
$$


i.e., we consider a polytope whose vertices are precisely the incidence vectors of 
the feasible solutions of our problem instance. Solving our combinatorial
optimization problem is thus equivalent to finding an optimum vertex solution for
the following linear program

$$
\max\; c^T x, \quad x \in P(\mathcal{I}). \tag{5}
$$


However, (5) is only a linear program “in principle” since the usual LP-codes 
require the polyhedra to be given by a system of linear equations and inequalities. 
Classical results of Weyl and Minkowski ensure that a set given as the convex hull 
of finitely many points has a representation by means of linear equations and 
inequalities (and vice versa). But it is by no means simple to find, for a 
polyhedron given in one of these representations, a complete description in the 
other way. By problem specific investigations one can often find classes of valid 
and even facet-defining inequalities that partially describe the polyhedra of 
interest. (Many of the known examples are described in chapter 30.) 

What does “finding” a class mean? The typical situation in polyhedral 
combinatorics is the following. A class of inequalities contains a number of
inequalities that is exponential in |E|. It is well-described in the sense that we can
(at least) decide in polynomial time whether a given inequality, for a given
instance, belongs to the class. This is a minimal requirement; we need more for a
class to be really useful (see separation below). Also, the class should contain
strong inequalities; best is when most of the inequalities define facets of P(ℐ).
Except for special cases, one such class (or even a bounded number of such
classes) will not provide a complete description, i.e., P(ℐ) is strictly contained in
the set of solutions of all these inequalities.


The cut polytope. To give an example, let us discuss the max-cut problem. In this
case a graph G = (V, E) is given (for convenience we will assume that G is
simple), and we are interested in the convex hull of all incidence vectors of cuts in
G, i.e.,

$$
\mathrm{CUT}(G) := \operatorname{conv}\{ \chi^{\delta(W)} \in \mathbb{R}^E \mid W \subseteq V \}
$$

(here δ(W) denotes the cut induced by W, i.e., the set of edges with exactly one
endnode in W).

This polytope has dimension |E|. For any edge e ∈ E, the two trivial inequalities
0 ≤ x_e ≤ 1 define a facet of CUT(G) if and only if e is not contained in a triangle.
About some other classes of facets, we quote the following results of Barahona 
and Mahjoub (1986). 


Theorem 8.1. Let G = (V, E) be a graph.
(a) For every cycle C ⊆ E and every set F ⊆ C, |F| odd, the odd cycle inequality

$$
x(F) - x(C \setminus F) := \sum_{e \in F} x_e - \sum_{e \in C \setminus F} x_e \le |F| - 1
$$

is valid for CUT(G); it defines a facet of CUT(G) if and only if C has no chord.
(b) For every complete subgraph K_p = (W, F) of G, the K_p-inequality

$$
x(F) \le \left\lfloor \frac{p}{2} \right\rfloor \left\lceil \frac{p}{2} \right\rceil
$$

is valid for CUT(G); it defines a facet of CUT(G) if and only if p is odd.


Applications of the max-cut problem arise in statistical mechanics (finding 
ground states of spin glasses, see chapter 37) and VLSI design (via minimization). 
Both applications are covered in Barahona et al. (1988). But the max-cut problem 
comes up also in many other fields. Structural insights from different angles 
resulted in the discovery of many further (and very large) classes of valid and 
facet-defining inequalities. Studies on the embeddability of finite metric spaces, 
for instance, lead to the class of hypermetric inequalities; there are the classes of 
clique-web, suspended tree, circulant, path-block-cycle, and other inequalities. A 
comprehensive survey of this line of research can be found in Deza and Laurent 
(1991); cf. also chapter 41, section 2. 

There are a few special cases where it is known that some of the classes of
inequalities suffice for a characterization of CUT(G). For example, setting

$$
P_C(G) := \{ x \in \mathbb{R}^E \mid 0 \le x_e \le 1 \text{ for all } e \in E,\;
x(F) - x(C \setminus F) \le |F| - 1 \text{ for all cycles } C \subseteq E \text{ and all } F \subseteq C,\ |F| \text{ odd} \}, \tag{6}
$$

Barahona and Mahjoub showed that CUT(G) = P_C(G) holds if and only if G is
not contractible to the complete graph K_5. But for a general graph G, the union
of all the known classes of inequalities does not provide a complete description of
CUT(G) at all.


Separation. Let us review at this point what has been achieved by this polyhedral 
approach. We started out with a polytope P(ℐ) and found classes of inequalities
A_1 x ≤ b_1, A_2 x ≤ b_2, ..., A_k x ≤ b_k, say, such that all inequalities are valid and
many are facet-defining for P(ℐ). The classes are huge and thus we are unable to use
linear programming in the conventional way by inputting all constraints. Moreover,
even if we could solve the LPs, it is not clear whether the results provide
helpful information for the solution of our combinatorial problem. Although the
situation looks rather bad at this point, we have made a significant step towards
solving hard combinatorial optimization problems in practice. We will now outline 
why. 

A major issue is to figure out how one can solve linear programs of the form

$$
\begin{aligned}
\text{maximize}\quad & c^T x \\
\text{subject to}\quad & A_1 x \le b_1 \\
& \qquad \vdots \\
& A_k x \le b_k
\end{aligned} \tag{7}
$$

where some of the matrices A_i have a number of rows that is exponential in |E|,
and are only implicitly given to us. To formulate the answer to this question we
introduce the following problem.


Separation problem. Let Ax ≤ b be an inequality system and y a vector;
determine whether y satisfies all inequalities, and if not, find an inequality
violated by y. 


Suppose now that we have a class 𝒜 of inequality systems Ax ≤ b. (Example:
Consider the class consisting of all odd cycle inequalities for CUT(G), for each
graph G.) For each system Ax ≤ b, let φ := min(⟨a_i⟩ + ⟨β_i⟩), where the minimum
is taken over all rows a_i x ≤ β_i of the system. We say that the optimization
problem for 𝒜 can be solved in polynomial time if, for any system Ax ≤ b of 𝒜
and any vector c, the linear program max{c^T x | Ax ≤ b} can be solved in time
polynomial in φ + ⟨c⟩, and we say that the separation problem for 𝒜 can be
solved in polynomial time if, for any system Ax ≤ b of 𝒜 and any vector y, the
separation problem for Ax ≤ b and y can be solved in time polynomial in φ and
⟨y⟩.


Theorem 8.2. Let 𝒜 be a class of inequality systems. Then the optimization
problem for 𝒜 is solvable in polynomial time if and only if the separation problem
for 𝒜 is solvable in polynomial time.


For a proof and extensions of this result, see Grötschel et al. (1988).

The idea now is to develop polynomial time separation algorithms for the 
inequality systems A_1 x ≤ b_1, ..., A_k x ≤ b_k in (7). It turns out that this task often
gives rise to new and interesting combinatorial problems and that, for many hard 
combinatorial optimization problems, there are large systems of valid inequalities 
that can be separated in polynomial time. 

We use the max-cut problem again to show how separation algorithms, i.e.,
algorithms that solve the separation problem, can be designed. We thus assume
that a graph G = (V, E) is given and that we have a vector y ∈ ℝ^E, 0 ≤ y_e ≤ 1 for
all e ∈ E. We want to check whether y satisfies the inequalities described in
Theorem 8.1. 

To solve the separation problem for the odd cycle inequalities of Theorem 8.1(a)
in polynomial time, we define a new graph
$H = (V' \cup V'', E' \cup E'' \cup E''')$ that consists of two copies of G, say
$G' = (V', E')$ and $G'' = (V'', E'')$, and the following additional edges
$E'''$. For each edge $uv \in E$ we create the two edges $u'v''$ and $u''v'$.
The edges $u'v' \in E'$ and $u''v'' \in E''$ are assigned the weight $y_{uv}$,
while the edges $u'v'', u''v' \in E'''$ are assigned the weight $1 - y_{uv}$.
For each pair of nodes $u' \in V'$, $u'' \in V''$ that are copies of the same
node $u \in V$, we calculate a shortest (with respect to the weights just
defined) $(u', u'')$-path in $H$. Such a path contains an odd number of edges
of $E'''$ and corresponds to a closed walk in G containing u. Clearly, if the
shortest of these $(u', u'')$-paths in $H$ has length less than 1, there exists
a cycle $C \subseteq E$ and an edge set $F \subseteq C$, $|F|$ odd, such that y
violates the corresponding odd cycle inequality. (C and F are easily constructed
from a shortest path.) If the shortest of these $(u', u'')$-paths has length at
least 1, then y satisfies all these inequalities (see Barahona and Mahjoub 1986).
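
The construction above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of Barahona and Mahjoub: the function name `separate_odd_cycle`, the encoding of the two copies of node v as 2v and 2v+1, and the tolerance `eps` are choices made here. The sketch also assumes that the odd cycle inequalities have the usual form $x(F) - x(C \setminus F) \le |F| - 1$, so that a path of length less than 1 certifies a violation.

```python
import heapq

def separate_odd_cycle(n, edges, y, eps=1e-9):
    """Search for a violated odd cycle inequality (hypothetical helper).

    n     -- number of nodes, labelled 0..n-1
    edges -- list of pairs (u, v)
    y     -- list of values y_e in [0, 1], one per edge
    Returns (length, walk) if some (u', u'')-path is shorter than 1, where
    walk lists (edge index, crossed) pairs; otherwise returns None.
    """
    # Node v of the copy G' is 2*v, node v of the copy G'' is 2*v + 1.
    adj = [[] for _ in range(2 * n)]
    for idx, (u, v) in enumerate(edges):
        for side in (0, 1):
            for a, b in ((u, v), (v, u)):
                # Edge inside one copy: weight y_e.
                adj[2 * a + side].append((2 * b + side, y[idx], idx, False))
                # Edge between the two copies: weight 1 - y_e.
                adj[2 * a + side].append(
                    (2 * b + 1 - side, 1.0 - y[idx], idx, True))

    best = None
    for u in range(n):
        source, target = 2 * u, 2 * u + 1
        dist = [float("inf")] * (2 * n)
        pred = [None] * (2 * n)
        dist[source] = 0.0
        heap = [(0.0, source)]
        while heap:  # Dijkstra; all weights are non-negative
            d, a = heapq.heappop(heap)
            if d > dist[a]:
                continue
            for b, w, idx, crossed in adj[a]:
                if d + w < dist[b]:
                    dist[b] = d + w
                    pred[b] = (a, idx, crossed)
                    heapq.heappush(heap, (dist[b], b))
        if dist[target] < 1.0 - eps and (best is None or dist[target] < best[0]):
            walk, node = [], target
            while node != source:
                node, idx, crossed = pred[node]
                walk.append((idx, crossed))
            best = (dist[target], walk[::-1])
    return best
```

The edges marked as crossed play the role of F, and the walk as a whole plays the role of C. Since the result is a closed walk rather than a cycle, a violated cycle inequality still has to be extracted from it, as the text indicates.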

Trivially, for p fixed, one can check all $K_p$-inequalities in polynomial time
by enumeration, but it is not known whether there is a polynomial time algorithm
to solve the separation problem for all complete subgraph inequalities of
Theorem 8.1(b). In this case one has to resort to separation heuristics, i.e.,
algorithms that try to produce violated inequalities but that are not guaranteed
to find one if one exists.

It is a simple matter to show that the integral vectors in the polytope P,(G), 
see (7), are exactly the incidence vectors of the cuts of G. This shows that every 
integral solution of the linear program 


$$
\begin{array}{ll}
\text{maximize} & c^{T}x \\
\text{(i)} & 0 \le x \le 1 \\
\text{(ii)} & x \text{ satisfies all odd cycle inequalities of Theorem 8.1(a)} \\
\text{(iii)} & x \text{ satisfies all } K_p\text{-inequalities of Theorem 8.1(b)}
\end{array}
\tag{8}
$$


is the incidence vector of a cut of G. In particular, the optimum value of (8) (or
any subsystem thereof) provides an upper bound for the maximum weight of a
cut.

Theorem 8.2 and the exact separation routine for odd cycle inequalities 
outlined above show that the linear program (8) (without system (iii)) can be 
solved in polynomial time. So an LP-relaxation of the max-cut problem can be 
solved in polynomial time that contains (in general) exponentially many 
inequalities facet-defining for the cut polytope CUT(G). The question now is 
whether this technique is practical and whether it will help solve max-cut and 
other hard combinatorial optimization problems. 


Outline of a standard cutting plane algorithm. Theorem 8.2 is based on the 
ellipsoid method. Although the algorithm that proves 8.2 is polynomial, it is not 
fast enough for practical problem solving. To make this approach usable in 
practice one replaces the ellipsoid method by the simplex method and enhances it 
with a number of additional devices. We will sketch the issues coming up here. We 
assume that, by theoretical analysis, we have found an LP-relaxation such as (7) 
of our combinatorial optimization problem. 


The initial linear program. We group the inequalities of (7) such that the system
$A_1 x \le b_1$ is not too large and contains those inequalities that we feel
should be part of our LP initially.

This selection is a matter of choice. In the max-cut problem, for instance, one
would clearly select the trivial inequalities $0 \le x \le 1$. For the large
classes of the other inequalities, the choice is not apparent. One may select
some inequalities based on heuristic procedures. In the case of the travelling
salesman problem, see (9.10) of chapter 30 and section 2 of this chapter, in
addition to the trivial inequalities, the degree constraints
$x(\delta(v)) = 2$ for all $v \in V$ are self-suggesting. For the packing
problem of Steiner trees (a problem coming up in VLSI routing), for example, a
structural analysis of the nets to be routed on the chip was used in
Grötschel et al. (1992a) to generate “reasonable” initial inequalities. This
selection helped to increase the lower bound significantly in the early
iterations and to speed up the overall running time.


Initial variables. For large combinatorial optimization problems the number of 
variables of the LP-relaxation may be tremendous. A helpful trick is to restrict 
the LPs to “promising” variables that are chosen heuristically. Of course in the 
end, this planned error has to be repaired. We will show later how this is done. 
For the travelling salesman problem, for instance, a typical choice is the 2 to
10 nearest neighbors of any node and the edges of several heuristically
generated tours. For a 3000 city TSP instance, the number of variables of the
initial LP can be restricted from about 4.5 million to less than 10 thousand
this way; see Applegate et al. (1994), Grötschel and Holland (1991), and
Padberg and Rinaldi (1991) for descriptions of variable reduction strategies
for the TSP.

There are further preprocessing techniques that, depending on the type of
problem and the special structure of the instances, can be applied. These
techniques are vital in many cases to achieve satisfactory running times in
practice. Particularly important are techniques for structurally reducing instance
sizes by decomposition, for detecting logical dependencies, implicit relations, and
bounds that can be used to eliminate variables or forget certain constraints
forever. For space reasons we are unable to outline all this here.


Cutting plane generation. The core of a cutting plane procedure is of course the 
identification of violated inequalities. Assume that we have made our choice of 
initial constraints and have solved the initial linear program $\max\{c^{T}x \mid A_1 x \le b_1\}$.
In further iterations we may have added additional constraints so that the current 
linear program has the form 


$$
\begin{array}{ll}
\text{maximize} & c^{T}x \\
\text{subject to} & Ax \le b.
\end{array}
$$


We solve this LP and suppose that y is an optimum solution. If y is the incidence 
vector of a feasible solution of our combinatorial problem we are done. Otherwise 
we want to check whether y satisfies all the constraints in
$A_1 x \le b_1, \dots, A_k x \le b_k$. We may check certain small classes by
substituting y into all inequalities. But,
in general, we will run all the separation routines (exact and heuristic) that we 
have, to find as many inequalities violated by y as possible. It is a very good idea 
to use several different heuristics even for classes of inequalities for which exact 
separation algorithms are available. The reason is that exact routines typically find 
only a few violated constraints (the most violated ones), while separation 
heuristics often come up with many more and differently structured constraints. 

To keep the linear programs small one also removes constraints, for
instance those that are not tight at the present solution. It is sometimes helpful
to keep these in a “pool” since an optimum solution of a later iteration might
violate them again, and scanning the pool might be computationally cheaper than
running elaborate separation routines (see Padberg and Rinaldi 1991).

In the initial phase of a cutting plane procedure the separation routines may
actually produce thousands of violated constraints. It is then necessary to select
“good ones” heuristically, again, to keep the LPs at a manageable size; see
Grötschel et al. (1984) for this issue.

Another interesting issue is the order in which exact separation routines and
heuristics are called. Although that may not seem to be important, running time
factors of 10 or more may be saved by choosing a suitable order for these and
by adopting strategies for when to give up calling certain separation
heuristics. An account of this matter is given in Barahona et al. (1988).

There are more aspects that have to be considered, but we are unable to cover
all these topics here. It is important to note that, at the present state of the
art, there are still no clear rules as to which of these issues are important or
almost irrelevant for a given combinatorial optimization problem and its LP
relaxation. Many computational experiments with data from practical instances
of realistic sizes are necessary to obtain the right combination of methods and
“tricks”.


Pricing variables. In the cutting plane procedure we have now iteratively called
the cutting plane generation methods, added violated inequalities, dropped a few
constraints and repeated this process until we either found an optimum integral
solution or stopped with an optimum fractional solution y for which no violated
constraint could be found by our separation routines. Now we have to consider
the “forgotten variables”. This is easy. For every initially discarded variable we
generate the column corresponding to the present linear constraint system and 
compute its reduced costs by standard LP techniques. If all reduced costs come 
out with the correct sign we have shown that the present solution is also optimum 
for the system consisting of all variables. If this is not the case we add all (or 
some, if there are too many) of the variables with the wrong sign to our current 
LP and repeat the cutting plane procedure. In fact, using reduced cost criteria one 
can also show that some variables can be dropped because they can provably 
never appear in any optimum solution or that some variables can be fixed to a 
certain value. 
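
The pricing step can be sketched as follows. The sketch assumes a particular sign convention that the text leaves open: the current LP is taken to be of the form max $c^{T}x$, $Ax \le b$, $x \ge 0$, so that a discarded variable has the "wrong sign" exactly when its reduced cost is positive. The function name `price_out` and its arguments are hypothetical and not taken from any LP solver.

```python
def price_out(duals, columns, costs, tol=1e-9):
    """Return (index, reduced cost) for discarded variables worth adding.

    Assumes the current LP is  max c^T x, Ax <= b, x >= 0  and that
    `duals` holds an optimum dual solution, one value per current row.
    columns[j] is the column of discarded variable j restricted to the
    current rows, and costs[j] is its objective coefficient.
    """
    candidates = []
    for j, (column, cost) in enumerate(zip(columns, costs)):
        # Reduced cost: c_j minus the dual value of the column.
        reduced = cost - sum(pi * a for pi, a in zip(duals, column))
        if reduced > tol:  # "wrong sign" for a maximization problem
            candidates.append((j, reduced))
    # Most attractive variables first, in case only some of them are added.
    candidates.sort(key=lambda item: -item[1])
    return candidates
```

If the returned list is empty, the present solution is also optimum for the LP with all variables; otherwise some or all of the listed variables are added and the cutting plane procedure is repeated.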


Branch-and-cut. This process of iteratively adding and dropping constraints and 
variables may have to be repeated several times before an optimum solution y of 
the full LP is found. However, for large instances this technique is by far superior 
to the straightforward method of considering everything at once. If the final 
solution y is integral, our combinatorial optimization problem is solved. If it is not 
integral, we have to resort to branch-and-bound, see section 4. There are various 
ways to mix cutting plane generation with branching, to use fractional LP- 
solutions for generating integral solutions heuristically, etc. It has thus become 
popular to call the whole approach described here branch-and-cut. 

Clearly, this tremendous theoretical and implementational effort pays off only if
the bounds for the optimum solution value obtained this way are very good.
Computational experience has shown that, in many cases, this is indeed so. We
refer the interested readers to more in-depth surveys on this topic such as
Grötschel and Padberg (1985), Padberg and Grötschel (1985), and Jünger et al.
(1995) for the TSP, or to papers describing the design and implementation of a
cutting plane algorithm for a certain practically relevant, hard combinatorial
optimization problem. These papers treat many of the issues and little details we
were unable to cover here. Among these papers are: Applegate et al. (1994),
Grötschel and Holland (1991), Padberg and Rinaldi (1991) for the TSP (the most
spectacular success of the cutting plane technique has certainly been achieved
here); Barahona et al. (1988) for the max-cut problem with applications to ground
states in spin glasses and via minimization in VLSI design; Grötschel et al. (1984)
for the linear ordering problem with applications to triangulation of input-output
matrices and ranking in sports; Hoffman and Padberg (1993) for the set
partitioning problem with applications to airline crew scheduling; Grötschel and
Wakabayashi (1989) for the clique partitioning problem with applications to
clustering in biology and the social sciences; Grötschel et al. (1992b) for certain
connectivity problems with applications to the design of survivable
telecommunication networks; Grötschel et al. (1992a) for the Steiner tree packing
problem with applications to routing in VLSI design.

The LP-solvers used in most of these cases are advanced implementations of
the simplex algorithm such as Bixby’s CPLEX or IBM’s OSL. Investigations of
the use of interior point methods in such a framework are under way. A number
of important issues like addition of rows and columns and postoptimality analysis,
warm starts, etc. are not satisfactorily solved yet. But combinations of the two
approaches may yield the LP-solver of the future for this approach.


Linear programming in heuristics. So far, we have used linear programming as a 
dual heuristic, to obtain upper bounds on (say) the maximum value of an integer 
program. But solving the linear relaxation of an optimization problem also
provides primal information, in the sense that it can be used to obtain a (primal) 
heuristic solution. 

Of course, solving the linear relaxation of an integer linear program, we may be 
lucky and get an integer solution right away. Even if this does not happen, we
may find that some of the variables are integral in the optimum solution of the 
linear relaxation, and we may try to fix these variables at these integral values. 
This is in general not justified; a notable special case when this can be done was 
found by Nemhauser and Trotter (1974), who proved the following. Consider a 
graph G =(V, E) and the usual integer linear programming formulation of the 
stable set problem: 


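$$
\begin{array}{lll}
\text{maximize} & \sum_{i \in V} x_i & \\
\text{subject to} & x_i \ge 0 & (i \in V) \\
 & x_i + x_j \le 1 & (ij \in E) \\
 & x \text{ integral}. &
\end{array}
\tag{9}
$$


Let $x^*$ be an optimum solution of the linear relaxation of this problem. Then
there exists an optimum solution $x^{**}$ of the integer program such that
$x^{**}_i = x^*_i$ for all $i$ for which $x^*_i$ is an integer.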

In general, we can obtain a heuristic primal solution by fixing those variables 
that are integral in the optimum solution of the linear relaxation, and rounding 
the remaining variables “appropriately”. Properties of this heuristic were studied 
in detail by Raghavan and Thompson (1987). However, it seems that this natural 
and widely used scheme for a heuristic is not sufficiently analyzed. Here we 
discuss some results where linear programming combined with appropriate 
rounding procedures gives a provably good primal heuristic. 


A polynomial approximation scheme for bin packing. The following polynomial
time bin packing heuristic, due to Fernandez de la Vega and Lueker (1981), has
asymptotic performance ratio $1 + \varepsilon$, where $\varepsilon > 0$ is any
fixed number. A more refined application of this idea gives a heuristic that
packs the weights into $k_{opt} + O(\log^2 k_{opt})$ bins (Karmarkar and Karp
1982).

First, we solve the following restricted version. We are given integers
$k, m > 0$, weights $1/k \le a_1 < \dots < a_m \le 1$ and a multiplicity $n_j$
for each weight $a_j$. Let $k_{opt}$ be the minimum number of bins into which
$n_j$ copies of $a_j$ ($j = 1, \dots, m$) can be packed. Then we can pack the
weights into $k_{opt} + m$ bins, in time polynomial in $n$ and
$\binom{m+k}{k}$.

To obtain such a packing, let us first generate all possible combinations (with
repetition) of the given weights that fit into a single bin. Since each weight is
at least $1/k$, such a combination has at most $k$ elements, and hence the
number of different combinations is at most $\binom{m+k}{k}$, and they can be
found by brute force. Let $T_1, \dots, T_N$ be these combinations. Each $T_i$
can be described by an integer vector $(t_{i1}, t_{i2}, \dots, t_{im})$, where
$t_{ij}$ is the number of times weight $a_j$ appears in combination $T_i$.
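
The brute-force generation of the combinations can be sketched as follows. The function name `bin_combinations`, the bin capacity parameter and the floating point tolerance `tol` are assumptions made for this illustration; the weights are assumed to be strictly positive, which holds here because each weight is at least $1/k$.

```python
def bin_combinations(weights, capacity=1.0, tol=1e-12):
    """Enumerate all non-empty multisets of `weights` that fit in one bin.

    Each result is a tuple (t_1, ..., t_m) meaning t_j copies of weights[j].
    All weights must be positive, otherwise the enumeration does not stop.
    """
    m = len(weights)
    result = []

    def extend(j, counts, load):
        if j == m:
            if any(counts):
                result.append(tuple(counts))
            return
        t = 0
        # Try 0, 1, 2, ... copies of weight j while the bin is not overfull.
        while load + t * weights[j] <= capacity + tol:
            counts[j] = t
            extend(j + 1, counts, load + t * weights[j])
            t += 1
        counts[j] = 0

    extend(0, [0] * m, 0.0)
    return result
```

For the weights 0.5 and 0.3, for instance, the call `bin_combinations([0.5, 0.3])` lists vectors such as `(2, 0)` and `(1, 1)`, each of which becomes one variable of the linear program.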

Consider the following linear program:

$$
\begin{array}{lll}
\text{minimize} & \sum_{i=1}^{N} y_i & \\
\text{subject to} & y_i \ge 0 & (i = 1, \dots, N), \\
 & \sum_{i=1}^{N} t_{ij} y_i \ge n_j & (j = 1, \dots, m).
\end{array}
\tag{10}
$$

Let $Y$ denote the optimum value. Every packing of the given weights into bins
gives rise to an (integral) solution of this linear program ($y_i$ is the number
of times combination $T_i$ is used), hence

$$ k_{opt} \ge Y. $$

On the other hand, let $y^*$ be an optimum basic solution of (10), and consider
$\lceil y^*_i \rceil$ bins packed with combination $T_i$. Since at most $m$ of
the $y^*_i$ are non-zero, we get a total of
$\sum_i \lceil y^*_i \rceil < Y + m \le k_{opt} + m$ bins, which clearly
accommodate the whole list.


Now let $0 < x_1 \le x_2 \le \dots \le x_n \le 1$ be an arbitrary list $L$ of
weights, and let $0 < \varepsilon < 1$ be also given. Set $w := \sum_i x_i$, and
define $l$ by $x_l < \varepsilon/2 \le x_{l+1}$ (set $l := 0$ if
$x_1 \ge \varepsilon/2$). Set $a_j := x_{l+jh}$ ($j = 1, \dots, m$), where
$h := \lceil \varepsilon w \rceil$ and $m := \lfloor (n-l)/h \rfloor$.

Consider a list $L'$ consisting of $h$ copies of each $a_j$ and $n - l - hm$
copies of 1. Let $k'_{opt}$ be the minimum number of bins into which $L'$ can be
packed; by the solution of the restricted problem described above, we can pack
$L'$ into $k'_{opt} + m$ bins in polynomial time. Trivially, we get from this a
packing of the weights $x_{l+1}, \dots, x_n$ into $k'_{opt} + m$ bins. The
remaining (small) weights $x_1, \dots, x_l$ are packed by FIRST-FIT into the
slacks of these bins and, if necessary, into new bins.

To compare the number $k_{heur}$ of bins used this way with $k_{opt}$, we
distinguish two cases. If, in the last step of FIRST-FIT, we had to open a new
bin, then every bin (except possibly the last one) is filled up to
$1 - \varepsilon/2$, and hence

$$ k_{heur} \le 1 + \frac{1}{1 - \varepsilon/2}\, w. $$

Since $w$ is a trivial lower bound on $k_{opt}$, this shows that

$$ k_{heur} \le (1 + \varepsilon) k_{opt} + 1. $$

So assume that we do not open a new bin in the last phase, and hence we use at
most $k'_{opt} + m$ bins. To compare this with $k_{opt}$, consider an optimum
packing of $L$. Then from the list $L'$, the $h$ copies of $a_1$ can be put in
the place of $x_{l+h+1}, \dots, x_{l+2h}$, the $h$ copies of $a_2$ can be put in
the place of $x_{l+2h+1}, \dots, x_{l+3h}$, etc. At the end we are left with $h$
weights, which we can accommodate using $h$ new bins. Hence

$$ k'_{opt} \le k_{opt} + h, $$

and so

$$ k_{heur} \le k'_{opt} + m \le k_{opt} + h + m. $$

Since

$$ h < \varepsilon w + 1 \le \varepsilon k_{opt} + 1, $$

and

$$ m \le \frac{n-l}{h} \le \frac{2w/\varepsilon}{\varepsilon w} = \frac{2}{\varepsilon^{2}} = O(1), $$

this proves that the asymptotic performance ratio is at most $1 + \varepsilon$
as claimed.


Blocking sets in hypergraphs with small Vapnik-Červonenkis dimension. The
following discussion is based on ideas of Vapnik and Červonenkis, which have
become essential in a number of areas in mathematics (statistics, learning
theory, computational geometry). Here we use these ideas to design a randomized
heuristic for finding a blocking set in a hypergraph.

Let $(V, \mathcal{H})$ be a hypergraph and consider an optimum fractional
blocking set of $\mathcal{H}$, i.e., an optimum solution $x^*$ of the linear
program

$$
\begin{array}{lll}
\text{minimize} & \sum_{i \in V} x_i & \\
\text{subject to} & x_i \ge 0 & (i \in V) \\
 & \sum_{i \in E} x_i \ge 1 & (E \in \mathcal{H}).
\end{array}
\tag{11}
$$

The optimum value of program (11) is the fractional blocking number
$\tau^* := \tau^*(\mathcal{H})$, which can serve as a lower bound on the
covering (or blocking) number $\tau(\mathcal{H})$. Now we use this linear
program to obtain a heuristic solution.

Consider an optimum solution $x$, and define $p_i := x_i/\tau^*$. Then
$(p_i : i \in V)$ can be viewed as a probability distribution on $V$, in which
every edge $E \in \mathcal{H}$ has probability at least $1/\tau^*$. Let us
generate nodes $v_1, v_2, \dots$ independently from this distribution, and stop
when all edges are covered. It is easy to see that with very large probability
we stop in a polynomial number of steps, so this procedure is indeed a
(randomized) blocking set heuristic, which we call the random node heuristic.
Let $t_{rnh}$ be the size of the blocking set produced (this is a random
variable). What is the expected performance ratio of the random node heuristic,
i.e., the ratio $E(t_{rnh})/\tau(\mathcal{H})$?
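
A minimal sketch of the random node heuristic, assuming that a feasible fractional blocking set x is already available (for example from an LP solver, which is not shown here). The function name `random_node_heuristic` and the optional random number generator argument are choices made for this illustration.

```python
import random

def random_node_heuristic(x, edges, rng=None):
    """Sample nodes with probability proportional to x until every edge is hit.

    x     -- dict node -> value of a feasible fractional blocking set
    edges -- list of sets of nodes (the hyperedges)
    Returns the set of sampled nodes. If x is not feasible, some edge may
    have probability 0 and the loop may not terminate.
    """
    rng = rng or random.Random()
    nodes = [v for v in x if x[v] > 0]
    # p_i = x_i / tau*; choices() normalizes the weights itself.
    weights = [x[v] for v in nodes]
    uncovered = [set(e) for e in edges]
    chosen = set()
    while uncovered:
        v = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.add(v)
        uncovered = [e for e in uncovered if v not in e]
    return chosen
```

For the triangle with edges {0,1}, {1,2}, {0,2} and x identically 1/2, the heuristic returns two or three nodes, depending on the random choices.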

It is clear that if we consider a particular edge $E$, then it will be hit by one
of the first $k\tau^*$ nodes with probability about $1 - 1/e^{k}$. Hence if
$k > \ln|\mathcal{H}|$, then the probability that every edge is hit is more than
$\tfrac12$. Hence we get the inequality

$$ t_{rnh} \le (\ln|\mathcal{H}|)\,\tau^* \le (\ln|\mathcal{H}|)\,\tau, $$

i.e., we obtain the bound $\ln|\mathcal{H}|$ for the performance ratio of the
random node heuristic. This is not interesting, however, since the greedy
heuristic does better (see Theorem 2.2).

Adopting a result of Haussler and Welzl (1987) from computational geometry
(which in turn is an adaptation of the work of Vapnik and Červonenkis in
statistics, see Vapnik 1982), we get a better analysis of the procedure. The
Vapnik-Červonenkis dimension of a hypergraph $(V, \mathcal{H})$ is the size of
the largest set $S \subseteq V$ such that for every $T \subseteq S$ there is an
$E \in \mathcal{H}$ such that $T = S \cap E$.
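
For very small hypergraphs the definition can be checked directly. The following brute-force sketch (exponential time, intended only to illustrate the definition; the function name `vc_dimension` is hypothetical) tests for each candidate set S whether all $2^{|S|}$ traces $S \cap E$ occur. It assumes that the hypergraph has at least one edge.

```python
from itertools import combinations

def vc_dimension(nodes, edges):
    """Brute-force Vapnik-Cervonenkis dimension of the hypergraph (nodes, edges)."""
    nodes = list(nodes)
    edges = [frozenset(e) for e in edges]
    best = 0
    for size in range(1, len(nodes) + 1):
        # A shattered set of this size needs 2**size distinct traces.
        if 2 ** size > len(edges):
            break
        shattered_found = False
        for subset in combinations(nodes, size):
            s = frozenset(subset)
            traces = {s & e for e in edges}
            if len(traces) == 2 ** size:
                shattered_found = True
                break
        if not shattered_found:
            # Subsets of shattered sets are shattered, so no larger set works.
            break
        best = size
    return best
```

As a usage example, the hypergraph on {1, 2, 3} whose edges are all intervals (including the empty one) has dimension 2: the set {1, 2} is shattered, while no set of three points can be, because there are only seven edges.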


Theorem 8.3. Let $\mathcal{H}$ be a hypergraph with Vapnik-Červonenkis
dimension $d$ and fractional blocking number $\tau^*$. The expected size of the
blocking set returned by the random node heuristic is at most
$16 d\tau^* \log(d\tau^*)$.


Proof. We prove that if we select $N := \lceil 8 d\tau^* \log(d\tau^*) \rceil$
nodes from the distribution $p$, then with probability more than $\tfrac12$,
every edge of $\mathcal{H}$ is met. Hence it follows easily that
$E(t_{rnh}) \le 2N$.

The proof is not long but tricky. Choose $N$ further nodes
$v_{N+1}, \dots, v_{2N}$ (independently, from the same distribution). Set
$s = N/(2\tau^*)$. Assume that there exists a set $E \in \mathcal{H}$ such that
$E \cap \{v_1, \dots, v_N\} = \emptyset$. Chebychev's inequality gives that for
any such edge $E \in \mathcal{H}$,

$$ \mathrm{Prob}\bigl(|E \cap \{v_{N+1}, \dots, v_{2N}\}| \ge s\bigr) \ge \tfrac12. $$

Hence we obtain

$$
\mathrm{Prob}\bigl(\exists E:\ E \cap \{v_1, \dots, v_N\} = \emptyset,\ |E \cap \{v_{N+1}, \dots, v_{2N}\}| \ge s\bigr)
\ \ge\ \tfrac12\, \mathrm{Prob}\bigl(\exists E:\ E \cap \{v_1, \dots, v_N\} = \emptyset\bigr).
$$

We estimate the probability on the left-hand side from above as follows. We can
generate a $(2N)$-tuple from the same distribution if we first generate a set
$S$ of $2N$ nodes as before, and then randomly permute them. For a given
$E \in \mathcal{H}$ that meets $S$ in at least $s$ elements, the probability
that after this permutation $E$ avoids the first half of $S$ is at most

$$ \frac{\binom{2N-s}{N}}{\binom{2N}{N}} \le \Bigl(1 - \frac{s}{2N}\Bigr)^{N} < e^{-s/2}. $$

We do not have to add up this bound for all $E$, only for all different
intersections $E \cap S$. The number of sets of the form $E \cap S$ is at most

$$ \binom{2N}{0} + \binom{2N}{1} + \dots + \binom{2N}{d} < (2N)^{d}, $$

by the Sauer-Shelah theorem (see chapter 24, section 4). Hence the probability
that there is an edge $E \in \mathcal{H}$ that meets $S$ in at least $s$
elements but avoids the first half is at most $(2N)^{d} e^{-s/2} < 1/4$.

Hence

$$ \mathrm{Prob}\bigl(\exists E:\ E \cap \{v_1, \dots, v_N\} = \emptyset\bigr) < \tfrac12. \qquad \Box $$


With some care, the upper bound can be improved to (essentially)
$O(d\tau^* \log \tau^*)$, which is best possible in terms of these parameters,
see Komlós et al. (1992).


Approximating a cost-minimal schedule of parallel machines. Machine scheduling
problems arise in hundreds of versions and are a particular “playground” for
approximation techniques. We outline here an LP-based heuristic for the
following problem of scheduling parallel machines with costs (this problem is
also called the generalized assignment problem). Suppose that we have a set $J$
of $n$ independent jobs, a set $M$ of $m$ unrelated machines, and we want to
assign each job to one of the machines. Assigning job $j$ to machine $i$ has a
certain cost $c_{ij}$ and takes a certain time $p_{ij}$. Our task is to find an
assignment of jobs to machines such that no machine gets more load than a total
of $T$ time, and the total cost does not exceed a given bound $C$, i.e., we look
for a job assignment with maximum time load (makespan) at most $T$ and cost at
most $C$.

If all the $p_{ij}$ are the same, then this is a weighted bipartite matching
problem, and so can be solved in polynomial time. However, for general
$p_{ij}$, the problem is NP-hard. Since there are two parameters ($T$ and $C$),
there are several ways to formulate what an approximate solution means, and
there are various algorithms known to find them. Each of these is based on
solving a linear relaxation of the problem and then “rounding” the solution
appropriately; this technique was introduced by Lenstra et al. (1990). A
combinatorially very interesting “rounding” of the solution of the linear
relaxation was used by Shmoys and Tardos (1993), which we now sketch.

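Consider the following linear program:

$$
\begin{array}{ll}
\sum_{j=1}^{n} p_{ij} x_{ij} \le T, & i = 1, \dots, m, \\
\sum_{i=1}^{m} x_{ij} = 1, & j = 1, \dots, n, \\
x_{ij} \ge 0, & i = 1, \dots, m,\ j = 1, \dots, n, \\
x_{ij} = 0, & \text{if } p_{ij} > T,\ i = 1, \dots, m,\ j = 1, \dots, n.
\end{array}
\tag{12}
$$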


Clearly, every integral solution $y$ of (12) with cost $c^{T}y \le C$ provides a
feasible solution of the generalized assignment problem, and thus, (12) is a
natural LP-relaxation of the generalized assignment problem. (The explicit
inclusion of the last condition plays an important role in the approximation
algorithm.) Let us replace the right hand side $T$ of the first $m$ inequalities
of (12) by $2T$ and let us denote this new LP by (12'). Now Shmoys and Tardos
prove the following. If (12) has a (possibly fractional) solution $x^*$ with
cost $c^* := c^{T}x^*$ then (12') has a 0/1-solution with cost $c^*$. In other
words, if the LP (12) has a solution with cost at most $C$, then there is an
assignment of jobs to machines with the same cost (at most $C$) and makespan at
most $2T$.

The trick is, of course, in “rounding” the real solution $x^*$. This is done by
using $x^*$ to construct an auxiliary bipartite graph and then finding a minimum
cost matching in this graph (cf. section 9), which then translates back to an
assignment of cost at most $C$ and makespan at most $2T$. These details must be
omitted here, but note that the “rounding” involves a nontrivial
graph-theoretic algorithm.


9. Changing the objective function 


Consider an optimization problem in which the objective function involves some
“weights”. One expects that if we change the weights “a little”, the optimum
solutions do not change, or at least do not change “too much”. It is surprising
how far this simple idea takes us: it leads to efficient algorithms, motivates linear
programming, and is the basis of fundamental general techniques (scaling,
Lagrangean relaxation, strong polynomiality).


Kruskal’s algorithm revisited. Let G = (V, E) be a connected graph and
$c: E \to \mathbb{Z}$ the length function of its edges. We want to find a
shortest spanning tree. Clearly, adding a constant to all edges does not change
the problem in the sense that the set of optimum solutions remains the same.
Thus, we may assume that the lengths are non-negative.

Now let us push this idea just a bit further: we may assume that the shortest 
edge has length 0. Then it is easy to see that shrinking this edge to a single node 
does not alter the length of the shortest spanning tree. We can shift the 
edge-weights again so that the minimum length of the remaining edges is 0; 
hence, we may contract another edge, etc. 

It is easy to see that this algorithm to construct a shortest spanning tree is 
actually Kruskal’s algorithm in disguise: the first edge contracted is the shortest 
edge; the second, the shortest edge not parallel to the first, etc. (Or is greediness 
a disguise of this argument?) 
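
The shift-and-contract argument can be turned into code almost literally. In the following sketch the function name `shortest_spanning_tree_by_shifting` is hypothetical, contraction is simulated with a union-find structure, and the common shift of all edge lengths is stored in one variable `offset` instead of rewriting every length; the graph is assumed to be connected.

```python
def shortest_spanning_tree_by_shifting(n, edges):
    """edges: list of (u, v, length). Returns (tree length, chosen edges).

    Assumes the graph on nodes 0..n-1 is connected.
    """
    parent = list(range(n))

    def find(v):  # representative of the contracted node containing v
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    offset = 0   # total amount subtracted from every edge so far
    total = 0
    chosen = []
    for remaining in range(n - 1, 0, -1):
        # Edges inside a contracted node are loops and are ignored.
        live = [e for e in edges if find(e[0]) != find(e[1])]
        u, v, length = min(live, key=lambda e: e[2])
        shift = length - offset   # after this shift the shortest edge has length 0
        offset += shift
        # Every spanning tree of the contracted graph has `remaining` edges,
        # so the shift changes the length of each of them by shift * remaining.
        total += shift * remaining
        parent[find(u)] = find(v)  # contract the 0-edge
        chosen.append((u, v, length))
    return total, chosen
```

The edges are contracted in exactly the order in which Kruskal's algorithm would pick them, and the accumulated shifts add up to the sum of the original lengths of the chosen edges.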


Minimum weight perfect matching in a bipartite graph. Given a complete bipartite
graph G = (V, E) with bipartition (U, W), where |U| = |W|, and a cost function
$w: E \to \mathbb{Z}$, we want to find a perfect matching M of minimum weight.

The idea is similar to the one mentioned in the previous section. By adding the
same constant to each weight, we assume that all the weights are non-negative.
But now we have more freedom: if we add the same constant to the weights of all
edges incident with a given node, then the weight of every perfect matching is also
shifted by the same constant, and so the set of optimum solutions does not
change. Our aim is to use this transformation until a perfect matching with total
weight 0 is obtained; this is then trivially optimal.

Let $G_0$ denote the graph formed by edges with weight 0. Using the unweighted
bipartite matching algorithm, we can test whether $G_0$ has a perfect matching.
If this is the case, we are done; else, the algorithm returns a set
$X \subseteq U$ such that the set of neighbors $N_{G_0}(X)$ of $X$ is smaller
than $X$. If $\varepsilon$ is the minimum weight of any edge from $X$ to
$W \setminus N_{G_0}(X)$, we add $\varepsilon$ to all the edges out of
$N_{G_0}(X)$ and $-\varepsilon$ to all the edges out of $X$. This transformation
preserves the values of the edges between $X$ and $N_{G_0}(X)$, and creates at
least one new node connected to $X$ by a 0-edge. Any perfect matching changes
its weight by the same value
$-\varepsilon|X| + \varepsilon|N_{G_0}(X)| < 0$.

It remains to show that this procedure terminates, and to estimate the number
of iterations it takes. One way to show this is to remark that at each iteration,
either the maximum size of an all-0 matching increases, or it remains the same,
but then the set X returned by the unweighted bipartite matching algorithm (as
described in chapter 3) increases. This gives an $O(n^2)$ bound on the number of
iterations.
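
A compact sketch of this weight-shifting procedure for integer weights given as a square matrix. It is meant as an illustration only: the function name `min_weight_perfect_matching` is hypothetical, the matching in the graph of 0-edges is recomputed from scratch in every round (so the sketch does not attain the iteration bound discussed above), and the deficient set X is read off from a failed augmenting path search.

```python
def min_weight_perfect_matching(w):
    """w[u][v]: integer weight of edge uv in a complete bipartite graph.

    Returns (weight, match) where match[u] is the partner of node u of U.
    """
    n = len(w)
    original = w
    low = min(min(row) for row in w)
    w = [[x - low for x in row] for row in w]   # all weights >= 0

    while True:
        partner = [-1] * n   # partner[v] = node of U matched to v, or -1

        def augment(u, seen):
            # Look for an augmenting path from u that uses 0-edges only.
            for v in range(n):
                if w[u][v] == 0 and v not in seen:
                    seen.add(v)
                    if partner[v] == -1 or augment(partner[v], seen):
                        partner[v] = u
                        return True
            return False

        deficient = None
        for u in range(n):
            seen = set()
            if not augment(u, seen):
                # X = u plus the partners of the reached nodes; N(X) = seen.
                deficient = ({u} | {partner[v] for v in seen}, seen)
                break
        if deficient is None:
            match = [0] * n
            for v in range(n):
                match[partner[v]] = v
            return sum(original[u][match[u]] for u in range(n)), match

        X, NX = deficient
        eps = min(w[u][v] for u in X for v in range(n) if v not in NX)
        for u in X:            # subtract eps on all edges out of X
            for v in range(n):
                w[u][v] -= eps
        for v in NX:           # add eps on all edges out of N(X)
            for u in range(n):
                w[u][v] += eps
```

Each round lowers the weight of every perfect matching by eps(|X| - |N(X)|) > 0 while keeping all weights non-negative, which is why the loop must stop for integer weights.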

One can read off from this algorithm Egerváry’s min-max theorem on weighted
bipartite matchings (see chapter 3 for extensions of the algorithm and the theorem 
to non-bipartite graphs). 


Theorem 9.1. If G = (V, E) is a bipartite graph with bipartition (U, W), where
|U| = |W|, and $w: E \to \mathbb{Z}$ is a cost function, then the minimum weight
of a perfect matching is the maximum of $\sum_{i \in V} \pi_i$, where
$\pi: V \to \mathbb{Z}$ is a weighting of the nodes such that
$\pi_i + \pi_j \le w_{ij}$ for all $ij \in E$.


Optimum arborescences. Given a digraph G = (V, A) and a root $r \in V$, an
arborescence is a spanning tree whose edges are oriented away from r. Let us
assume that G contains an arborescence with root r. (This is easily checked.) Let
$l: A \to \mathbb{Z}$ be an assignment of “lengths” to the arcs. The optimum
arborescence problem is to find an arborescence of minimum length (see also
chapter 30). An optimum arborescence can be found efficiently by the following
algorithm due to Edmonds (1967a).

Again, we may assume that the lengths are non-negative, since this can be 
achieved by adding a constant to every arc length. 

Consider all the arcs of G going into some node $v \ne r$. Any arborescence will
contain exactly one of these arcs. Hence we may add a constant to the length of
all the arcs going into v without changing the set of optimum arborescences.

For every $v \ne r$, we add a constant to the arcs going into v so that the
minimum length of arcs entering any given vertex is 0. Consider the subgraph of
all the arcs of length 0. If there exists an arborescence contained in this
subgraph, we are done. Otherwise, there must be a cycle of 0-arcs (fig. 9.1).


Figure 9.1. The optimum arborescence algorithm. (Legend: arcs of length = 0; arcs of length > 0.)

We contract this 0-cycle, to get the graph G’. It is easy to check that the 
minimum length of an arborescence in G’ is equal to the minimum length of an 
arborescence in G. Thus we can reduce the problem to a smaller problem, and 
then proceed recursively. One reduction requires O(m) time, so this leads to an 
O(mn) algorithm. 
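
The reduction can be sketched as follows. To stay short, the sketch computes only the minimum length of an arborescence, not the arborescence itself, and it uses an iterative loop instead of recursion; the function name `min_arborescence_length` and the arc representation are assumptions of this illustration.

```python
def min_arborescence_length(n, root, arcs):
    """Minimum length of an arborescence rooted at `root`.

    arcs -- list of (tail, head, length) on nodes 0..n-1.
    Returns None if no arborescence exists.
    """
    INF = float("inf")
    total = 0
    while True:
        # Shortest arc entering each node; subtracting it creates a 0-arc.
        min_in = [INF] * n
        pred = [-1] * n
        for u, v, length in arcs:
            if u != v and v != root and length < min_in[v]:
                min_in[v], pred[v] = length, u
        if any(v != root and pred[v] == -1 for v in range(n)):
            return None
        min_in[root] = 0
        total += sum(min_in)

        # Look for cycles among the chosen 0-arcs.
        comp = [-1] * n   # label of the contracted node
        mark = [-1] * n
        count = 0
        for start in range(n):
            v = start
            while v != root and mark[v] != start and comp[v] == -1:
                mark[v] = start
                v = pred[v]
            if v != root and comp[v] == -1:   # closed a new cycle at v
                comp[v] = count
                u = pred[v]
                while u != v:
                    comp[u] = count
                    u = pred[u]
                count += 1
        if count == 0:
            return total   # the 0-arcs form an arborescence

        for v in range(n):   # nodes outside the cycles keep their own label
            if comp[v] == -1:
                comp[v] = count
                count += 1
        # Contract: shift lengths by min_in[head], drop arcs inside a cycle.
        arcs = [(comp[u], comp[v], length - min_in[v])
                for u, v, length in arcs
                if comp[u] != comp[v] and v != root]
        root, n = comp[root], count
```

The shifts subtracted in each round are accumulated in `total`, which is the code counterpart of the observation that adding a constant to all arcs entering a node changes the length of every arborescence by that constant.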

Similarly as in the case of the weighted bipartite matching algorithm, we can 
use this algorithm to derive Fulkerson’s optimum arborescence theorem (chapter 
2, Theorem 6.4). 

More involved applications of the idea of shifting the objective function without 
changing the optimum solutions include the out-of-kilter method (see chapter 2, 
section 5). 


Scaling I: From pseudopolynomial to polynomial. Consider a finite set $E$ and a
collection $\mathcal{F}$ of subsets of $E$. (In the cases of interest here,
$\mathcal{F}$ is implicitly given and may have size exponentially large in
$|E|$, e.g., the set of all Eulerian subgraphs of a digraph.) Let a weight
function $w: E \to \mathbb{Q}$ be also given. Our task is to find a member
$X \in \mathcal{F}$ with maximum weight $w(X) := \sum_{e \in X} w(e)$.

Let us round each weight $w(e)$ ($e \in E$) to the nearest integer. Does this
change the set of optimum solutions? Of course, it may; but in several
situations, connections between the original and the rounded problem can be
established so that solving the rounded (and, sometimes, simpler) problem helps
in the solution of the original. Before rounding, we may of course multiply each
$w(e)$ by the same positive scalar; combined with rounding, this becomes a
powerful technique. It was introduced by Edmonds and Karp (1972) to show the
polynomial time solvability of the minimum cost flow problem. Since then,
scaling has become one of the most fundamental tools in combinatorial
optimization, in particular in flow theory (see, e.g., Goldberg et al. 1990).

We illustrate the method on a simple, yet quite general example. 


Theorem 9.2. Let $\Psi$ be a family of hypergraphs and consider the optimization
problem $\max\{w(X) : X \in \mathcal{F}\}$ for members
$(E, \mathcal{F}) \in \Psi$ and objective functions $w: E \to \mathbb{Z}_+$.
Assume that there exists an “augmentation”, i.e., an algorithm that checks
whether $X \in \mathcal{F}$ is optimal and if not, returns an
$X' \in \mathcal{F}$ with $w(X') > w(X)$; also assume that the augmentation
algorithm runs in time polynomial in $\langle w \rangle$. Then the optimization
problem can be solved in time polynomial in $\langle w \rangle$.


Proof. Note that a pseudopolynomial algorithm for this problem is obvious: start 
with any $X \in \mathcal{F}$ and augment until optimality is achieved. The number of 
augmentations is trivially bounded by $w(E)$. (Another obvious bound is $2^{|E|}$.) It is 
easy to construct examples where this trivial algorithm is not polynomial. 

To solve the problem in polynomial time, let $k := \max_e \lceil \log_2 w(e) \rceil$, and define new 
objective functions $w_i := \lfloor w/2^{k-i} \rfloor$. We solve the optimization problem for the 
objective function $w_0$, then for $w_1, \ldots$, finally for $w_k = w$. Since $w_0$ is 0/1-valued, 
we can apply the pseudopolynomial algorithm to find the optimizing set. 

Assume that we have an optimizing set $X_i$ for $w_i$. This is of course also optimal 
for $2w_i$, which is very close to $w_{i+1}$: we have, for each $e \in E$, 

    $2w_i(e) \le w_{i+1}(e) \le 2w_i(e) + 1$. 

Hence, with $n = |E|$, we have for any set $X \in \mathcal{F}$, 

    $w_{i+1}(X) \le 2w_i(X) + n \le 2w_i(X_i) + n \le w_{i+1}(X_i) + n$. 

Thus, the set $X_i$ is almost optimal for the objective function $w_{i+1}$, and the trivial 
algorithm starting with $X_i$ will maximize $w_{i+1}$ in at most $n$ iterations. Therefore, $w$ 
will be maximized in a total of $O(nk)$ iterations. □ 
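
The scaling loop of this proof can be sketched in a few lines. This is only an illustration under stated assumptions: the names `maximize_by_scaling` and `brute_force_augment` are hypothetical, the family is given explicitly as a list (in the cases of interest it is implicit), and the brute-force oracle merely stands in for a problem-specific augmentation subroutine.

```python
def maximize_by_scaling(weights, augment, start):
    """Bit-scaling around an augmentation oracle (sketch of the proof idea).

    weights: dict mapping each element of E to a nonnegative integer
    augment(w, X): returns a feasible X' with w(X') > w(X), or None if X is optimal
    start: any feasible set
    Returns the final set and the number of oracle calls made.
    """
    wmax = max(weights.values())
    # k = ceil(log2 of the largest weight); 0 if all weights are 0 or 1
    k = (wmax - 1).bit_length() if wmax > 0 else 0
    X, calls = start, 0
    for i in range(k + 1):
        # w_i = floor(w / 2^(k-i)); w_0 is 0/1-valued and w_k = w
        w_i = {e: w >> (k - i) for e, w in weights.items()}
        while True:
            better = augment(w_i, X)
            calls += 1
            if better is None:
                break          # X is optimal for w_i; move to the next phase
            X = better
    return X, calls


def brute_force_augment(family):
    """Hypothetical oracle: scans an explicitly listed family for a heavier set."""
    def augment(w, X):
        current = sum(w[e] for e in X)
        for Y in family:
            if sum(w[e] for e in Y) > current:
                return Y
        return None
    return augment
```

Each phase needs at most $n$ improving steps plus one final optimality check, so the number of oracle calls is at most $(k+1)(n+1)$.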


As an example, consider the problem of finding an Eulerian subdigraph of 
maximum weight in a directed graph $D = (V, A)$ with arc weights $w_a$. Then 
$(A, \mathcal{F})$, where $\mathcal{F} := \{C \subseteq A : C \text{ Eulerian}\}$, is a hypergraph. To apply Theorem 
9.2 we have to design an augmentation subroutine. Given some Eulerian 
subdigraph $C$ we construct an auxiliary digraph $D_C$ by reversing the arcs in $A \setminus C$ 
and changing the signs of the weights on these arcs. $C$ is not a maximum weight 
Eulerian digraph if and only if $D_C$ contains a directed circuit of negative total 
weight. Such a circuit can be found in polynomial time by shortest path 
techniques. 

For more involved applications of these scaling techniques see chapter 2, 
section 5. 


Scaling II: From polynomial to strongly polynomial. Strong polynomial solvability 
of a problem is often much more difficult to prove than polynomial solvability; 
for example, it is not known whether linear programs can be solved in strongly 
polynomial time. It is therefore remarkable that Frank and Tardos (1987) showed 
that, for a large class of combinatorial optimization problems, polynomial 
solvability implies strong polynomial solvability (see also chapter 30). 


Theorem 9.3. Let $\mathcal{H}$ be a family of hypergraphs and assume that there exists an 
algorithm to find $\max\{\sum_{e \in X} w(e) : X \in \mathcal{F}\}$ for every $(E, \mathcal{F}) \in \mathcal{H}$ and $w: E \to \mathbb{Z}$ in 
time polynomial in $\langle w \rangle$. Then there exists a strongly polynomial algorithm for this 
maximization problem. 


Proof. The goal is to find an algorithm in which the number of arithmetic 
operations is bounded by a polynomial in $n = |E|$, and does not depend on $\langle w \rangle$ 
(we also need that the numbers involved do not grow too large, but this is easy to 
check). So the bad case is when the entries of $w$ are very large. Frank and Tardos 
give an algorithm that replaces $w$ by an integer vector $w'$ such that every entry of 
$w'$ has at most $O(n^3)$ digits, and $w$ and $w'$ are maximized by the same members of 
$\mathcal{F}$. 


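The key step is the construction of the following diophantine expansion of the 
vector $w$: 

    $w = \lambda_1 u_1 + \lambda_2 u_2 + \cdots + \lambda_n u_n$, 

where $u_1, \ldots, u_n$ are integral vectors with $1 \le \|u_i\|_\infty \le 4^{n^2}$, and the coefficients $\lambda_i$ 
are rational numbers that decrease very fast: each $|\lambda_{i+1}|$ is bounded by a small 
fraction of $|\lambda_i|$ (the displayed bound is illegible in the source). 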


Such an expansion can be constructed using the simultaneous diophantine 
approximation algorithm (see chapter 19). This expansion has the property that 
for any two sets $X, Y \subseteq E$, 

    $w(X) \le w(Y) \iff u_i(X) \le u_i(Y)$ for $i = 1, \ldots, n$. 

Now letting 

    $w' := 8^{(n-1)n^2} u_1 + 8^{(n-2)n^2} u_2 + \cdots + 8^{n^2} u_{n-1} + u_n$, 

we have for any two sets $X, Y \subseteq E$, 

    $w(X) \le w(Y) \iff w'(X) \le w'(Y)$. 

Thus $w$ and $w'$ are optimized by the same sets, and we can apply our polynomial 
time algorithm to maximize $w'$. Since $\langle w' \rangle = O(n^4)$, this algorithm will be 
strongly polynomial. □ 


Applying the result to our previous example, we obtain a strongly polynomial 
algorithm for the maximum weight Eulerian subdigraph problem. More generally, 
this technique yields strongly polynomial algorithms, among others, for linear 
programs with $\{-1, 0, 1\}$-matrices (e.g., for the minimum cost flow problem). 

Poljak (1993) applied an even more general version of scaling to show that the 
single exchange heuristic of the max-cut problem is strongly polynomial for cubic 
graphs (while exponential for 4-regular graphs). Let $G = (V, E)$ be a graph and 
$c: E \to \mathbb{Z}_+$ a weighting of its edges. The idea is that the run of the heuristic is 
determined if we know, for each node $v$ and each partition of the edges incident 
with $v$ into two classes, which class has larger weight. So we consider a family of 
inequalities, each of which is of the type 

    $x_i > x_j$  or  $x_i > x_j + x_k$    (13) 

(where $i$, $j$ and $k$ are three edges adjacent to a node). We know that this system 
has a solution (the original weights). Poljak proves that then the system has an 
integral solution with $1 \le |x_i| \le 2|V| - 1$ for all $i$. Replacing the original weights 
with these new weights, the single exchange heuristic runs as before, but now it 
clearly terminates in $O(|V|^2)$ time. 
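
The single exchange heuristic itself is not spelled out here; the following is a minimal sketch, assuming the usual formulation in which a node is moved to the other side of the cut whenever this strictly increases the cut weight. The function name and the input format (a dict from node pairs to positive integer weights) are illustrative choices, not taken from Poljak's paper.

```python
def single_exchange_max_cut(n, edges):
    """Local search for max-cut by single node exchanges.

    n: number of nodes, labelled 0..n-1
    edges: dict {(u, v): positive integer weight}
    Returns (side, cut_weight, moves), where side[v] is 0 or 1.
    """
    adj = {v: [] for v in range(n)}
    for (u, v), c in edges.items():
        adj[u].append((v, c))
        adj[v].append((u, c))
    side = [0] * n
    moves = 0
    improved = True
    while improved:
        improved = False
        for v in range(n):
            same = sum(c for u, c in adj[v] if side[u] == side[v])
            cross = sum(c for u, c in adj[v] if side[u] != side[v])
            if same > cross:
                # Moving v changes the cut weight by same - cross > 0.
                side[v] ^= 1
                moves += 1
                improved = True
    cut = sum(c for (u, v), c in edges.items() if side[u] != side[v])
    return side, cut, moves
```

Only the comparisons `same > cross` influence the run, which is why replacing the weights by any other solution of the same system of inequalities leaves the sequence of moves unchanged. Since every move increases the integral cut weight, the number of moves is bounded by the total weight.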


Scaling III: Heuristics. Recall from section 5 that the 0/1-knapsack problem is 
NP-hard, but it is polynomially solvable if the weight coefficients $a_i$ are given in 
unary notation. This fact was combined with a scaling technique by Ibarra and 
Kim (1975) to design a fully polynomial approximation scheme for the knapsack 
problem. 

Fix any $\varepsilon > 0$. We may assume that $c_1 \ge c_2 \ge \cdots \ge c_n$. Let $C_{\mathrm{opt}}$ denote the 
optimum value of the knapsack, and let $C$ be an upper bound on $C_{\mathrm{opt}}$; a good 
value for $C$ can, e.g., be found by running the greedy heuristic for the knapsack 
problem, for which Theorem 2.7 gives 

    $\frac{1}{2} C \le C_{\mathrm{opt}} \le C$. 

Let $0 \le m \le n$ be the largest index such that $c_m > \varepsilon C/4$, and define 

    $\bar{c}_i := \lfloor 8 c_i / (\varepsilon^2 C) \rfloor$  ($i = 1, \ldots, m$). 

Clearly $\bar{c}_i \le 8/\varepsilon^2$, and so these numbers are polynomially bounded in $1/\varepsilon$. 
The idea is to solve a knapsack problem omitting the "small" weights and 
replacing the "big" weights $c_1, \ldots, c_m$ by their unary approximations $\bar{c}_1, \ldots, \bar{c}_m$. 
Then we use the "small" weights to fill up greedily as much of the remaining 
space as possible. 


For every integral value $d$ with $0 \le d \le 8/\varepsilon^2$, we determine a solution $x^d$ of the 
knapsack problem, by solving the following auxiliary optimization problem: 

    minimize   $\sum_{i=1}^{m} a_i x_i$ 
    subject to $\sum_{i=1}^{m} \bar{c}_i x_i = d$, 
               $x_i \in \{0, 1\}$, $i = 1, \ldots, m$.    (14) 

This is basically a subset-sum problem with a linear objective function and can be 
solved, by the same dynamic programming argument, in polynomial time. (In 
fact, we get the optimum solution for all values of $d$ in a single run, which takes 
$O(n/\varepsilon^2)$ time for the execution of all problems (14).) Let $x_1^d, \ldots, x_m^d$ be the 
solution found (if any exists). 

Now we choose the remaining variables. These variables $x_{m+1}, \ldots, x_n$ must 
satisfy the following constraints: 

    $\sum_{i=m+1}^{n} a_i x_i \le b - \sum_{i=1}^{m} a_i x_i^d$, 
    $x_i \in \{0, 1\}$,    (15) 

and we want to maximize 

    $\sum_{i=m+1}^{n} c_i x_i$. 

If the right-hand side in (15) is non-negative, then this is just another (auxiliary) 
knapsack problem, which we solve by the greedy algorithm in $O(n)$ time for each 
$d$ (thus using $O(n/\varepsilon^2)$ time in total). Let $x_{m+1}^d, \ldots, x_n^d$ be the solution of the 
knapsack obtained (if it exists, i.e., if (15) has non-negative right-hand side). 
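
The whole scheme can be sketched as follows. This is an illustration, not the authors' implementation: the function names are hypothetical, weights are assumed to be positive integers, the upper bound $C$ is taken to be twice the better of the ratio-greedy value and the largest single value (one standard way to get $C/2 \le C_{\mathrm{opt}} \le C$, which may differ from the heuristic of Theorem 2.7), and the dynamic program stores explicit index lists for readability rather than speed.

```python
from fractions import Fraction


def greedy_fill(items, capacity):
    """Ratio greedy: take items (index, value, weight) by decreasing value/weight."""
    chosen, value = [], 0
    for i, c, a in sorted(items, key=lambda t: Fraction(t[1], t[2]), reverse=True):
        if a <= capacity:
            chosen.append(i)
            capacity -= a
            value += c
    return chosen, value


def knapsack_fptas(values, weights, b, eps):
    """Approximate 0/1 knapsack; returns (chosen indices, total value)."""
    eps = Fraction(eps)
    items = [(i, values[i], weights[i]) for i in range(len(values)) if weights[i] <= b]
    if not items:
        return [], 0
    # Upper bound C with C/2 <= C_opt <= C (assumption: greedy value vs. best single item).
    _, g = greedy_fill(items, b)
    C = 2 * max(g, max(c for _, c, _ in items))
    big = [t for t in items if t[1] > eps * C / 4]
    small = [t for t in items if t[1] <= eps * C / 4]
    D = Fraction(8) // (eps * eps)            # largest scaled value d considered
    # Problem (14): best[d] = (minimum weight, indices) over big items with scaled value d.
    best = {0: (0, [])}
    for i, c, a in big:
        cbar = Fraction(8 * c) // (eps * eps * C)
        for d, (wt, chosen) in sorted(best.items(), reverse=True):
            nd = d + cbar
            if nd <= D and (nd not in best or wt + a < best[nd][0]):
                best[nd] = (wt + a, chosen + [i])
    # Problem (15): fill the remaining capacity greedily with small items, for every d.
    result, result_value = [], 0
    for d, (wt, chosen) in best.items():
        if wt <= b:
            extra, _ = greedy_fill(small, b - wt)
            sol = chosen + extra
            val = sum(values[i] for i in sol)
            if val > result_value:
                result, result_value = sol, val
    return result, result_value
```

For example, `knapsack_fptas([60, 100, 120, 30, 20], [10, 20, 30, 5, 4], 50, 0.5)` returns a feasible selection whose value is at least half of the optimum.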


Theorem 9.4. For at least one $d$ with $0 \le d \le 8/\varepsilon^2$, the solution $(x_1^d, \ldots, x_n^d)$ exists, 
and 

    $\sum_{i=1}^{n} c_i x_i^d \ge (1 - \varepsilon) C_{\mathrm{opt}}$. 


Proof. Let $y_1, \ldots, y_n$ be a (true) optimum solution of the original knapsack 
problem, and consider the value 

    $d := \sum_{i=1}^{m} \bar{c}_i y_i$. 

Clearly 

    $0 \le d \le 8/\varepsilon^2$. 

Fix this choice of $d$. Then trivially $x_1^d, \ldots, x_m^d$ exists, and by their optimality for 
the auxiliary subset-sum problem (14), we have 

    $\sum_{i=1}^{m} a_i x_i^d \le \sum_{i=1}^{m} a_i y_i \le \sum_{i=1}^{n} a_i y_i \le b$. 

Thus for this $d$, (15) has non-negative right-hand side and the solution 
$(x_1^d, \ldots, x_n^d)$ exists. Moreover, 


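    $\sum_{i=1}^{m} c_i x_i^d \ge \frac{\varepsilon^2 C}{8} \sum_{i=1}^{m} \bar{c}_i x_i^d = \frac{\varepsilon^2 C}{8} \sum_{i=1}^{m} \bar{c}_i y_i \ge \sum_{i=1}^{m} c_i y_i - \frac{\varepsilon^2 C}{8} \sum_{i=1}^{m} y_i$.    (16) 

Here 

    $\sum_{i=1}^{m} y_i \le \frac{4}{\varepsilon C} \sum_{i=1}^{m} c_i y_i \le \frac{4}{\varepsilon C} C_{\mathrm{opt}}$, 

and so 

    $\sum_{i=1}^{m} c_i x_i^d \ge \sum_{i=1}^{m} c_i y_i - \frac{\varepsilon}{2} C_{\mathrm{opt}}$. 
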
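Furthermore, observe that $(y_{m+1}, \ldots, y_n)$ is a solution of (15), and hence by 
Theorem 2.7, 

    $\sum_{i=m+1}^{n} c_i x_i^d \ge \sum_{i=m+1}^{n} c_i y_i - \frac{\varepsilon C}{4} \ge \sum_{i=m+1}^{n} c_i y_i - \frac{\varepsilon}{2} C_{\mathrm{opt}}$. 

Thus 

    $\sum_{i=1}^{n} c_i x_i^d \ge \sum_{i=1}^{n} c_i y_i - \varepsilon C_{\mathrm{opt}} = (1 - \varepsilon) C_{\mathrm{opt}}$. □ 

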
Lagrangian relaxation. Consider the (symmetric) travelling salesman problem 
again. For any vertex v, every tour uses two edges adjacent to v. Hence if we add 
the same constant to the length of every edge incident with v, we shift the value 
of every tour by the same number, and hence the problem remains essentially 
unchanged. By doing so for every vertex, we may bring the problem to a nicer 
form. 

So far, this is the same idea as in the weighted bipartite matching algorithm 
above. Unfortunately, this does not lead to a complete solution; we cannot in 
general obtain an all-0 tour by shifting edge-weights like this. But we may use this 
method to improve dual heuristics. We have seen that the minimum length of a 
1-tree is an easily computable lower bound; let us shift lengths so that this lower 
bound is maximized. This way we obtain a very good dual heuristic due to Held 
and Karp (1970). 

It is not immediate how this new optimization problem can be solved. To 
describe a method, recall from matroid theory (chapter 11) that the convex hull of 
1-trees with vertex set $V$ is described by the constraints 

    $x_e \ge 0$,  $e \in E$, 
    $x(A) \le r(A)$,  $A \subseteq E$,    (17) 
    $x(E) = n$. 

Here $r(A)$ is the rank in the matroid whose bases are the 1-trees: if $c(A)$ is the 
number of connected components in $(V, A)$, then 

    $r(A) = n - c(A) + 1$,  if $A$ contains a circuit, 
    $r(A) = n - c(A) = |A|$,  if $A$ contains no circuits. 


If we want to restrict the feasible solutions to allow only tours, a natural step is to 
write up the degree constraints: 

    $\sum_{j \in V \setminus \{i\}} x_{ij} = 2$  ($i \in V$).    (18) 

The objective function is 

    minimize $\sum_{e} c(e) x_e$. 

The integral solutions of (17) and (18) are exactly the tours; if we drop the 
integrality constraints, we obtain a relaxation. 

While it is trivial to minimize any objective function subject to constraints (17) 
using the greedy algorithm, constraints (18) spoil this nice structure. So let us get 
rid of the constraints (18) by multiplying them by appropriate multipliers A,, 
adding their left-hand sides to the objective function, and omitting them from the 
constraint set. Note that this leads to shifting the lengths of edges at nodes, as 
described above. 

For any fixed choice of the multipliers, adding the left-hand sides of an equality 
constraint to the objective function does not change the problem (the objective 
function is shifted by the right-hand side); but then, dropping the constraint may 
decrease the optimum value. Can we choose the multipliers so that the optimum 
does not change? The answer is yes, and it is worth formulating the generalization 
of the Duality theorem of linear programming that guarantees this. 


Theorem 9.5. Consider a linear program with constraints split into two classes: 

    minimize   $c^T x$ 
    subject to $Ax \ge a$, 
               $Bx \ge b$.    (19) 

Then the optimum value of this program is the same as the optimum of the 
following min-max problem: 

    $\max \{ \min \{ (c^T - y^T B) x \mid Ax \ge a \} + y^T b : y \ge 0 \}$.    (20) 


Similarly as in the Duality theorem, one can allow equations among the 
constraints, and then the corresponding multipliers y are unconstrained. 
For any particular choice of the vector $y$, the minimum 

    $\varphi(y) := \min \{ (c^T - y^T B) x \mid Ax \ge a \} + y^T b$ 

is a lower bound on the optimum value of (19). It is not difficult to see that the 
function $\varphi$, called the Lagrange function of (19), is a concave function, and hence 
various methods (subgradient, ellipsoid) are available to compute its maximum. 
Note that as long as we are only using this as a dual heuristic, we do not have to 
solve this problem to optimality: any reasonable y provides a lower bound. 
Applying this technique to the travelling salesman problem, often rather good 
lower bounds are obtained. For example, the optimum value of the Lagrange 
function of the 52-city TSP of fig. 2.1 is equal to the length of the shortest tour. 
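
Both properties of the Lagrange function can be checked in one line each. The sketch below assumes that the second constraint class in (19) reads $Bx \ge b$ with multipliers $y \ge 0$, as in (20); the symbol $x^*$, denoting an optimum solution of (19), is introduced here for illustration only.

```latex
% Lower bound: x^* satisfies Ax^* >= a, so it is feasible for the inner minimum;
% the last step uses y >= 0 and Bx^* >= b.
\varphi(y) \;\le\; (c^T - y^T B)\,x^* + y^T b \;=\; c^T x^* - y^T (B x^* - b) \;\le\; c^T x^* .

% Concavity: for each fixed x the bracket is an affine function of y,
% and a pointwise minimum of affine functions is concave.
\varphi(y) \;=\; \min_{x \,:\, Ax \ge a} \bigl[\, c^T x + y^T (b - Bx) \,\bigr] .
```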


It is worth mentioning that the Lagrangian relaxation method gives a result 
about exact solutions too. 


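Theorem 9.6. Let $P \subseteq \mathbb{R}^n$ be a polytope and assume that every linear objective 
function $c^T x$ ($c \in \mathbb{Z}^n$) can be minimized over $P$ in time polynomial in $\langle c \rangle$. Then for 
every matrix $A \in \mathbb{Z}^{m \times n}$, and vectors $a \in \mathbb{Z}^m$ and $c \in \mathbb{Z}^n$, the minimum 

    $\min \{ c^T x \mid x \in P,\ Ax \le a \}$ 

can be computed in time polynomial in $\langle A \rangle + \langle a \rangle + \langle c \rangle$. 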


In other words, adding a few constraints to a nice problem does not spoil it 
completely. While this result could also be derived by other means (e.g., by the 
ellipsoid method), the Lagrangian approach is computationally much better if the 
number m of additional constraints is small. 


10. Matrix methods 


Determinants and matchings. Let $G$ be a graph with $n$ nodes. In chapter 3, a 
randomized algorithm (cf. Edmonds 1967b, and Lovász 1979) is described that 
decides if a graph $G$ has a perfect matching. The method is based on the fact, 
proved by Tutte (1947), that $\det A(G, x)$ is identically 0 if and only if $G$ has no 
perfect matching, where $A(G, x)$ is the skew symmetric $n \times n$ matrix defined by 

    $A(G, x)_{ij} = x_{ij}$,   if $ij \in E(G)$ and $i < j$, 
    $A(G, x)_{ij} = -x_{ji}$,  if $ij \in E(G)$ and $i > j$, 
    $A(G, x)_{ij} = 0$,        otherwise. 


Generating random values for $x_{ij}$, and computing the determinant, we obtain a 
randomized matching algorithm. The following simple lemma, due to Schwartz 
(1986), can be used to estimate the probability of error. 


Lemma 10.1. Let $f(x_1, \ldots, x_m)$ be a polynomial, not identically 0, in which each 
variable has degree at most $k$. Choose the $x_i$ independently from the uniform 
distribution on $\{0, 1, \ldots, N-1\}$. Then 

    $\mathrm{Prob}(f(x_1, \ldots, x_m) = 0) \le \frac{km}{N}$. 


Since an $n \times n$ determinant can be evaluated in $O(n^{2.376})$ time (Coppersmith 
and Winograd 1982), this randomized algorithm has a better running time than 
the best deterministic one, whose time complexity is $O(n^{2.5})$ (Even and Kariv 
1975). 
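
A minimal sketch of the randomized test follows. It deviates from the text in one labelled respect: instead of random integers from $\{0, \ldots, N-1\}$ it draws random nonzero residues modulo the prime $2^{61} - 1$ and evaluates the determinant by Gaussian elimination over that field, which avoids large intermediate numbers. The function names are hypothetical, and the elimination takes $O(n^3)$ arithmetic steps rather than using fast matrix multiplication.

```python
import random


def det_mod_p(M, p):
    """Determinant of a square integer matrix modulo a prime p."""
    n = len(M)
    A = [[x % p for x in row] for row in M]
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det                       # a row swap flips the sign
        det = det * A[col][col] % p
        inv = pow(A[col][col], p - 2, p)     # inverse by Fermat's little theorem
        for r in range(col + 1, n):
            f = A[r][col] * inv % p
            if f:
                for c in range(col, n):
                    A[r][c] = (A[r][c] - f * A[col][c]) % p
    return det % p


def probably_has_perfect_matching(n, edges, trials=5, p=(1 << 61) - 1, rng=random):
    """One-sided test: True is always correct, False is correct with high probability.

    n: number of nodes 0..n-1; edges: list of pairs (i, j) of a simple graph.
    """
    for _ in range(trials):
        A = [[0] * n for _ in range(n)]
        for i, j in edges:
            i, j = min(i, j), max(i, j)
            x = rng.randrange(1, p)          # random value substituted for x_ij
            A[i][j] = x
            A[j][i] = -x                     # skew symmetry
        if det_mod_p(A, p) != 0:
            return True                      # determinant not identically 0
    return False
```

A nonzero value certifies that the determinant is not identically 0, so by Tutte's theorem a perfect matching exists; a run of zero values may be an unlucky substitution, with probability bounded as in Lemma 10.1.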

We recall two variants of the determinant-based matching algorithm from 
chapter 3. The algorithm above determines whether a given graph has a perfect 
matching; but it does not give a perfect matching. To actually find a perfect 
matching, we can delete edges until we get a graph $G_0$ with a perfect matching 
such that deleting any further edge results in a graph with no perfect matching. 
Clearly, $G_0$ is a perfect matching itself. 

Instead of this pedestrian procedure, Mulmuley et al. (1987) found an elegant 
randomized algorithm that finds a perfect matching at the cost of a single matrix 
inversion. The method is based on the following nice probabilistic lemma. 


Lemma 10.2. Let $(E, \mathcal{H})$ be a hypergraph and assign to each $e \in E$ a random 
weight $w_e$ from the uniform distribution over $\{1, \ldots, 2|E|\}$. Then with probability 
at least $\frac{1}{2}$, the edge with minimum weight is unique. 


This lemma implies that if we substitute $x_{ij} = 2^{y_{ij}}$ in $A(G, x)$ (where each $y_{ij}$ is 
uniformly chosen from $\{0, \ldots, n^2\}$) and then invert the resulting matrix $A$ then, 
with probability at least $\frac{1}{2}$, those entries in the Schur product $A^{-1} \circ A$ having an 
odd numerator form a perfect matching. (The Schur product $C = A \circ B$ of two 
$n \times n$ matrices is defined by $C_{ij} = A_{ij} B_{ij}$.) 

A special value of this method is that it is parallelizable, using polynomially 
many processors and polylog time. This depends on Csanky’s theorem (see 
chapter 29, section 5) that gives an NC-algorithm for determinant computation 
and matrix inversion. Every known NC-algorithm for the perfect matching 
problem is randomized and uses determinant computation. 

Given a graph $G = (V, E)$, an integer $k$, and $F \subseteq E$, the exact matching problem 
is to determine if there exists a perfect matching $M$ in $G$ such that $|M \cap F| = k$. 
This problem is not known to be in P, but it is easily solved in randomized 
polynomial time by the determinant method. Consider the matrix $A(G, x)$, and 
substitute $y z_{ij}$ for $x_{ij}$ if $ij \in F$ (and $z_{ij}$ for $x_{ij}$ otherwise), where $y$ is a new 
variable. This way we obtain a matrix $A(G, z, y)$. Then Tutte's theorem on 
determinants and matchings can be extended as follows. 


Theorem 10.1. The coefficient of $y^k$ in the Pfaffian $\mathrm{Pf}(A(G, z, y))$ is not identically 
0 in the variables $z$ iff there exists a perfect matching $M$ such that $|M \cap F| = k$. 


This theorem suggests the following algorithm: Substitute random integers 
$z_{ij} \in \{0, \ldots, N-1\}$ in $A(G, z, y)$. The value of $\det A(G, z, y)$ is a polynomial in 
$y$, and all its coefficients can be computed in polynomial time. Compute 
$\mathrm{Pf}(A(G, z, y)) = \sqrt{\det A(G, z, y)}$, which is also a polynomial in $y$ by definition. 
The coefficient of $y^k$ gives the answer. (See chapter 3, section 7 for other 
applications of this idea.) 


Determinants and connectivity. The method of reducing a combinatorial optimi- 
zation problem to checking a polynomial (usually determinantal) identity and 
then solving this in randomized polynomial time via Schwartz’s lemma 10.1 is not 
restricted to matching theory. Chapter 36 contains examples where this method is 
used in electrical engineering and statics. The papers Linial et al. (1988) and 
Lovász et al. (1989) contain various algorithms to determine the connectivity of a 
graph along these lines. Let us formulate one of these. Let $G$ be a graph and 
consider, for each (unordered) pair $ij$ with $i = j$ or $ij \in E(G)$, a variable $x_{ij}$. Let 
$B(G, x)$ be the matrix 

    $B(G, x)_{ij} = x_{ij}$,  if $i = j$ or $ij \in E(G)$, 
    $B(G, x)_{ij} = 0$,       otherwise. 


Theorem 10.2. The graph $G$ is $k$-connected iff 

    no $(n - k) \times (n - k)$ subdeterminant of $B(G, x)$ is identically 0.    (*) 


This theorem suggests the following randomized $k$-connectivity test: substitute 
in $B(G, x)$ independent random integers from, say, $\{0, \ldots, 2^n\}$, and check 
condition (*). Unfortunately, there is no polynomial time algorithm known to 
check (*) for a general matrix; however, due to the very special structure of 
$B(G, x)$, it suffices to check only a "few" subdeterminants. Let us select, for each 
vertex $i$, a set $A_i$ of $k - 1$ neighbors (if a node has degree less than $k - 1$ then the 
graph is clearly not $k$-connected). Then the following can be shown: if $G$ is not 
$k$-connected, then one of the subdeterminants of $B(G, x)$, obtained by deleting the 
rows belonging to some $A_i \cup \{i\}$ and the columns belonging to some $A_j \cup \{j\}$, is 
identically 0. 

This leads to the evaluation of $O(n^2)$ determinants of size $(n - k) \times (n - k)$. 
With a little care, one can reduce this number to $O(nk)$. For $k < n/2$, it is worth 
inverting the matrix $B(G, x)$ and then checking $O(nk)$ subdeterminants of size $k \times k$ 
using Jacobi's theorem (see chapter 31). 


Semidefinite optimization. Polyhedral combinatorics can be viewed as a theory of 
linear inequalities valid for the incidence vectors of various set-systems. It is quite 
natural to ask for quadratic inequalities (and, of course, higher degree inequalities) 
valid for these incidence vectors. This idea leads to real algebraic geometry and its 
study has just begun. 

At first sight it seems that we are getting too much too easily. Let $G = (V, E)$ be 
a graph, $V = \{1, \ldots, n\}$, and consider the following system of equations: 

    $x_i^2 = x_i$   for every node $i \in V$,    (21) 
    $x_i x_j = 0$   for every edge $ij \in E$.    (22) 

Trivially, the solutions of (21) are precisely the 0-1 vectors, and so the solutions 
of (21)-(22) are precisely the incidence vectors of stable sets. Unfortunately, 
there is little known about the solutions of systems of quadratic equations. In fact, 
this construction shows that even the solvability of such a simple system of 
quadratic equations (together with a linear equation $\sum_i x_i = \alpha$) is NP-hard. 

However, we can use this system to derive some other constraints. (21) implies 
that for every node $i$, 

    $x_i = x_i^2 \ge 0$,   $1 - x_i = (1 - x_i)^2 \ge 0$.    (23) 

Using this, (22) implies that for every edge $ij$, 

    $1 - x_i - x_j = 1 - x_i - x_j + x_i x_j = (1 - x_i)(1 - x_j) \ge 0$.    (24) 

So we can derive the edge constraints from (21)-(22) formally. We can also 
derive the clique constraints. Assume that nodes $1, \ldots, k$ induce a complete 
subgraph. We start with the trivial inequality 

    $(1 - x_1 - \cdots - x_k)^2 \ge 0$. 

Expanding, we get 

    $1 + \sum_{i=1}^{k} x_i^2 - 2 \sum_{i=1}^{k} x_i + 2 \sum_{i<j} x_i x_j \ge 0$. 

Here the first sum is just $\sum_i x_i$ by (21) and the third sum is 0 by (22), so we get 

    $1 - x_1 - \cdots - x_k \ge 0$.    (25) 

In the special case when the graph is perfect, we obtain all constraints for the 
stable set polytope STAB(G) in a single step. (STAB(G) is the convex hull of all 
incidence vectors of stable sets of $G$.) 

The algorithmic significance of this observation is that it leads to a polynomial 
time algorithm to compute the stability number of a perfect graph. By general 
consequences of the ellipsoid method (see chapter 30), it suffices to design a 
polynomial time algorithm that checks whether a given inequality 

    $\sum_{i} c_i x_i \le \gamma$    (26) 

is valid for STAB(G). By the above arguments, it suffices to check whether (26) 
can be derived from (21) and (22) as above. Formalizing, this leads to the 
question: do there exist real multipliers $\mu_i$ ($i \in V$) and $\lambda_{ij}$ ($ij \in E$), and linear 
polynomials $l_1, \ldots, l_m$ such that 

    $\sum_{k=1}^{m} l_k^2 + \sum_{i=1}^{n} \mu_i (x_i^2 - x_i) + \sum_{ij \in E} \lambda_{ij} x_i x_j = \gamma - \sum_{i=1}^{n} c_i x_i$. 

This is equivalent to saying that there exist $\lambda$'s and $\mu$'s such that the 
$(n + 1) \times (n + 1)$ matrix $P = (p_{ij})$ defined by 

    $p_{ij} = \gamma$,                if $i = j = 0$, 
    $p_{ij} = (\mu_i - c_i)/2$,       if $j = 0$, $i > 0$, 
    $p_{ij} = (\mu_j - c_j)/2$,       if $i = 0$, $j > 0$, 
    $p_{ij} = -\mu_i$,                if $i = j > 0$, 
    $p_{ij} = -\lambda_{ij}/2$,       if $ij \in E$, 
    $p_{ij} = 0$,                     otherwise, 

is positive semidefinite. 
More generally, we can consider optimization problems of the type 

    maximize   $\sum_{i} c_i y_i$ 
    subject to $P(y)$ is positive semidefinite,    (27) 

where $P(y)$ is a matrix in which every entry is a linear function of the $y_i$. 
Grötschel et al. (1981) describe a way, using the ellipsoid method, to solve such 
problems to arbitrary precision in polynomial time. (The key facts are that the 
feasible domain is convex and that checking whether a given $y$ satisfies the constraints 
can be done using Gaussian elimination; we ignore numerical difficulties here that 
are non-trivial.) The Duality theorem can also be extended to such semidefinite 
programs (see Wolkowicz 1981). Recently Alizadeh (1992) showed that Karmarkar's 
interior point method can also be extended to such semidefinite programs in 
a very natural way, which is much more promising from a practical point of 
view. 

The method sketched here can be used to generate other classes of inequalities 
for STAB(G) and to show their polynomial time solvability. It is not restricted to 
the stable set problem either; in fact, it can be applied to any 0-1 optimization 
problem. Interesting applications of a related method to the max-cut problem 
were given by Delorme and Poljak (1990) (see also Mohar and Poljak 
1990). 

Both of these semidefinite relaxations can be combined with randomized 
rounding methods (section 8). This results in interesting approximation algorithms 
for the max-cut problem (Goemans and Williamson 1994) and for chromatic 
number (Karger et al. 1994). 

These methods can be extended from quadratic to higher-order inequalities. 
For these extensions, see Lovász and Schrijver (1990) and Sherali and Adams 
(1990). 


Computational complexity theory attempts to understand the power of computation 
by providing insight into the question as to why certain computational problems 
appear to be more difficult than others. Computation has added a dimension 
to the study of combinatorics. The theorem that, given a matching in a graph, there 
exists a larger matching if and only if there is an augmenting path, is not the complete 
answer; is it possible to efficiently construct a larger matching if one exists? 
Although such an algorithm is known for the matching problem, this is not the 
case for many combinatorial problems. Indeed, the greatest challenge confronting 
complexity theory is to provide techniques to prove that no efficient algorithm 
exists for a given problem. 

Computational complexity theory provides the mathematical framework in 
which to discuss these questions, and while substantial progress has been made 
towards distinguishing the difficulty of computational problems, most of the basic 
issues remain unresolved. In this chapter, we will describe the fundamentals of 
this theory and give a brief survey of the results that have been obtained in its 
first quarter-century. For a more detailed and complete exposition, the reader is 
referred to the textbooks by Garey and Johnson (1979) and Hopcroft and Ullman 
(1979) as well as to the more recent Handbook of Theoretical Computer Science 
edited by van Leeuwen (1990). 


1. Complexity of computational problems 


In this section, we will outline the essential machinery used to give formal meaning 
to the complexity of computational problems. This involves describing what 
precisely is meant by a computational problem, setting up a mathematical model 
of computation, and then formalizing the notion of the computational resources 
required for a problem with respect to that model. Unfortunately, there is no one 
standardized specification under which to discuss these questions. For this theory 
to produce meaningful results, it is essential that the definitions be robust enough 
that theorems proved with respect to them apply equally to all reasonable variants 
of this framework. Indeed, the particular definitions that we will rely on will be 
accompanied by evidence that these notions are sufficiently flexible. 


1.1. Computational problems 


Computation can be thought of as finding a suitable output for a given input. 
Therefore, a computational problem is specified by a relation between inputs and 
outputs; an algorithm to solve the problem takes an acceptable input, called an 
instance, and computes an output that satisfies the input-output relation; for 
example, given a directed graph, output a hamiltonian circuit, if there is one, and 
otherwise indicate that none exists. This framework is very general, and we will 
focus attention on certain sorts of inputs and outputs. 

The most common type of input (or output) is a string of characters over a 
finite alphabet. The previous example can be cast in this setting, since it is easy to 
represent a graph of order $n$ as such a string of 0's and 1's; one might concatenate 
the rows of the $n \times n$ matrix $A = (a_{ij})$ where $a_{ij} = 1$ if and only if $(i, j)$ is an arc. 
Alternatively, one might list the names of the nodes incident from node $i$, for 
$i = 1, \ldots, n$, where the node named $j$ is encoded by the binary representation of $j$. 
Observe that the second representation will be more compact if the graph has few 
arcs, but this difference is limited, in that the length of the input in one format is 
at most the square of the length in the other. 

Some combinatorial structures cannot be compactly represented as a string; for 
example, a matroid on $n$ elements is most naturally represented by a string of length 
$2^n$, where each character indicates whether a particular subset is independent. In 
such cases, it is customary to use an oracle to specify the input; the algorithm may 
write down queries, such as a particular subset to be tested for independence, and 
the oracle's answer may be used in the next step of the algorithm. Sometimes, 
there is no natural finite representation of the input, such as in the problem of 
optimizing over a convex body. For this example, one standard way to give the 
input is via an oracle that decides whether a given point is in the body, and if not, 
outputs a separating hyperplane. For any oracle, there is an associated parameter 
which is used as a measure of the size of the input. 

In analyzing algorithms, we often view numbers as atomic units, without regard 
for their lengths, and so an input can also be a list of numbers, where the size 
of the input is the number of elements in the list. However, for the remainder of 
this chapter we will focus on inputs that are strings, although it is straightforward 
to extend the discussion of computational models and complexity to include these 
alternatives. 

An important special case of a computational problem is a decision problem, 
where the output is restricted to either “yes” or “no”, and for each input, there is 
exactly one related output. The set L of “yes” instances for a decision problem is 
often called the language associated with this problem. A decision version of the 
previous example is: given a directed graph, does it contain a hamiltonian circuit? 
This type of problem will play a central role in our discussion, and it is important 
to realize that it is not a significant limitation to focus on it. For example, it is 
possible to answer the first search version of the hamiltonian circuit problem as 
follows: for each arc in the graph, delete the arc, decide if the resulting graph is still 
hamiltonian and replace the arc only in the case that the answer is “no”. At the 
end of this procedure, the remaining graph is a circuit. Thus, we have shown that 
finding a circuit is not much harder than the decision problem, and this relationship 
remains true for all well-formulated decision versions of search problems. 
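
The arc-deletion procedure just described can be sketched as follows. The decision procedure used here is a brute-force stand-in (exponential time) so that the example runs on its own; the function names are hypothetical, and any other decision procedure with the same interface could be passed in instead.

```python
from itertools import permutations


def has_hamiltonian_circuit(n, arcs):
    """Brute-force decision procedure for a digraph on nodes 0..n-1."""
    arc_set = set(arcs)
    if n == 0:
        return False
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        if all((tour[i], tour[(i + 1) % n]) in arc_set for i in range(n)):
            return True
    return False


def find_hamiltonian_circuit(n, arcs, decide=has_hamiltonian_circuit):
    """Search via decision: returns the arcs of a hamiltonian circuit, or None."""
    arcs = list(arcs)
    if not decide(n, arcs):
        return None
    kept = list(arcs)
    for arc in arcs:
        trial = [a for a in kept if a != arc]
        if decide(n, trial):
            kept = trial        # the arc is not needed, leave it deleted
        # otherwise the answer was "no" and the arc stays in the graph
    return kept
```

The search makes one call to the decision procedure per arc, plus one initial call, so its running time exceeds that of the decision procedure only by a factor linear in the number of arcs.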

Another important type of computational problem is an optimization problem; 
for example, given a directed graph $G$ and two nodes $s$ and $t$, we may wish to 
find the shortest path from $s$ to $t$ (or merely find the length). We shall in fact treat 
these as decision problems by adding a bound $b$ to the instance, and, for example, 
asking whether $G$ has a path from $s$ to $t$ of length at most $b$. By using a binary 
search procedure that iteratively halves the range of possible optimal values, we 
see that an optimization problem can be solved with only somewhat more work 
than the corresponding decision problem. 
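
The binary search can be sketched as follows, assuming nonnegative integer arc lengths and a known upper bound on the optimum (for instance the sum of all arc lengths). The oracle built by `make_oracle` is a hypothetical brute-force stand-in for a decision procedure; only `optimum_via_decision` reflects the reduction itself.

```python
def optimum_via_decision(decide, upper):
    """Smallest b in 0..upper with decide(b) true, or None if decide(upper) is false.

    decide must be monotone: once true for some b, it is true for all larger b.
    """
    if not decide(upper):
        return None
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(mid):
            hi = mid            # a solution of value at most mid exists
        else:
            lo = mid + 1        # every solution has value greater than mid
    return lo


def make_oracle(arcs, s, t):
    """Decision oracle: is there a path from s to t of length at most b?

    arcs: dict {(u, v): nonnegative integer length}; exhaustive search over simple paths.
    """
    def has_path_at_most(b):
        stack = [(s, 0, frozenset([s]))]
        while stack:
            u, dist, seen = stack.pop()
            if dist > b:
                continue
            if u == t:
                return True
            for (x, y), length in arcs.items():
                if x == u and y not in seen:
                    stack.append((y, dist + length, seen | {y}))
        return False
    return has_path_at_most
```

The number of oracle calls is logarithmic in `upper`, that is, linear in the number of bits needed to write down the bound.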
Throughout this chapter, we will be dealing with computational problems involving 
a variety of structures, and we will not be specifying the nature of the 
encodings used. We will operate on the premise that any reasonable encoding 
produces strings of length that can be bounded by a polynomial of the length produced 
by any other encoding. When encoding numbers (which we will assume to 
be integral) there is an important distinction between the binary representation, 
which we will typically use, and the unary representation (e.g., representing 5 as 
11111). Notice that the latter could be exponentially bigger than the former, and 
thus size is a deceptive measure, since it makes instances of the problem larger 
than they need be. As we survey the complexity of computational problems that 
involve numbers, we will see that some are sensitive to this choice of encodings, 
whereas others are less affected. 


1.2. Models of computation and computability 


We next turn our attention to defining a mathematical model of a computer. In fact, 
we will present three different models, and although their superficial characteristics 
make them appear quite different, they will turn out to be formally equivalent. The 
first of these is the Turing machine, which is an extremely primitive model, and as 
a result, it is easier to prove results about what cannot be computed within this 
model. On the other hand, its extreme simplicity makes it ill-suited for algorithm 
design. As a result, it will be convenient to have an alternative model, the random 
access machine (RAM), within which to discuss algorithms. 

The name “Turing machine” is a slight misnomer, since a Turing machine is a 
mathematical formulation of an algorithm, rather than a machine. A Turing machine
M = (Q, Γ, δ, q_0, A) is a machine that has a finite main memory represented 
by a finite set of states Q, a read-only input tape, and a finite set of work tapes, each of 
which contains a countably infinite number of cells (corresponding to the integers), 
each storing a character from a finite alphabet Γ, which at least includes the input 
alphabet {0,1} and a blank symbol B. For each of the k work tapes and the input 
tape, there is a “head” that can read one cell of the tape at a given time, and will 
be able to move cell-by-cell across the tape as the computation proceeds. Through- 
out the computation, the heads will read the contents of cells, and depending on 
what was read and the current state of the main memory, rewrite the cells and 
then move each head by one cell, either left or right, as well as cause a change in 
the state of the main memory. The basic step of a Turing machine, a transition, is 
modeled by a partial function 


δ : Q × Γ^{k+1} → Q × Γ^k × {L, R}^{k+1}


that selects the new state, the contents of the cells currently scanned on the work 
tapes, and indicates the direction in which each head moves one cell as a deterministic
function of the current state and the contents of the input- and work-tape 
cells currently being read. One may view the transition function as the program 
hardwired into this primitive machine. The computation is begun in state q_0, with 
the input head at the left end of the input, and all of the work tapes and the rest 
of the input tape blank. The machine halts if δ is undefined for the current state 
and the symbols read. An input is accepted if the machine halts in a state in the accepting 
state set A ⊆ Q. A Turing machine M solves a decision problem L if L is the set 
of inputs accepted by M, and M halts on every input; such a language L is said 
to be decidable. This definition of a Turing machine is similar to the one given 
by Turing (1936), and the reader should note that many equivalent definitions are 
possible. 
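
To make the definition concrete, the following is a minimal simulator sketch for a deterministic Turing machine with a read-only input tape and k work tapes. The dictionary encoding of the transition function, the step limit, and the example parity machine (which uses k = 0 work tapes) are all our own illustrative assumptions.

```python
from collections import defaultdict

BLANK = "B"


def run_turing_machine(delta, start_state, accepting, input_string,
                       num_work_tapes, max_steps=10_000):
    """Simulate a deterministic Turing machine with a read-only input tape.

    delta maps (state, symbols_read) to (new_state, symbols_written, moves):
    symbols_read has one entry per tape (input tape first), symbols_written
    has one entry per work tape, and moves has one entry ("L" or "R") per tape.
    The machine halts when delta is undefined for the current configuration.
    Returns True or False (accept or reject), or None if max_steps is exceeded.
    """
    input_tape = defaultdict(lambda: BLANK, enumerate(input_string))
    work_tapes = [defaultdict(lambda: BLANK) for _ in range(num_work_tapes)]
    heads = [0] * (num_work_tapes + 1)  # heads[0] is the input head
    state = start_state
    for _ in range(max_steps):
        read = (input_tape[heads[0]],) + tuple(
            tape[heads[i + 1]] for i, tape in enumerate(work_tapes))
        if (state, read) not in delta:
            return state in accepting  # halted: accept iff state is in A
        state, written, moves = delta[(state, read)]
        for i, tape in enumerate(work_tapes):
            tape[heads[i + 1]] = written[i]
        for i, move in enumerate(moves):
            heads[i] += 1 if move == "R" else -1
    return None


# Hypothetical example: accept binary strings with an even number of 1s.
PARITY_DELTA = {
    ("even", ("0",)): ("even", (), ("R",)),
    ("even", ("1",)): ("odd", (), ("R",)),
    ("odd", ("0",)): ("odd", (), ("R",)),
    ("odd", ("1",)): ("even", (), ("R",)),
}
```

The example machine halts when its input head reaches the first blank, since no transition is defined there, and it accepts exactly when it is then in the state "even".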

We have defined a Turing machine so that it can only solve decision problems, 
but this definition can be easily extended to arbitrary computational problems by, 
for example, adding a write-only output tape, on which to print the output before 
halting. Although this appears to be a very primitive form of a computer, it has 
become routine to accept the following proposition. 


Church’s thesis: Any function computed by an effective procedure can be computed 
by a Turing machine. 


Although this is a thesis, in the sense that any attempt to characterize the inex- 
plicit notion of effective procedure would destroy its intent, it is supported by a 
host of theorems, since for any known characterization of computable functions, it 
has been shown that these are Turing computable. 

A random access machine (RAM) is a model of computation that is well-suited to 
specifying algorithms, since it uses an idealized, simplified programming language 
that closely resembles the assembly language of any modern-day digital computer. 
There is an infinite number of memory cells indexed by the integers, and there 
is no bound on the size of an integer that can be stored in any cell. A program 
can directly specify a cell to be read from or written in, without moving a head 
into position. Furthermore, there is an indirect addressing option, which uses the 
contents of a cell as an index to another cell that is then (seemingly randomly) 
accessed. All basic arithmetic operations can be performed. For further details on 
RAMs, the reader is referred to Aho, Hopcroft and Ullman (1974). 

Another model of computation closely tied to a practical setting is the logical 
circuit model. The simplicity of the circuit model makes it extremely attractive 
for proving lower bounds on the computational resources needed for particular 
functions, and research along these lines will be discussed in depth in chapter 
40. A circuit may be thought of as a directed acyclic graph, where the nodes of 
indegree 0 are the Boolean input gates (which can assume value 0 or 1), and the 
remainder correspond to functional gates of the circuit and are labeled with an 
operation, such as the logical or, negation, or logical and operations. The nodes 
of outdegree 0 are the outputs. A given circuit only handles inputs of a particular 
size, and so we specify a circuit for each input length. We say that the family of 
circuits solves a computational problem if for each input the corresponding circuit 
in the family generates a related output. Note that this model differs from the 
previous two in that the circuit for inputs of length n can be tailored nonuniformly 
to the particular value of n, whereas a Turing machine or RAM must run on 
inputs uniformly, independent of their length. We can make the notion of a family 
of circuits equivalent to the other models of computation by insisting that there 
be a Turing machine that, on input n, computes the description of the circuit for 
inputs of length n. 
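
As an illustration of a uniformly generated family, the sketch below evaluates a circuit given as a directed acyclic graph and generates, for each n, a circuit computing the parity of n input bits. The gate-list encoding and the names evaluate_circuit and parity_circuit are our own assumptions.

```python
def evaluate_circuit(gates, outputs, inputs):
    """Evaluate a Boolean circuit given as a directed acyclic graph.

    gates: list of (name, operation, predecessor_names) in topological order,
           where operation is "and", "or" or "not".
    outputs: names of the nodes whose values are reported.
    inputs: dict mapping each input-gate name (indegree 0) to 0 or 1.
    """
    value = dict(inputs)
    for name, operation, preds in gates:
        args = [value[p] for p in preds]
        if operation == "and":
            value[name] = int(all(args))
        elif operation == "or":
            value[name] = int(any(args))
        elif operation == "not":
            value[name] = 1 - args[0]
        else:
            raise ValueError(f"unknown operation: {operation}")
    return [value[name] for name in outputs]


def parity_circuit(n):
    """Return (gates, outputs) computing the parity of x0, ..., x{n-1}, n >= 1."""
    gates = []
    acc = "x0"
    for i in range(1, n):
        x = f"x{i}"
        # acc XOR x, built from the and, or and not operations only
        gates += [
            (f"na{i}", "not", [acc]),
            (f"nx{i}", "not", [x]),
            (f"l{i}", "and", [acc, f"nx{i}"]),
            (f"r{i}", "and", [f"na{i}", x]),
            (f"p{i}", "or", [f"l{i}", f"r{i}"]),
        ]
        acc = f"p{i}"
    return gates, [acc]
```

Here parity_circuit plays the role of the machine that, on input n, computes the description of the circuit for inputs of length n.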

In spite of the apparent differences in these three models, any particular choice 
is one of convenience, and not of substance. 


Theorem 1.1. The following classes of problems are identical: 

• the class of computational problems solvable by a Turing machine; 

• the class of computational problems solvable by a RAM; 

• the class of computational problems solvable by a family of circuits that can be 
generated by a Turing machine. 


Theorem 1.1 is complemented by the following theorem, which is a consequence 
of the fact that there are an uncountable number of decision problems, but only a 
countable number of Turing machines. 


Theorem 1.2. Not all decision problems are solvable by a Turing machine. 


The understanding of this inherent limitation on the power of computation was 
an outgrowth of results in mathematical logic. In particular, the first incompleteness 
theorem of Gödel (1931) contained the first sort of undecidability result and provided
many of the essential ideas that would be used by Church, Post and Turing 
in their groundbreaking work on the nature of computation. In particular, Turing 
(1936) proposed what we call a Turing machine, and this enabled the discussion 
to be conveniently directed towards computation. 

At the core of all of these results is Gödel’s notion of encoding theorems as 
strings in some uniform way. Analogously, a Turing machine can be encoded as a 
string by first specifying the number of states that the machine has, followed by 
a list of all of the allowed transitions. Each such string can also be interpreted as 
an integer by using a binary encoding. Thus, each integer i represents a Turing 
machine M_i. Turing showed that the language L_hp = {i | M_i halts on input i} is not 
solvable by a Turing machine. His proof that this halting problem is undecidable 
uses the following diagonalization argument. Suppose that L_hp were solvable by 
a Turing machine M. Build another machine M' that first uses M to decide if 
the input i ∈ L_hp; if it is, then M' enters an infinite loop, and if not, M' halts. Of 
course, M' must be M_k for some integer k. But M' and M_k cannot accept the same 
language, since M' halts on k if M_k does not, and vice versa. 

A problem L_1 (many-one) reduces to a problem L_2 if there exists a function f 
computable by a Turing machine such that x ∈ L_1 if and only if f(x) ∈ L_2. Note 
that if L_1 is undecidable and L_1 reduces to L_2, then L_2 must also be undecidable. 
This provides a strategy for proving additional undecidable problems. As a simple 
example, consider the language L_e = {i | M_i accepts on an empty input}. It is a 
simple exercise to convert the description of a given Turing machine M_i to the 
description of another (rather trivial) machine M' = M_j that, on every input, first 
runs M_i with input i and accepts if M_i halts on i. Clearly, M_j accepts an empty 
input if and only if M_i halts on i, so the map from i to j reduces the halting
problem to L_e. 

A similar strategy can be used to prove Gödel’s undecidability theorem in the 
context of Turing computability, although the details of the reduction are more 
involved. The theory of arithmetic for non-negative integers with addition and 
multiplication can be defined as follows. Consider first-order formulas that can be 
constructed from variables and the constants 0 and 1 with the logical connectives 
¬, ∨, ∧, ⇒, ∃, and ∀, along with the operations · and +, and the binary relations 
= and <. A sentence is a formula in which all of the variables are bound. We 
consider the standard model of number theory (as defined by the Peano axioms). 
A sentence is provable if it can be deduced from these axioms. The theory of 
arithmetic L_a = (Z_+, +, ·, =, <, 0, 1) is the collection of provable sentences. A relatively
straightforward construction shows that L_e, the empty-input language above, reduces to L_a, which yields the 
following fundamental result. 


Theorem 1.3. L_a is undecidable. 


This theorem implies the incompleteness of this model of number theory, i.e., 
there is a sentence such that neither it nor its negation is provable. Every complete 
model is decidable, since a Turing machine can generate all possible deductions 
and stop if either the statement or its negation is proved. 

These results were only the first steps in a rich area of research that can be 
viewed as the ancestor of modern-day complexity theory. One of its pinnacles of 
achievement is the solution of Hilbert’s tenth problem, which asked for a procedure 
to decide if a given multivariate polynomial has an integer-valued root. Culminating 
years of progress in this area, Matijasevič proved that this problem is undecidable. 
[For an introduction to this history and the many results on which this proof builds, 
the reader is referred to the survey of Davis (1977).] 

An important generalization of a Turing machine that will play a fundamental 
role in complexity theory is that of a nondeterministic Turing machine. Here, the 
transition function δ is no longer a function, but rather a relation, in that at each 
step there is a finite set of possible next moves of which exactly one is made. 
The notion of acceptance by a nondeterministic Turing machine is central to its 
definition: an input is accepted if there exists a sequence of transitions of δ that 
causes the machine to halt in an accepting state. A nondeterministic Turing machine 
M solves a decision problem L if L is the set of inputs accepted by M, and for 
every input M halts on each sequence of transitions. An equivalent formulation is 
to think of a nondeterministic Turing machine as a deterministic Turing machine 
with an additional guess tape, that is a read-only tape, where the head only moves 
to the right. The contents of the guess tape are magically constructed and presented 
to the machine as it begins the computation. For simplicity, we shall assume that 
the machine makes the same number of transitions for any guess. The set of inputs 
in L is the set of inputs for which there is a guess that allows the machine to halt in 
an accepting state. Note that a nondeterministic Turing machine accepting L can 
be converted into a deterministic machine for L by trying all of the (exponentially 
many) guesses. We can view a particular computation as proving (or disproving) 
the theorem “x ∈ L”; in this context, the difference between determinism and 
nondeterminism is analogous to the difference between proving a theorem and 
verifying its proof. 

Consider again the hamiltonian circuit problem. A nondeterministic Turing machine
for it might be constructed by letting the guess tape encode a sequence of n 
nodes of the graph. The Turing machine simply verifies that every node appears
exactly once in the guessed sequence and that there is an arc between 
each consecutive pair of nodes in the sequence, as well as between the 
first and last nodes. If the graph is hamiltonian, then there is a correct guess, but 
otherwise, for any sequence of the n nodes there will be some pair that is not adjacent. 
The correct guess, in essence the hamiltonian circuit, is a certificate that the graph 
is hamiltonian. Observe that this definition is one-sided, since the requirements for 
instances in L and not in L are quite different. 
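
The verification step, and the deterministic simulation that tries all guesses, can be sketched as follows. The graph representation (nodes 0, ..., n-1 with a set of unordered pairs) and both function names are our own assumptions, and we take n ≥ 3.

```python
from itertools import permutations


def verify_hamiltonian_certificate(n, arcs, guess):
    """Check a guessed node sequence against an undirected graph on 0..n-1.

    arcs is a set of frozenset({u, v}) pairs. The guess is accepted only if
    it lists every node exactly once and consecutive nodes (cyclically,
    including last and first) are adjacent.
    """
    if sorted(guess) != list(range(n)):
        return False
    return all(
        frozenset((guess[i], guess[(i + 1) % n])) in arcs for i in range(n))


def is_hamiltonian_by_exhaustion(n, arcs):
    """Deterministic simulation: try all of the (exponentially many) guesses."""
    return any(
        verify_hamiltonian_certificate(n, arcs, list(p))
        for p in permutations(range(n)))


def make_arcs(pairs):
    """Helper: build the arc set from a list of (u, v) pairs."""
    return {frozenset(pair) for pair in pairs}
```

Checking one guess takes time polynomial in n, whereas the exhaustive procedure examines up to n! sequences.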


1.3. Computational resources and complexity classes 


Now that we have mathematical formulations of both problems and machines, we 
can describe what is meant by the computational resources required to solve a 
certain problem. 

In considering the execution of a deterministic Turing machine, it is clear that 
the number of transitions before halting corresponds to the running time of the 
machine on a particular input. In discussing the running time of an algorithm, 
within any of the models, we do not want to simply speak of particular instances, 
and so we must make some characterization of the running time for all instances. 
The criterion that we will focus on for most of this chapter is the worst-case running 
time as a function of the input size. When we say that a Turing machine takes n^2 
steps, this means that for any input of size n, the Turing machine always halts 
within n^2 transitions. Unless otherwise specified, we will let n = |x| denote the 
length of the input string x. 

We will not be interested in the precise count of the number of transitions, but 
rather in the order of the running time. A function f(n) is O(g(n)) if there are 
constants N and c such that for all n ≥ N, f(n) ≤ c g(n). Thus, rather than say that 
a Turing machine has worst-case running time 3n^2 + 5n − 17, we say simply that it 
is O(n^2). This simplification makes it possible to discuss complicated algorithms 
without being overwhelmed by details. Furthermore, for any Turing machine with 
superlinear running time and any constant c, there exists another Turing machine 
to solve the same problem that runs c times faster than the original, that can be 
constructed by using an expanded work-tape alphabet. 

We will also use a notation analogous to O(·) to indicate lower bounds. A 
function f(n) is Ω(g(n)) if there are constants N and c > 0 such that for all n ≥ N, 
f(n) ≥ c g(n). A function f(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)). 
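
As a worked instance of these definitions, consider the running time 3n^2 + 5n − 17 mentioned earlier. The constants below are one possible choice, not the only one.

```latex
\begin{align*}
3n^2 + 5n - 17 &\le 3n^2 + 5n^2 = 8n^2
  && \text{for all } n \ge 1 \quad (c = 8,\ N = 1),\\
3n^2 + 5n - 17 &\ge 3n^2
  && \text{for all } n \ge 4 \quad (c = 3,\ N = 4,\ \text{since } 5n \ge 17).
\end{align*}
```

The first line shows that the function is O(n^2), the second that it is Ω(n^2), and together they show that it is Θ(n^2).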

For a nondeterministic Turing machine, we define the running time to be the 
number of transitions in any computation path generated by the input. (Recall that 
we added the restriction that all computation paths must have the same length.) 
For a circuit, we will want to characterize the number of operations performed as 
a measure of time, so that the relevant parameter is the size of the circuit, which 
is the order of the graph representing it. For a RAM, there are two standard 
ways in which to count the running time on a particular input. In each operation, 
a RAM can, for example, add two numbers of unbounded length. In the unit- 
cost model, this action takes one step, independent of the lengths of the numbers 
being added. In the log-cost model, this operation takes time proportional to the 
lengths of the numbers added. These measures can have radically different values. 
Take 17, and repeatedly square it k times. With each squaring, the number of bits 
in the binary representation essentially doubles. Thus, although we have taken 
only k steps in the unit-cost model, the time required according to the log-cost 
model is exponential in k. This may seem artificial, but this problem can occur, for 
example, in Gaussian elimination, if it is not implemented carefully. Technically, 
we will use the log-cost model, in order to ensure that the RAM is equivalent to 
the Turing machine in the desired ways. But, when speaking of the running time 
of an algorithm, it is traditional to state the running time in the unit-cost model, 
since for all standard algorithms one can prove that the pathological behavior of 
the above example does not come into play. 
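
The repeated-squaring example is easy to reproduce. The sketch below records the length of the binary representation after each squaring; the function name is our own choice.

```python
def bit_lengths_under_repeated_squaring(start, k):
    """Bit lengths of start, start**2, start**4, ..., after k squarings."""
    lengths = [start.bit_length()]
    value = start
    for _ in range(k):
        value = value * value  # a single step in the unit-cost model
        lengths.append(value.bit_length())
    return lengths


print(bit_lengths_under_repeated_squaring(17, 3))  # [5, 9, 17, 33]
```

Each squaring is one unit-cost step, yet the operand length roughly doubles every time, so the log-cost of the k-th squaring is proportional to 2^k.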

Time is not the only computational resource in which we will be interested; we 
will see that the space complexity of certain combinatorial problems gives more 
insight into the structure of these problems. In considering the space requirements, 
we will focus on the Turing machine model, and will only count the number of cells 
used on the work tapes of the machine. Furthermore, we will again be interested in 
the asymptotic worst-case analysis of the space used. As was true for time bounds, 
the space used by a Turing machine can be compressed by any constant factor. The 
space used by a nondeterministic Turing machine is the maximum space used on 
any computation path. For circuits, it will also be interesting to study their depth, 
the longest path from a node of indegree 0 to one of outdegree 0. 

The notion of the complexity of a problem is the order of a given computational 
resource, such as time, that is necessary and sufficient to solve the problem. Con- 
sider the following directed reachability problem: given a directed graph G, and 
two specified nodes s and t, does there exist a path from s to t? When we say 
that the complexity of the directed reachability problem for a graph with m arcs 
is Θ(m), this means that there is a (unit-cost RAM) algorithm that has worst-case 
running time O(m) and there is no algorithm with running time of lower order. 
Tight results of this kind are extremely rare, since the tremendous progress in 
the design of efficient algorithms has not been matched, or even approached, by 
the slow progress in techniques for proving lower bounds on the complexity of 
these problems in general models of computation. For example, consider the 3- 
colorability problem: given an undirected graph G with m edges, can the nodes 
be colored with three colors so that no two adjacent nodes are given the same 
color; i.e., is χ(G) ≤ 3? The best lower bound is only Ω(m), in spite of substantial 
evidence that it cannot be solved in time bounded by a polynomial. 
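
For the directed reachability problem, one standard way to obtain the linear upper bound is breadth-first search. The sketch below assumes nodes are numbered 0, ..., n-1 and arcs are given as ordered pairs; it runs in O(n + m) unit-cost steps, where the additive n accounts for initialization.

```python
from collections import deque


def reachable(num_nodes, arcs, s, t):
    """Decide whether there is a directed path from s to t."""
    adjacency = [[] for _ in range(num_nodes)]
    for u, v in arcs:
        adjacency[u].append(v)
    seen = [False] * num_nodes
    seen[s] = True
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in adjacency[u]:  # each arc is examined at most once
            if not seen[v]:
                seen[v] = True
                queue.append(v)
    return False
```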

In order to study the relative power of particular computational resources, we 
introduce the notion of a complexity class, which is the set of problems that have 
a specified upper bound on their complexity. It will be convenient to define the 
complexity class DTIME(T(n)) to be the set of all languages L that can be recognized
by a deterministic Turing machine within time O(T(n)). NTIME(T(n)) 
denotes the analogous class of languages for nondeterministic Turing machines. 
Throughout this chapter, it will be convenient to make certain assumptions about 
the sorts of time bounds that define complexity classes. A function T(n) is called 
fully time-constructible if there exists a Turing machine that halts after exactly T(n) 
steps on any input of length n. All common time bounds, such as n log n or n^2, are 
fully time-constructible. We will implicitly assume that any function T(n) used to 
define a time-complexity class is fully time-constructible. 

The single most important complexity class is P, the class of decision problems 
solvable in polynomial time. Two of the best-known algorithms, the Euclidean 
algorithm for finding the greatest common divisor of two integers and Gaus- 
sian elimination for solving a system of linear equations, are classical examples 
of polynomial-time algorithms. In fact, Lamé observed as early as 1844 that the 
Euclidean algorithm was a polynomial-time algorithm. In 1953, von Neumann contrasted
the running time for an algorithm for the assignment problem that “turn[ed] 
out [to be] a moderate power of n, i.e., considerably smaller than the ‘obvious’ 
estimate n!” for a complete enumeration of the solutions. Edmonds (1965) and 
Cobham (1965) were the first to introduce P as an important complexity class, 
and it was through the pioneering work of Edmonds that polynomial solvability 
became recognized as a theoretical model of efficiency. With only a few excep- 
tions, the discovery of a polynomial-time algorithm has proved to be an important 
first step in the direction of finding truly efficient algorithms. Polynomial time has 
proved to be very fruitful as a theoretical model of efficiency both in yielding a 
deep and interesting theory of algorithms and in designing efficient algorithms. 

There has been substantial work over the last 25 years in finding polynomial- 
time algorithms for combinatorial problems. It is a testament to the importance 
of this development that much of this Handbook is devoted to discussing these 
algorithms. This work includes algorithms for graph connectivity and network flow 
(see chapter 2), for graph matchings (see chapter 3), for matroid problems (see 
chapter 11), for point lattice problems (see chapter 19), for testing isomorphism 
(see chapter 27), for finding disjoint paths in graphs (see chapter 5) as well as for 
problems connected with linear programming (see chapters 28 and 30). 

Another, more technical reason for the acceptance of P as the theoretical notion
of efficiency, is its mathematical robustness. Recall the discussion of encodings 
where we remarked that any reasonable encoding will have length bounded by a 
polynomial in the length of another. As a result, any polynomial-time algorithm 
which expects its input in one form can be converted to a polynomial-time al- 
gorithm for the other. In particular, note that the previous discussion of the two 
different encodings of a graph can be swept aside and we can assume that the size 
of the input for a graph of order n is n. Notice further that the informal definition 
of P given above does not rely on any model of (deterministic) computation. One 
justification for this statement is the following theorem. 


Theorem 1.4. The following classes of problems are identical: 


• the class of computational problems solvable by a Turing machine in polynomial 
time; 

• the class of computational problems solvable by a RAM in polynomial time under 
the log-cost measure; 

• the class of computational problems solvable by a family of circuits of polynomial 
size, where the circuit for inputs of size n can be generated by a Turing machine 
with running time bounded by a polynomial in n. 


The importance of the class NP = ∪_k NTIME(n^k) is due to the wide range of 
important problems that are known to be in the class, and yet are not known to 
lie in P. For example, the nondeterministic algorithm given for the hamiltonian 
circuit problem is clearly polynomial, and so this problem lies in NP. However, it 
is not known whether this or any problem is in NP \ P, and this is undoubtedly 
the central question in complexity theory. 


Open Problem. Is P = NP? 


The following reformulation of NP is often useful: L ∈ NP if there exists a 
language L' ∈ P and a polynomial p(n) such that x ∈ L ⇔ ∃y such that |y| = p(|x|) 
and (x, y) ∈ L'. (We will denote this polynomially bounded quantification by ∃_p.) 

For each decision problem L, there is a complementary problem, L̄, such as 
the problem of recognizing non-hamiltonian graphs. For any complexity class S, 
let co-S denote the class of languages whose complement is in S. The definition 
of P is symmetric with respect to membership and non-membership in L, so that 
P = co-P. In this respect NP is very different. In fact, it is unknown whether the 
hamiltonian circuit problem is in co-NP. 


Open Problem. Is NP = co-NP? 


Edmonds (1965) brought attention to the class NP ∩ co-NP, and called problems
in this class well-characterized, since there is a short certificate to show that 
the property holds, as well as a short certificate that it does not. Edmonds was 
working on algorithms for non-bipartite maximum matching at the time, and this 
problem serves as a good example of a problem in this class. If the instance consists 
of a graph G and a bound k and we wish to know if there is a matching of size at 
least k, the matching itself serves as a certificate for an instance in L, whereas an 
odd-set cover serves as a certificate for an instance not in L (see chapter 3). Note 
that there is a min-max theorem characterizing the size of the maximum matching 
that is at the core of the fact that matching is in NP ∩ co-NP, and indeed min-max
theorems often serve this role. As mentioned above, matching is known to 
be in P, and this raises the following question. 


Open Problem. Is P = NP ∩ co-NP? 


We will also be concerned with complexity classes defined by the space complexity
of problems. As for time, let DSPACE(S(n)) and NSPACE(S(n)) denote, 
respectively, the class of languages accepted by deterministic and nondeterministic 
Turing machines within space O(S(n)). We will implicitly assume the following 
condition for all space bounds used to define complexity classes: a function S(n) is 
fully space-constructible if there is a Turing machine that, on any input of length n, 
delimits S(n) tape cells and then halts. Three space complexity classes will receive 
the most prominent attention: 
• L = DSPACE(log n); 
• NL = NSPACE(log n); and 
• PSPACE = ∪_k DSPACE(n^k). 

One might be tempted to add a fourth class, NPSPACE, but we shall see that 
nondeterminism does not add anything in this case. We will see that the chain of 
inclusions 


L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE 


holds, and the main thrust of complexity theory is to understand which of these 
inclusions is proper. At the extremes, a straightforward diagonalization argument 
due to Hartmanis, Lewis and Stearns shows that L ≠ PSPACE, and a result of Savitch 
further implies that NL ≠ PSPACE, but after nearly a quarter century’s more work, 
these are the only sets in this chain known to be distinct. 


1.4. Randomized computation 


In this subsection, we will consider models of computation that exploit the power of 
randomization in the design and analysis of algorithms without making probabilistic 
assumptions about the inputs. A randomized algorithm is an algorithm that can 
flip coins during the computation; that is, we consider a fixed input, and study the 
algorithm’s behavior as a random variable depending only on the coin flips used. 
For the algorithms discussed below, the algorithm is allowed to make mistakes, 
but for each input the probability of error must be very small. 

The best-known randomized algorithm is for testing primality. In the primality 
testing problem, we are given a natural number N, and we wish to decide if it is 
prime. It is not known whether primality testing is in P. The input size of the 
number N is log N, and therefore algorithms that simplistically search for divisors 
of N do not run in polynomial time. Consider Fermat’s theorem: if N is prime then 
a^N is congruent to a modulo N for every integer a. This provides a way to conclude 
that a number is not prime without actually exhibiting a factor. That is, if we find 
an integer a such that a^N is not congruent to a modulo N, denoted a^N ≢ a mod N, 
then we can conclude that N is not prime. (Note that a^N mod N can be computed 
in polynomial time by repeatedly squaring modulo N.) Let such an a be called a 
witness for N’s compositeness. The advantage of this kind of witness, compared 
to exhibiting a divisor, is that if there exists an integer a such that a^N ≢ a mod N, 
then at least half of the integers in the range from 1 to N have this property. 
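
The computation of a^N mod N by repeated squaring, and the resulting witness check, can be sketched as follows. The function names power_mod and is_fermat_witness are our own; Python's built-in pow(a, N, N) performs the same computation.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring.

    The loop runs once per bit of the exponent, so the number of
    multiplications is linear in the length of the exponent.
    """
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1
    return result


def is_fermat_witness(a, n):
    """True if a**n is not congruent to a modulo n, so n is certainly composite."""
    return power_mod(a, n, n) != a % n
```

For example, 2 is a witness for 91 = 7 · 13, whereas no integer is a witness for the prime 97.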

Unfortunately, there are composite numbers, the so-called Carmichael numbers, 
for which no witness exists. If we momentarily forget about 
the existence of these numbers, we get the following algorithm for testing primality: 
given an integer N, choose an integer a in the range 1 to N at random, and check 
if a is a witness for N. If a witness is found, then we know that N is not prime 
(and not even a Carmichael number). On the other hand, if N is not a prime (and 
also not a Carmichael number), then a random a is a witness with probability at 
least one half. Running this test k times with independent random choices, we 
either find a witness or can fairly safely conclude that no witness exists (with error 
probability 2^{-k}). This gives a randomized polynomial-time algorithm to recognize 
the language of all primes and Carmichael numbers. Rabin (1976) and separately 
Solovay and Strassen (1977), by using a somewhat more sophisticated variant of 
Fermat’s theorem, gave randomized polynomial-time algorithms that accept the 
language of all primes. 
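
The simple k-round test described above (not the refined tests of Rabin or of Solovay and Strassen) might be sketched as follows. The function name and the optional rng parameter are our own assumptions; the error bound quoted in the text relies on the witness density stated there.

```python
import random


def probably_prime_or_carmichael(n, k, rng=None):
    """Run k independent rounds of the Fermat test on n >= 2.

    Returns False as soon as a witness is found (n is then certainly
    composite), and True if no witness turns up in k rounds.
    """
    if rng is None:
        rng = random.Random()
    for _ in range(k):
        a = rng.randint(1, n)  # uniform over the range 1 to n
        if pow(a, n, n) != a % n:
            return False
    return True
```

A return value of False is always correct. A return value of True is always produced for primes and for Carmichael numbers such as 561, and only with small probability for other composites.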

The above idea actually gives a polynomial-time algorithm if the extended Riemann
hypothesis is true. Miller has proved that if the extended Riemann hypothesis 
holds, then there exists a witness for N (in more or less the above sense) that is 
at most O((log N)^2). By trying all the integers up to this limit, we would get a 
polynomial-time deterministic algorithm for primality testing. For more details on 
this and other number-theoretic algorithms see the survey of Lenstra and Lenstra 
(1990). 

The formal definition of a randomized Turing machine is similar to the defini- 
tion of a nondeterministic Turing machine in the sense that at every point during 
the computation there could be several different next steps. Randomized Turing 
machines have a read-only randomizing tape similar to the guess tape of the non- 
deterministic Turing machine. We can think of this tape as providing the outcomes 
of the coin flips to be used by the algorithm. We shall assume that for a given input 
length n, the algorithm reads a fixed number of bits, f(n), from the randomizing 
tape. The probability that the randomized Turing machine accepts an input x of 
length n is defined to be the fraction of all possible strings of length f(n) that, when 
used as the initial segment of the randomizing tape, cause the Turing machine to 
accept x. For the randomized Turing machine, we take the simplifying approach 
that the running time for an input is the maximum number of transitions in some 
sequence of allowed transitions (i.e., for some contents of the randomizing tape). 
Unlike the nondeterministic Turing machine, a randomized Turing machine is not 
just a convenient mathematical model; randomized algorithms can be implemented 
in practical settings. 

We define BPP, the class of languages accepted by a randomized polynomial-time
algorithm, as follows. A language L is in BPP if there exists a polynomial-time 
randomized Turing machine that accepts each x ∈ L with probability at least 2/3, and 
rejects each x ∉ L with probability at least 2/3. We can think of the outcomes of the 
computation as follows: if the Turing machine accepts x this means that “x is 
probably in L”, whereas if it rejects, that means that “x is probably not in L”. 
Note that the choice of the number 2/3 in the definition was rather arbitrary: if we 
run the algorithm k times independently, and take the majority decision, we can 
decrease the probability of error exponentially in k. If k is fairly large, then one 
can accept the answer given by the algorithm without any reasonable shadow of 
doubt. (The letters BPP stand for a Probabilistic Polynomial-time algorithm with 
probabilities Bounded away from 1/2.) 
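
The effect of taking the majority decision can be computed exactly. The sketch below assumes, for illustration, that each run errs independently with probability exactly 1/3 and that k is odd; the function name is our own.

```python
from fractions import Fraction
from math import comb


def majority_error(k, single_run_error=Fraction(1, 3)):
    """Exact probability that the majority of k independent runs is wrong.

    Assumes k is odd, so that ties cannot occur. The majority is wrong
    when more than half of the runs are wrong.
    """
    p = Fraction(single_run_error)
    return sum(
        comb(k, j) * p**j * (1 - p) ** (k - j)
        for j in range(k // 2 + 1, k + 1))


for k in (1, 3, 5, 21, 101):
    print(k, float(majority_error(k)))
```

The error is 1/3 for one run, 7/27 for three runs and 17/81 for five, and it continues to shrink geometrically as k grows.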


Other problems not known to be in P for which there is a randomized 
polynomial-time algorithm include computing the square root of an integer x modulo
a prime p, and deciding whether the determinant of a matrix whose entries are 
multivariate polynomials is the zero polynomial. The algorithm for the latter prob- 
lem assigns random values to the variables in the polynomials, and computes the 
resulting determinant (of numbers). If this determinant is non-zero, then certainly 
the determinant of variables is non-zero as a polynomial. On the other hand, if the 
determinant of variables is non-zero, then a random evaluation will yield a non- 
zero determinant with very high probability. This can be used to show that given 
a graph G, a subset of edges F and an integer k, determining whether a graph 
has a perfect matching with exactly k edges in F can be solved by a randomized 
polynomial-time algorithm (see chapter 3). 
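
The random-evaluation test can be sketched on a simpler relative of the 
matching problem just mentioned: deciding whether a bipartite graph has a 
perfect matching at all. Here the matrix has a distinct variable in position 
(i, j) for each edge ij and 0 elsewhere, and its determinant is a non-zero 
polynomial exactly when a perfect matching exists. Working modulo a fixed 
large prime, the number of trials, and all function names are assumptions made 
for this illustration. 

```python
import random

P = (1 << 61) - 1  # a Mersenne prime; all arithmetic is done modulo P


def det_mod_p(matrix, p=P):
    """Determinant of a square integer matrix modulo the prime p."""
    a = [[v % p for v in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            if factor:
                for c in range(col, n):
                    a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p


def probably_has_perfect_matching(n, edges, trials=5, rng=random):
    """Randomized test with one-sided error.

    edges is a list of pairs (i, j) joining left node i to right node j,
    with 0 <= i, j < n. The answer True is always correct; the answer
    False is wrong only if every random evaluation happened to vanish.
    """
    for _ in range(trials):
        m = [[0] * n for _ in range(n)]
        for i, j in edges:
            m[i][j] = rng.randrange(1, P)  # random value for the variable x_ij
        if det_mod_p(m) != 0:
            return True
    return False
```

For example, probably_has_perfect_matching(3, [(0, 0), (0, 1), (1, 1), (2, 2)]) 
evaluates a determinant whose polynomial contains the term for the matching 
{00, 11, 22}, so a random evaluation is non-zero except with negligible 
probability. 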

Notice that the randomized primality testing algorithm has a property stronger 
than required by the formal definition. The conclusion that N is not a prime was 
certain; uncertainty arose only in the case of the conclusion “N is probably prime”. 
The complexity class RP is defined to reflect this asymmetry. A language L is in 
RP if there exists a polynomial-time randomized Turing machine RM such that 
each input that RM can accept (along any computation path) is in L and for each 
input x ∈ L, the probability that RM accepts x is at least 1/2. Note again that 
the choice of the number 1/2 is arbitrary. 

The above-mentioned algorithms show that primality testing is in co-RP. There 
are very few problems known to be in BPP but not known to be in RP or co-RP. 
Bach, Miller and Shallit provided the first “natural” examples of problems in 
BPP that are not 
obviously in RP or co-RP. They proved that the set of perfect numbers is in BPP. 
(A natural number N is perfect if the sum of all its natural divisors is 2N; for 
example, 6 is perfect.) 

In some sense, an RP algorithm is more satisfying than a BPP algorithm, since at 
least one of the two conclusions reached can be claimed with certainty. A random- 
ized algorithm that never makes mistakes would be even more desirable. This can 
be defined in the following way: an algorithm to compute a function f(x) is a Las 
Vegas algorithm if, given an input x, this randomized algorithm either correctly 
computes f(x), or it halts without coming to a conclusion, and the probability of 
the latter outcome is less than 1/2 for each input x. It is easy to see that a 
language L is in RP ∩ co-RP if and only if there exists a polynomial-time Las 
Vegas algorithm to decide membership in L. Observe also that if we repeat any 
Las Vegas algorithm until it gives an answer, the resulting algorithm always 
gives the correct 
answer, and for any input, it is expected to run in polynomial time. Extending 
results of Goldwasser and Kilian, Adleman and Huang give a sophisticated Las 
Vegas algorithm for testing primality. 

There is evidence that suggests that randomized polynomial time is not that 
different from P. For example, consider a nonuniform analog of P by considering 
the class of languages accepted by a family of polynomial-size circuits. 
Alternatively, one can make the Turing machine model nonuniform, by allowing 
the machine free access to a prespecified polynomial-length advice string s_n 
when processing any input of length n. This class is denoted P/poly. For any 
language L in BPP, we can assume without loss of generality that there is a 
machine RM that accepts each x ∈ L with probability at least 1 − 2^{−(n+1)} and 
rejects each x ∉ L with probability at least 1 − 2^{−(n+1)}, where n denotes the 
length of x (since taking the majority decision of repeated trials decreases 
the error probability exponentially). However, this implies that the 
probability that a random string used by RM for an input of length n would work 
correctly for all 2^n strings of length n is at least 1/2, since the total 
failure probability is at most 2^n · 2^{−(n+1)} = 1/2. Thus, there must exist 
such a good string to serve as the advice, and we have shown the following 
theorem, which is based on an idea of Adleman. 


Theorem 1.5. BPP ⊆ P/poly. 


2. Shades of intractability 


In this section we will consider many computational problems, and see that the 
universe does not appear to be divided simply into tractable and intractable prob- 
lems. Current evidence suggests that there are a variety of different classes of 
problems, each characterizing its own particular shade of intractability. Much of 
the work in complexity theory is aimed at understanding the correct framework in 
which to place these problems. 

The subsections here reflect three types of approaches for characterizing the 
difficulty of these problems. The nicest sort of result places absolute limits on our 
ability to solve problems; for example, the most severe limit is to show that a 
problem is undecidable, and among decidable problems there are only a handful 
that can be proven intractable, in the sense that they require a certain (non-trivial) 
amount of time or space to be solved. Much more common is to provide a com- 
pleteness result to show that a particular problem is a hardest problem within a 
given complexity class. If the class contains a great number of problems not known 
to be solvable with more modest resources, this provides evidence that the problem 
is intractable. Finally, in order to better understand a problem, it has frequently 
been useful to strengthen the basic Turing machine model in order to define 
complexity classes that better characterize the problem. Such an alternative 
view has often 
made problems appear less intractable; the subsections on the polynomial-time 
hierarchy and on randomized proofs present results in this direction. 


2.1. Evidence of intractability: NP-completeness 


The lack of lower bounds with respect to a general model of computation for either 
space or time complexity has led to the search for other evidence that suggests 
that lower bounds hold. One such type of evidence might be the implication: 
if the hamiltonian circuit problem is in P, then P = NP. Now, NP contains 
a tremendous variety of problems that are not known to be in P, and so by 
proving such a claim, one shows that the hamiltonian circuit problem is a 
hardest problem in NP, in that any polynomial-time algorithm to solve it would 
in fact solve thousands of other problems. 

The principal tool in providing evidence of this form is that of reduction. We 
will say that a problem L_1 polynomial-time reduces to L_2 if there exists a 
polynomial-time computable function f that maps instances of L_1 into instances 
of L_2 such that x is a “yes” instance of L_1 if and only if f(x) is a “yes” 
instance of L_2. We shall denote this by L_1 ∝_p L_2. Notice that if there were 
a polynomial-time algorithm for L_2, we could then obtain a polynomial-time 
algorithm for L_1 by first computing f(x) and then running the assumed 
algorithm on f(x). This composite procedure is polynomial-time, since the 
composition of two polynomials is itself a polynomial. 


Definition. A problem L_1 is NP-complete if 
(i) L_1 ∈ NP; 
(ii) for all L ∈ NP, L ∝_p L_1. 


The composition argument given above yields the following results. 


Theorem 2.1. For any NP-complete problem L, L ∈ P if and only if P = NP. 


Theorem 2.2. If L_1 is NP-complete, L_2 ∈ NP and L_1 ∝_p L_2, then L_2 is 
NP-complete. 


The first result says that any NP-complete problem completely characterizes 
NP in its relationship to P, and that we can focus on any such problem without loss 
of generality in trying to prove that the two classes are different. The hamiltonian 
circuit problem is NP-complete, and in this section we will prove that several 
combinatorial problems are among the plethora having this property. The second 
result gives a strategy for proving that a problem is NP-complete, provided that 
a “first” NP-complete problem is known. Of course, to initiate this strategy, we 
must show that some natural problem has this property, and it was a landmark 
achievement in complexity theory when Cook (1971) showed that the problem 
of deciding the satisfiability of a formula in propositional logic is NP-complete. 
The importance of this result was only fully recognized through the work of Karp 
(1972), whose seminal paper showed 21 well-known combinatorial problems to 
be NP-complete. Independently, Levin (1973) discovered the same approach to 
studying the intractability of computational problems. 

A Boolean formula in conjunctive normal form is the conjunction (and) of 
clauses C_1, ..., C_s, each of which is a disjunction (or) of literals drawn 
from x_1, ¬x_1, ..., x_t, ¬x_t, where each x_i is a Boolean variable and ¬x_i 
denotes its negation. In the satisfiability 
problem (SAT) we are given such a Boolean formula, and asked to decide if there 
exists a truth assignment for the variables such that the formula evaluates to true. 
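
To make the definition concrete, the following minimal sketch evaluates a 
formula under a given truth assignment and searches all assignments by brute 
force. The encoding is an assumption of this sketch, not of the text: the 
literal x_i is written as the integer i and its negation as −i, a clause is a 
list of literals, and a formula is a list of clauses. 

```python
from itertools import product


def evaluate(cnf, assignment):
    """True iff every clause contains a literal made true by the assignment.

    assignment maps each variable index to True or False.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def brute_force_sat(cnf):
    """Return a satisfying assignment, or None if there is none.

    All 2^t assignments of the t variables are tried, so this takes
    exponential time; only the evaluation step is polynomial.
    """
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if evaluate(cnf, assignment):
            return assignment
    return None
```

Evaluating one assignment takes time linear in the length of the formula, 
whereas the search loop is exponential in the number of variables. 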


Theorem 2.3. The satisfiability problem is NP-complete. 


Proof. It is easy to see that the satisfiability problem is in NP, since we can 
interpret the first t cells of the guess tape as providing the assignment, and then it 
is a simple matter to evaluate the formula for that assignment in polynomial time. 

Next, we must show that for any language L ∈ NP there exists a polynomial-time 
function f that maps each instance x of the original problem into a Boolean 
formula such that x ∈ L if and only if f(x) is satisfiable. Before giving the 
reduction, we first argue that the nondeterministic Turing machine M accepting 
L may be assumed to have a form somewhat simpler than the original definition. 
Imagine that the Turing machine has only one tape, which serves both as the 
input tape and as the work tapes. A simple construction shows that if there 
exists a nondeterministic Turing machine M with running time T(n), then there 
exists a nondeterministic machine of this simpler form that finishes within 
time T²(n) (by enlarging the alphabet so that each symbol denotes one character 
on all of the tapes of M). Furthermore, we can assume without loss of 
generality that for some l the machine M runs for exactly time n^l on every 
input of length n. 

Let M be such a simplified machine that runs on input x for T steps before 
halting. We can describe the configuration of the machine at any instant of the 
computation by giving the contents of the tape, the position of the head and the 
current state. This can all be encoded as a string by using the alphabet Γ' = 
Γ × (Q ∪ {!}), where the first coordinate gives the contents of a cell of the tape, 
and the second is ‘!’ unless the head is reading that cell of the tape, when it is 
the current state. We can then encode the entire computation as a matrix, where 
each row of the matrix corresponds to one step of the computation, and each 
column corresponds to a cell of the tape. (Note that there are no more than T 
cells in either direction of the initial head position that could be reached during 
the computation.) Acceptance of x by M boils down to the following question: 
does there exist a guess that causes the matrix to be filled in so that in the last 
row, the configuration contains an accepting state? 

It is straightforward to construct a Boolean formula that represents this question. 
Let g_1, ..., g_T be variables that represent the binary values of the guess tape. Let 
a_ijk represent the contents of the ijth cell of the matrix in the sense that it is 1 if 
and only if it contains the kth character of Γ'. The formula will be a conjunction 
of pieces that correspond to the following conditions that we wish to impose: the 
variables represent some matrix, in that exactly one character is stored in each 
entry; the first row corresponds to the initial configuration for input x; the last 
row contains an accepting state; the computation proceeds in the deterministic 
way specified by the guesses g_i. We will not give each of these in detail, but only 
sketch the main ideas. The first is easy: for each of the O(T²) entries, check that 
at least one of the associated variables is 1 (by their or) and for each pair, check 
that not both are 1 (by the or of their negations). The second and third are equally 
routine. The last condition takes a bit more work and is based on a principle of 
locality: if locally the computation (i.e., the matrix) appears correct, then the entire 
computation was performed correctly. In fact, it is sufficient to check that each 2 × 2 
submatrix appears correct. Furthermore, it is an easy exercise to encode that some 
2 × 2 submatrix in the ith and (i + 1)th rows behaves according to the guess g_i. By 
taking the conjunction of all O(T²) such pieces of local information, we enforce 
that the computation is done correctly. It is now routine to verify that the formula 
constructed is satisfiable if and only if there are guesses that lead M to accept x. 


□ 


Now that we have our initial NP-complete problem, we proceed to give a number 
of reductions to show that several important combinatorial problems are 
NP-complete. Literally thousands of problems are now known to be NP-complete, 
so we will only present a small handful of examples that serve to illustrate an 
important phenomenon in complexity theory, or relate to important combinatorial 
problems discussed elsewhere in this chapter, as well as in the rest of this volume. 
Most of the problems that we consider were shown to be NP-complete in the 
pioneering work of Karp (1972). 

Many restricted cases of the satisfiability problem are also NP-complete. One 
that is often used in further NP-completeness proofs is the 3-SAT problem, where 
each clause of the conjunctive normal form must contain exactly three literals. It 
is a simple task to show that by adding additional variables, longer clauses can be 
broken into clauses of length three, yielding a new formula that can be satisfied if 
and only if the original can. 
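
One way to carry out this clause splitting is sketched below. Literals are 
encoded as non-zero integers (i for x_i, −i for its negation); this encoding, 
the function names, and the choice to pad short clauses by repeating a literal 
are assumptions of the sketch. A long clause (l_1 ∨ l_2 ∨ rest) is replaced by 
(l_1 ∨ l_2 ∨ y) and (¬y ∨ rest) for a new variable y, and the step is repeated 
until every clause has three literals. 

```python
from itertools import product


def to_3sat(cnf):
    """Return an equisatisfiable formula with exactly three literals per clause."""
    next_var = max((abs(lit) for clause in cnf for lit in clause), default=0) + 1
    out = []
    for clause in cnf:
        clause = list(clause)
        if not clause:
            raise ValueError("an empty clause cannot be satisfied")
        # Pad short clauses by repeating a literal (assumes repeats are allowed).
        while len(clause) < 3:
            clause.append(clause[0])
        # Split long clauses with a fresh variable y for each split.
        while len(clause) > 3:
            y = next_var
            next_var += 1
            out.append([clause[0], clause[1], y])
            clause = [-y] + clause[2:]
        out.append(clause)
    return out


def satisfiable(cnf):
    """Brute-force satisfiability check, used only to compare the two formulas."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(any(a[abs(lit)] == (lit > 0) for lit in c) for c in cnf):
            return True
    return False
```

A clause with m > 3 literals yields m − 2 clauses and m − 3 new variables, so 
the new formula is only linearly larger than the original. 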

For the stable set problem, we are given a graph G and a bound k, and asked 
to decide if α(G) ≥ k; that is, do there exist k pairwise non-adjacent nodes in 
G? It is easy to see that this problem is in NP, and to show that it is complete, 
we reduce 3-SAT to it and invoke Theorem 2.2. Given a 3-SAT instance φ, we 
construct a graph G as follows: for each clause in φ, let there be three nodes in G, 
each representing a literal in the clause, and let these three nodes induce a clique 
(i.e., they are pairwise adjacent); complete the construction by making adjacent 
any pair of nodes that represent a literal and its negation, and set k to be the 
number of clauses in φ. If there is a satisfying assignment for φ, pick one literal 
from each clause that is given the assignment true; the corresponding nodes in G 
form a stable set of size k. If there is a stable set of size k, then it must have 
exactly one node in each clique corresponding to a clause. Furthermore, the nodes 
in the stable set cannot correspond to both a literal and its negation, so that we 
can form an assignment by setting to true all literals selected in the stable set, and 
extending this assignment to the remaining variables arbitrarily. This is a satisfying 
assignment. This is a characteristic reduction, in that we build gadgets to represent 
the variables and clause structure within the framework of the new problem. The 
NP-completeness of two other problems follows immediately: the clique problem, 
given a graph G and a bound k, decide if there is a clique in G of size k; and the 
node cover problem, given a graph G and a bound k, decide if there exists a set of 
k nodes such that every edge is incident to a node in the set. A somewhat more 
complicated reduction transforms 3-SAT into the hamiltonian circuit problem to 
show that to be NP-complete. A seemingly slight generalization of bipartite graph 
matching, the 3-dimensional matching problem, can be shown to be NP-complete: 
given disjoint node sets A, B and C, and a collection F of hyperedges of the form 
(a,b,c) where a ∈ A, b ∈ B and c ∈ C, does there exist a subset of F such that each 
node is contained in exactly one edge in the subset? 
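
The reduction from 3-SAT to the stable set problem described above can be 
sketched as follows. Literals are encoded as non-zero integers (i for x_i, −i 
for its negation), and a node is a pair (clause index, position in the 
clause); both conventions, and the brute-force checker used to try the 
construction on small formulas, are assumptions of this sketch. 

```python
from itertools import combinations


def sat_to_stable_set(cnf):
    """Build (nodes, edges, k) from a 3-SAT formula.

    Nodes of the same clause are pairwise adjacent, and so are any two
    nodes that represent a literal and its negation; k is the number of
    clauses.
    """
    nodes = [(ci, pos) for ci, clause in enumerate(cnf) for pos in range(len(clause))]
    literal = {(ci, pos): cnf[ci][pos] for (ci, pos) in nodes}
    edges = set()
    for u, v in combinations(nodes, 2):
        same_clause = u[0] == v[0]
        complementary = literal[u] == -literal[v]
        if same_clause or complementary:
            edges.add((u, v))
    return nodes, edges, len(cnf)


def has_stable_set(nodes, edges, k):
    """Brute force: is there a set of k pairwise non-adjacent nodes?"""
    return any(
        all((u, v) not in edges for u, v in combinations(subset, 2))
        for subset in combinations(nodes, k)
    )
```

The two other problems mentioned follow from the same graph: a stable set of 
size k in G is a clique of size k in the complement of G, and its complement 
in the node set is a node cover of G with |V| − k nodes. 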

If we restrict the stable set problem to a particular constant value of k (e.g., is 
α(G) ≥ 100?), then this problem can be solved in polynomial time by enumerating 
all possible sets of size 100. In contrast to this, Stockmeyer has shown that the 
3-colorability problem is NP-complete, by reducing 3-SAT to it. Let φ be a 3-SAT 
formula. We construct a graph G from it in the following way. First, construct a 
“reference” clique on three nodes, called true, false and undefined; these nodes 
will serve as a way of naming the colors in any 3-coloring of the graph. For each 
variable in φ, construct a pair of adjacent nodes, one representing the variable, 
and one representing its negation, and make them both adjacent to the undefined 
node. For each clause l_1 ∨ l_2 ∨ l_3, construct the subgraph shown in fig. 1, where 
the nodes labeled with literals, as well as false (F) and undefined (U), are the 
nodes already described. It is easy to see that if φ has a satisfying assignment, 
we can get a proper 3-coloring of this graph as follows: color the nodes 
corresponding to literals that are true in the assignment with the same color as 
is given node true; color analogously the nodes for false literals; and then 
extend this coloring to the remaining nodes in a straightforward manner. 
Furthermore, it involves only a little case-checking to see that if the graph is 
3-colorable, then the colors can be interpreted as a satisfying assignment. 

Figure 1. 

The integer programming problem, defined as follows, is NP-complete: given 
an m × n matrix A and an m-vector b, decide if there exists an integer n-vector x 
such that Ax ≤ b. In this case, finding a reduction from 3-SAT is trivial: given a 
formula φ, represent each literal by an integer variable bounded between 0 and 1, 
and for each Boolean variable x, constrain the sum of the variables corresponding 
to x and its negation to be at most 1. The construction is completed by adding a 
constraint for each clause that forces the variables for the literals in the clause to 
sum to at least 1. On the other hand, to show that the problem is in NP requires 
more work, involving a calculation that bounds the length of a “smallest” solution 
satisfying the constraints (if one exists at all). 
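
The reduction from 3-SAT can be written out explicitly as a system Ax ≤ b. In 
the sketch below, literals are encoded as non-zero integers (i for x_i, −i for 
its negation), and the integer variable for the literal x_v is stored in column 
2(v − 1) and the one for its negation in column 2(v − 1) + 1; this layout and 
the function names are assumptions made for illustration. The brute-force 
feasibility check is included only to try the construction on small formulas. 

```python
from itertools import product


def column(lit):
    """Column of the integer variable that represents the literal lit."""
    return 2 * (abs(lit) - 1) + (0 if lit > 0 else 1)


def sat_to_integer_program(cnf, num_vars):
    """Return (A, b) such that cnf is satisfiable iff Ax <= b has an integer solution."""
    n = 2 * num_vars
    A, b = [], []
    for j in range(n):
        lower = [0] * n
        lower[j] = -1
        A.append(lower)
        b.append(0)  # -y_j <= 0, i.e. y_j >= 0
        upper = [0] * n
        upper[j] = 1
        A.append(upper)
        b.append(1)  # y_j <= 1
    for v in range(1, num_vars + 1):
        row = [0] * n
        row[column(v)] = 1
        row[column(-v)] = 1
        A.append(row)
        b.append(1)  # a variable and its negation are not both 1
    for clause in cnf:
        row = [0] * n
        for lit in clause:
            row[column(lit)] -= 1
        A.append(row)
        b.append(-1)  # the literals of the clause sum to at least 1
    return A, b


def feasible_01(A, b):
    """Brute force over 0/1 vectors; the bounds force every solution to be 0/1."""
    n = len(A[0])
    return any(
        all(sum(a * x for a, x in zip(row, xs)) <= rhs for row, rhs in zip(A, b))
        for xs in product((0, 1), repeat=n)
    )
```

The system has 2n columns and 5n + m rows for a formula with n variables and m 
clauses, so the reduction is clearly polynomial. 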

The subset sum problem is also NP-complete: given a set of numbers S and a 
target number t, does there exist a subset of S that sums to t? This is the first 
problem that we have encountered that is a “number problem” in the sense that it 
is not the combinatorial structure, but rather the numbers that make this problem 
hard. If the numbers are given in unary, there is a polynomial-time algorithm 
(such an algorithm is called pseudo-polynomial): keep a (large, but polynomially 
bounded) table of all possible sums obtainable using a subset of the first i numbers; 
this is trivial to do for i = 1, and it is not hard to efficiently find the table for i+1 
given the table for i; the table for i = n gives us the answer. There are “number 
problems” that remain NP-complete, even if the numbers are encoded in unary; 
such problems are called strongly NP-complete. One example is the 3-partition 
problem: given a set S of 3n integers that sum to nB, does there exist a partition 
of S into n sets, T_1, ..., T_n, where each T_i has three elements that sum to B? 
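
The table-based algorithm for subset sum can be sketched in a few lines. The 
sketch assumes that the numbers are non-negative integers and stores the table 
for the first i numbers as a Python set; sums larger than the target are 
discarded because they can never be reduced again. 

```python
def subset_sum(numbers, target):
    """Decide whether some subset of the non-negative integers sums to target."""
    reachable = {0}  # sums obtainable from the first 0 numbers
    for a in numbers:
        # Table for the first i+1 numbers: either skip a or add it to an old sum.
        reachable |= {s + a for s in reachable if s + a <= target}
    return target in reachable
```

The table never holds more than target + 1 entries, so the running time is 
O(n · target): polynomial in the magnitude of the numbers, which is their 
length in unary, but exponential in their binary length. 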


2.2. The complexity class NP \ P? 


If we are to believe the conjecture that P ≠ NP, then there exists a non-empty 
complexity class NP \ P. One might ask the question: is it true that every problem 
in NP is either NP-complete or in P? If P = NP this question has a (trivial) 
affirmative answer, but a negative answer to it (under the assumption that P ≠ 
NP) might help explain the reluctance of certain problems to be placed in one of 
those two classes. In fact, Ladner has shown, under the assumption that P ≠ NP, 
that there is an extremely refined structure of equivalence between the two classes, 
P and NP-complete. 


Theorem 2.4. If L_1 is decidable and L_1 ∉ P, then there exists a decidable language 
L_2 such that L_2 ∉ P and L_2 ∝_p L_1, but L_1 does not polynomial-time reduce to L_2. 


Note that if L_1 is NP-complete, the language L_2 is “in between” the classes 
P and NP-complete, if P ≠ NP, and by repeatedly applying the result we see 
that there is a whole range of seemingly different complexity classes. Under the 
assumption that P is different from NP, do we know of any candidate problems 
that may lie in this purgatory of complexity classes? The answer to this is “maybe”. 
We will give four important problems that have not been shown to be either in P 
or NP-complete. The problem for which there has been the greatest speculation 
along these lines is the graph isomorphism problem: given a pair of graphs G_1 = 
(V_1, E_1) and G_2 = (V_2, E_2), decide if there is a bijection σ: V_1 → V_2 such that 
ij ∈ E_1 if and only if σ(i)σ(j) ∈ E_2. Later in this section we will provide evidence 
that it is not NP-complete, and efforts to show that it is in P have so far fallen 
short (see chapter 27 of this volume). A problem that mathematicians since the 
ancient Greeks have been trying to solve is that of factoring integers; a decision 
formulation that is polynomially equivalent to factoring is as follows: given an 
integer N and a bound k, does there exist a nontrivial factor of N that is at most k? 
A problem that is no harder than factoring is the discrete logarithm problem: given a 
prime p, a generator g and a natural number x < p, find l such that g^l ≡ x mod p. 
Finally, there is the shortest vector problem, where we are given a collection of 
integer vectors, and we wish to find the shortest vector (in the Euclidean norm) 
that can be represented as a non-zero integral combination of these vectors. Here, 
current evidence makes it seem likely that this problem is really NP-complete; 
related work is discussed elsewhere in this volume (see chapter 19). 
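
The equivalence between factoring and its decision formulation rests on binary 
search over the bound k. The sketch below shows one direction: given any 
procedure that answers the decision question, a full factorization is obtained 
with a number of calls that is polynomial in the length of N. The oracle 
has_factor_at_most is a hypothetical stand-in implemented by trial division 
purely so that the example runs; it is of course not efficient. 

```python
def has_factor_at_most(n, k):
    """Stand-in decision oracle: does n have a factor d with 1 < d < n and d <= k?"""
    return any(n % d == 0 for d in range(2, min(k, n - 1) + 1))


def smallest_factor(n, oracle=has_factor_at_most):
    """Smallest nontrivial factor of n, or None if n is prime (or n < 4)."""
    if n < 4 or not oracle(n, n - 1):
        return None
    lo, hi = 2, n - 1  # invariant: the smallest factor lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(n, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


def factorize(n, oracle=has_factor_at_most):
    """Prime factorization of n >= 1 using only calls to the decision oracle."""
    factors = []
    while n > 1:
        d = smallest_factor(n, oracle)
        if d is None:
            factors.append(n)  # n itself is prime
            break
        factors.append(d)  # the smallest nontrivial factor is always prime
        n //= d
    return factors
```

Each call to smallest_factor makes O(log N) oracle queries, and N has at most 
log_2 N prime factors, so O(log² N) queries suffice in total. 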

It is worth mentioning that there is an important subclass of NP which may 
also fall in this presumed gap. Edmonds’ class of well-characterized problems, 
NP ∩ co-NP, certainly contains P and is contained in NP. Furthermore, unless 
NP = co-NP, it cannot contain any NP-complete problem. On the other hand, 
the prevailing feeling is that showing a problem to be in this class is a giant step 
towards showing that the problem is in P. A result of Pratt shows that primality 
is in NP, so it lies in NP ∩ co-NP, though as discussed above, there is additional 
evidence that it lies in P. The factoring problem, which appears to be significantly 
harder, also lies in NP ∩ co-NP, since a prime factorization can be guessed along 
with a certificate that each of the factors is indeed prime. One interesting open 
question connected with NP ∩ co-NP is concerned with the existence of a problem 
that is complete for this class. One might hope that there is some natural 
problem that completely characterizes the relationship of NP ∩ co-NP with P 
(in the same manner that 3-SAT characterizes the P = NP question). 

One approach to shedding light on the complexity of a problem that is not 
known to be either in P or NP-complete has been to consider weaker forms 
of completeness for NP. In fact, Cook’s notion of completeness, though technically 
a weaker definition of intractability, is no less damning. L_1 ∝_p L_2 can be 
thought of as solving L_1 by a restricted kind of subroutine call, where first some 
polynomial-time preprocessing is done, and then the subroutine for L_2 is called 
once. Cook (1971) proposed a notion of reducibility where L_1 is solved by using a 
polynomial-time Turing machine that can, in one step, get an answer to any query 
of the form “is x ∈ L_2?”. Note that any complete language L with respect to this 
reducibility still has the property that L ∈ P if and only if P = NP. Karp (1972) 
focused attention on the notion ∝_p, and was able to show that NP-completeness 
was powerful enough to capture a wide range of combinatorial problems. On the 
other hand, it remains an open question to show that Cook’s notion of reducibility 
is stronger than Karp’s; is there a natural problem in NP that is complete with 
respect to “Cook” reducibility, but not with respect to “Karp” reducibility? Adleman 
and Manders showed that non-determinism and randomization can play a role in 
defining notions of reducibility, and used these notions to show that certain number 
theoretic problems were not, for example, in NP ∩ co-NP unless NP = co-NP. 


2.3. Oracles and relativized complexity classes 


In previous subsections, we have discussed techniques to provide evidence of the 
intractability of concrete problems in NP by proving completeness results. In this 
section, we will be concerned with extensions of the model of computation where 
the analog of the P versus NP problem can be resolved. 

We have mentioned earlier that oracles are used to compactly represent the input 
of some problems. One can define the analog of the classes P and NP for problems 
whose inputs are oracles and it is easy to prove that these complexity classes are 
different. For example, if the oracle represents a set system on n elements, then 
the problem to decide if this set system is nonempty is clearly in NP, but one 
has to ask 2^n queries of the oracle to resolve it deterministically. Similar results 
have also been proved for combinatorial problems that are naturally represented 
by oracles. For example, the matroid matching problem and the problem to decide 
if a matroid has girth at most k (i.e., whether it has a cycle of length at most k) 
are both clearly in NP, but it has been shown that any deterministic algorithm 
solving them has to ask an exponential number of queries (see chapter 11). 

Problems on graphs can also be given by an oracle. Suppose that a graph (e.g., 
on N = 2^n nodes) is given by an oracle that can tell whether two nodes are 
adjacent. The question is whether all “reasonable” decision problems on graphs 
require one to ask some constant fraction of the queries. This problem has a long 
history, both for directed and undirected graphs, and many attempts were made at 
giving sufficiently strong conditions before an accurate conjecture, due to Aanderaa 
and Rosenberg, was proved by Rivest and Vuillemin, and later strengthened by 
Kahn, Saks and Sturtevant. Consider a decision problem L where the instances 
are undirected graphs, and L has three important properties: (1) nontriviality: 
some graphs are in L, but not all; (2) monotonicity: if G is an instance in L 
and G' is formed by adding an edge, then G' is also in L; (3) invariance under 
isomorphism: if G is an instance in L and its nodes are relabeled to form G', then 
G' is also in L. Then for any problem satisfying these three properties, any correct 
procedure uses essentially N²/4 queries for some graph of order N. In fact, if N is 
a prime power, Kahn, Saks and Sturtevant have shown that all N(N − 1)/2 queries 
must be asked (see chapter 34). These results are also relevant in comparing the 
adjacency-matrix form of encoding graphs to the adjacency-list encoding. 

Another extension of the classes P and NP uses oracles as a source of 
computational power rather than as a source of information. For a given language A 
we shall consider oracle Turing machines that, during the computation, may ask 
queries of the form: “is x € A?”. Here, the oracle A is considered as part of the 
model of computation, rather than as part of the input. This notion of an oracle 
can help, for example, in understanding the relative difficulty of some problems. 

For a language A, we denote by P^A and NP^A the relativized analogs of P 
and NP that are defined by Turing machines that use an oracle that decides 
membership in A. In general, the relativized analog of a complexity class C is 
denoted by C^A. The main result concerning oracle complexity classes, due to Baker, 
Gill and Solovay (1975), is that the answer to the relativized version of the P = 
NP problem depends on the oracle. The intuition for the first alternative of the 
following theorem is given by the case when the oracle is an input. This is made 
rigorous by diagonalizing over all oracle Turing machines to construct an oracle A 
such that the language L(A) = {1^n : ∃x ∈ A such that |x| = n} is not in P^A. 


Theorem 2.5. There exist languages A and B such that P^A ≠ NP^A ≠ co-NP^A and 
P^B = NP^B = co-NP^B. 


Several of the theorems and proof techniques discussed in this survey easily 
extend to relativized complexity classes (i.e., they relativize). As a corollary to the 
above theorem, we can assert that techniques that relativize cannot settle the P 
versus NP problem. 

One might wonder which one of the two alternatives provided by the theorem 
of Baker et al. is more typical. We define a random language A to be one that 
contains each word x independently with probability 1/2. We say that a statement 
holds for a random oracle if the probability that the statement holds for a random 
language A in place of the oracle is 1. The Kolmogorov 0-1 law of probability 
theory states that if an event E is determined by the independent events B_1, 
B_2, ... and E is independent of any event that is a finite combination of the events 
B_i, then the probability of E is either 0 or 1. Applying this to the event E that 
P^A = NP^A and the events B_i that the ith word is in the language A, we see that 
the probability of P^A = NP^A for a random oracle A is either 0 or 1. Bennett and 
Gill have provided the answers to these questions. 


Theorem 2.6. P^A ≠ NP^A and NP^A ≠ co-NP^A for a random oracle A. 


2.4. Evidence of intractability: PSPACE-completeness 


In this subsection, we turn to the question of the space complexity of problems. 
When we discuss space complexity, we may assume that the Turing machine has 
only one work tape (and a separate input tape), since by expanding the tape 
alphabet, any number of tapes can be simulated by one tape without using more 
space. We shall also assume that the Turing machine halts in a unique configuration 
when accepting the input, e.g., it erases its work tape, moves both heads to the 
beginning of the tapes, and enters a special accepting state. 

We remarked when introducing PSPACE that it was not an oversight that 
NPSPACE was not defined, since PSPACE = NPSPACE. This result is a special case 
of the following theorem of Savitch (1970). 


Theorem 2.7. If L is accepted by a nondeterministic Turing machine using space 
S(n) ≥ log n, then it is accepted by a deterministic Turing machine using space S(n)². 


Proof. The proof is based on the idea of modeling the computation by a directed 
reachability problem and using a natural divide-and-conquer strategy. In any given 
computation, a nondeterministic Turing machine M can be completely described 
by specifying the input head position, the contents of the work tape, the work- 
tape head position and the current state. Consider the directed graph G whose 
nodes correspond to these configurations and whose arcs correspond to possible 
transitions of M. The Turing machine accepts the input if and only if there is a 
path in G from the starting configuration to the (unique) accepting configuration. 

Since M uses S(n) space, there are at most $d^{S(n)}$ configurations for some 
constant d, and hence the length of a simple path in G is at most $d^{S(n)}$. To solve the 
reachability problem in G, we build a procedure that recursively calls itself to 
check, for any nodes i and k of G and a bound l, whether there is a path from 
i to k of length at most $2^l$. There is such a path if and only if there exists a 
midpoint j of the path such that j can be reached from i, and k can be reached from j, 
by paths of length at most $2^{l-1}$. The existential quantifier can be implemented by merely 
trying all possible nodes j in some specified order. The basis of this recursion is 
the case l = 0, where we merely need to know if i = k, or if i and k are connected 
by an arc of G, and this can be easily checked in O(S(n)) space. In implementing 
this procedure, we need to keep track of the current midpoint at each level of the 
recursion, and so we need $l \cdot S(n)$ space to do this. To simulate M by a deterministic 
Turing machine, we run the procedure for the graph G with $l = S(n) \log d$ and the 
nodes corresponding to the starting and the accepting configurations. Since there 
are O(S(n)) levels and each stored midpoint takes O(S(n)) space, the total space 
used is $O(S(n)^2)$. 
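
The divide-and-conquer recursion of the proof can be sketched in Python. This is 
only an illustration on an explicitly given graph, not a space-bounded Turing 
machine; the names `reach` and `savitch_reachable`, and the choice to pass the arcs 
as a predicate `arc(i, k)`, are assumptions made for this example. 

```python
def reach(arc, nodes, i, k, l):
    """Return True if there is a path from i to k of length at most 2**l."""
    if l == 0:
        # Basis: either the same node, or a single arc.
        return i == k or arc(i, k)
    # Try every candidate midpoint j in a fixed order. Only the current
    # midpoint has to be remembered at each of the l levels of recursion.
    return any(
        reach(arc, nodes, i, j, l - 1) and reach(arc, nodes, j, k, l - 1)
        for j in nodes
    )


def savitch_reachable(arc, nodes, start, goal):
    """Decide reachability; a simple path has fewer arcs than there are nodes."""
    nodes = list(nodes)
    l = max(1, (len(nodes) - 1).bit_length())  # guarantees 2**l >= len(nodes) - 1
    return reach(arc, nodes, start, goal, l)


# Example: the directed path 0 -> 1 -> 2 -> 3 -> 4.
arcs = {(0, 1), (1, 2), (2, 3), (3, 4)}
print(savitch_reachable(lambda i, k: (i, k) in arcs, range(5), 0, 4))  # True
print(savitch_reachable(lambda i, k: (i, k) in arcs, range(5), 4, 0))  # False
```

The running time of this sketch is far worse than that of breadth-first search; 
the point is that the recursion depth is l and each level stores one node, which 
mirrors the space bound in the proof. 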


A problem $L_1$ is called PSPACE-complete if $L_1 \in$ PSPACE and for all $L \in$ PSPACE, 
$L \propto_p L_1$. The simplest PSPACE-complete problem is the problem of determining if 
two nodes are connected in a directed graph G (of order $2^n$) that is given by a 
circuit with 2n inputs, where the first n specify in binary a node i and the second n 
specify a node j, and the output of the circuit is 1 if and only if (i, j) is an arc of 
G. This problem can easily be solved by a nondeterministic Turing machine using 
polynomial space; hence, by Savitch’s theorem, it is in PSPACE. To prove that it is 
complete, consider a language $L \in$ PSPACE. L can be reduced to this circuit-based 
directed reachability problem by introducing the graph G used in the proof of 
Savitch’s theorem. It is easy to construct in polynomial time a (polynomial-size) 
circuit that decides if one configuration follows another by one transition of M. 

The idea of “guessing a midpoint” is also the main idea used to derive 
another PSPACE-complete problem. The problem of the validity of quantified 
Boolean formulae is as follows: given a formula in first-order logic in prenex form, 
$\exists x_1 \forall x_2 \cdots Q_n x_n\, [\phi(x_1, \ldots, x_n) = \text{true}]$, decide if it is valid. 
This problem is clearly in PSPACE. The proof of its completeness is a mixture of 
Theorem 2.7 and Theorem 2.3. We use the alternation of existential and universal 
quantifiers to capture the notion of the existence of a midpoint such that both 
(for all) the first and second halves of the computation are legitimate. The basis 
of the recursion is now solved by building a Boolean formula to express that either 
two configurations are the same or one is the result of a single transition from 
the other. 
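
To see why the problem is in PSPACE, it may help to look at a direct recursive 
evaluator. The sketch below is an assumption-laden illustration: the encoding of 
the quantifier prefix as a string over `'E'` and `'A'` and the name `eval_qbf` are 
hypothetical choices, and the matrix is passed as a Python function rather than as 
a formula. The recursion depth is n and each level stores one bit, although the 
running time is exponential. 

```python
def eval_qbf(quantifiers, phi, assignment=()):
    """Decide Q1 x1 Q2 x2 ... Qn xn phi(x1, ..., xn).

    quantifiers: string over {'E', 'A'} (existential / universal)
    phi: function mapping a tuple of n booleans to a boolean
    """
    depth = len(assignment)
    if depth == len(quantifiers):
        return phi(assignment)
    # Evaluate both settings of the next variable, one after the other.
    branches = (
        eval_qbf(quantifiers, phi, assignment + (b,)) for b in (False, True)
    )
    if quantifiers[depth] == 'E':
        return any(branches)
    return all(branches)


# "for all x there exists y with x != y" is valid ...
print(eval_qbf("AE", lambda v: v[0] != v[1]))  # True
# ... but "there exists x such that for all y, x != y" is not.
print(eval_qbf("EA", lambda v: v[0] != v[1]))  # False
```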

An instance of the previous problem can be viewed as a game between an 
existential player and a universal player; the existential player gets to choose values 
for $x_1$, then the universal player chooses for $x_2$, and so on. The decision question 
amounts to whether the first player has a strategy such that $\phi$ must evaluate to 
true. There are many other PSPACE-complete problems known, and most of these 
have a game-like flavor. An example of a more natural PSPACE-complete game is 
the directed Shannon switching game: given a graph G with two nodes s and t, 
two players alternately color the edges of G, where the first player, coloring with 
red, tries to construct a red path from s to t, whereas the second player, coloring 
with blue, tries to construct a blue (s,t)-cut; does the first player have a winning 
strategy for G? Note that this result is in stark contrast to the undirected case, 
which can be solved in polynomial time (see chapter 11). 
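
For very small instances the directed switching game can be solved by exhaustive 
game-tree search. The following sketch assumes that arcs are given as a list of 
(tail, head) pairs and identified by their index, so that parallel arcs are allowed; 
the function names are hypothetical. It takes exponential time, but the depth of 
the search is at most the number of arcs. 

```python
def has_path(arcs, s, t):
    """Depth-first search for a directed path from s to t using the given arcs."""
    stack, seen = [s], {s}
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for a, b in arcs:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return False


def red_wins(arcs, s, t, red=frozenset(), blue=frozenset(), red_to_move=True):
    """True if the red (path-building) player can force a red path from s to t."""
    if has_path([arcs[e] for e in red], s, t):
        return True
    uncut = [arcs[e] for e in range(len(arcs)) if e not in blue]
    if not has_path(uncut, s, t):
        return False  # the blue arcs already form an (s,t)-cut
    free = [e for e in range(len(arcs)) if e not in red and e not in blue]
    if red_to_move:
        return any(red_wins(arcs, s, t, red | {e}, blue, False) for e in free)
    return all(red_wins(arcs, s, t, red, blue | {e}, True) for e in free)


# A single path s -> a -> t: blue cuts whichever arc red leaves uncolored.
print(red_wins([("s", "a"), ("a", "t")], "s", "t"))  # False
# Doubling every arc lets red answer each blue move with the parallel arc.
print(red_wins([("s", "a"), ("s", "a"), ("a", "t"), ("a", "t")], "s", "t"))  # True
```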

The role of games in PSPACE-completeness suggests a new type of Turing 
machine, called an alternating Turing machine, which was originally proposed by 
Chandra, Kozen and Stockmeyer (1981). Consider a computation to be a 
sequence of moves made by two players, an existential player and a universal 
player. The current state indicates whose move it is, and in each configuration 
the specified player has several moves from which to choose. Each computation 
path either accepts or rejects the input. The input is accepted if the 
existential player has a winning strategy, that is, if there is a choice of moves 
for the existential player, so that for any choice of moves by the universal 
player, the input is accepted. For simplicity, assume again that each 
computation path has the same number of moves. As before, the time to accept 
an input x is this number of moves, and the space needed to accept x is 
the maximum space used on any computation path. Observe that a 
nondeterministic machine is an alternating machine where only the existential 
player moves. 

The role of PSPACE in the definition of these machines suggests that alternating 
polynomial time is closely related to PSPACE, and indeed this is just a special 
case of a general phenomenon. Let ATIME(T(n)) and ASPACE(S(n)) denote 
the classes of languages accepted by an alternating Turing machine with O(T(n)) 
time and O(S(n)) space, respectively. Chandra, Kozen and Stockmeyer proved two 
fundamental results characterizing the relationship of alternating classes to deterministic 
and nondeterministic ones. Note that the first result, in essence, implies Savitch’s 
theorem, and in fact, the proof of the last inclusion of Theorem 2.8 is very similar 
to the proof of Savitch’s theorem. 


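Theorem 2.8. If $T(n) \ge n$, then 
$\mathrm{ATIME}(T(n)) \subseteq \mathrm{DSPACE}(T(n)) \subseteq \mathrm{NSPACE}(T(n)) \subseteq \mathrm{ATIME}(T(n)^2)$. 

Theorem 2.9. If $S(n) \ge \log n$, then $\mathrm{ASPACE}(S(n)) = \bigcup_{c>0} \mathrm{DTIME}(c^{S(n)})$. 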


Among the consequences of these results, we see that AP = PSPACE. One can view 
an alternating Turing machine as an extremely idealized parallel computer, since 
it can branch off an unbounded number of parallel processes that can be used 
in determining the final outcome. Therefore, one can consider these results as a 
proven instance of the parallel computation thesis: parallel time equals sequential 
space (up to a polynomial function). 


2.5. The polynomial-time hierarchy 


The definition of PSPACE using alternating Turing machines suggests a hierarchy 
of complexity classes between NP and PSPACE, called the polynomial-time 
hierarchy. The kth level of the polynomial-time hierarchy can be defined in terms of 
polynomial-time alternating Turing machines where the number of “alternations” 
between existential and universal quantifiers is less than k. Equivalently, we can 
define 


$$\Sigma_k^p = \{L \mid \exists L' \in P \text{ such that } x \in L \text{ if and only if } \exists y_1 \forall y_2 \cdots Q_k y_k\, [(x, y_1, \ldots, y_k) \in L']\},$$

where each $y_i$ ranges over strings of length polynomially bounded in |x|, and 
$Q_k$ is $\exists$ if k is odd, and $\forall$ if k is even. Note that $\Sigma_0^p = P$ and $\Sigma_1^p = NP$. We 
also define $\Pi_k^p$ to be co-$\Sigma_k^p$. Clearly, $\Sigma_k^p \cup \Pi_k^p \subseteq \Sigma_{k+1}^p$. The following generalized 
coloring problem gives a natural example of a problem in $\Sigma_2^p$. Given input graphs 
G and H, can we color the nodes of G with two colors so that the graph induced 
by each color does not contain H as a subgraph? 

Alternatively, it is possible to define the polynomial-time hierarchy in terms of 
oracles, as was done in the original formulation by Meyer and Stockmeyer (1972). 
For complexity classes $\mathcal{C}$ and $\mathcal{D}$, let $\mathcal{C}^{\mathcal{D}}$ denote the union of $\mathcal{C}^A$ over all $A \in \mathcal{D}$. 
Consider again the generalized coloring problem; it is not hard to see that there 
is a nondeterministic polynomial-time Turing machine to solve it, given an oracle 
for the following problem A: given G and H, decide if H is a subgraph of G. Since 
$A \in NP$, we see that this coloring problem is in $NP^{NP}$. In fact, $\Sigma_2^p = NP^{NP}$ and 
in general, $\Sigma_{k+1}^p = NP^{\Sigma_k^p}$. Unfortunately, for each new complexity class, there is 
yet another host of unsettled questions. 


Open Problems. For each $k \ge 1$, is $\Sigma_k^p = \Sigma_{k-1}^p$? For each $k \ge 1$, is $\Sigma_k^p = \Pi_k^p$? 


Contained in these, for k = 1, are the P = NP and NP = co-NP questions, 
and as was true for those questions, one might hope to find complete problems on 
which to focus attention in resolving these open problems. We define these notions 
of completeness with respect to polynomial-time reducibility, so that L is complete 
for $\mathcal{C}$ if and only if $L \in \mathcal{C}$ and all problems in $\mathcal{C}$ reduce ($\propto_p$) to L. As might be 
expected, analogs of the satisfiability problem which allow a particular number of 
alternations in the formulae can be used to provide a complete problem for each 
level of the hierarchy. On the other hand, it is more satisfying to have more natural 
complete problems, and Rutenberg showed that the generalized coloring problem 
is, in fact, complete for $\Sigma_2^p$. Another problem of identical complexity is a similarly 
flavored node-deletion problem: given graphs G and H and an integer k, decide 
if there is a subset of k nodes that can be deleted from G, so that the remaining 
graph no longer has H as a subgraph. 
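
The exists/for-all structure of the generalized coloring problem is visible in a 
brute-force decision procedure: an outer search over colorings (the existential 
part) and, for each color class, a check that no embedding of H exists (the 
universal part). The sketch below is only meant for tiny graphs; the function 
names and the edge-list representation are assumptions made for the example, 
and “subgraph” is read as not necessarily induced. 

```python
from itertools import permutations, product


def contains_subgraph(g_nodes, g_edges, h_nodes, h_edges):
    """True if H occurs as a (not necessarily induced) subgraph of G."""
    g = {frozenset(e) for e in g_edges}
    for image in permutations(g_nodes, len(h_nodes)):
        f = dict(zip(h_nodes, image))  # injective map from H's nodes into G
        if all(frozenset((f[u], f[v])) in g for u, v in h_edges):
            return True
    return False


def two_colorable_avoiding(g_nodes, g_edges, h_nodes, h_edges):
    """True if some 2-coloring of G leaves no color class containing H."""
    for coloring in product((0, 1), repeat=len(g_nodes)):
        ok = True
        for col in (0, 1):
            cls = [v for v, c in zip(g_nodes, coloring) if c == col]
            induced = [e for e in g_edges if set(e) <= set(cls)]
            if contains_subgraph(cls, induced, h_nodes, h_edges):
                ok = False
                break
        if ok:
            return True
    return False


# With H a single edge, the question is ordinary 2-colorability of G.
edge_nodes, edge_edges = ["a", "b"], [("a", "b")]
triangle = [(0, 1), (1, 2), (0, 2)]
print(two_colorable_avoiding([0, 1, 2], triangle, edge_nodes, edge_edges))  # False
```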

One piece of good news concerning this infinite supply of open problems is that 
their answers may be related. There is a principle of upward inheritability that says 
that if $\Sigma_l^p = \Pi_l^p$ for some level l, then equality holds for all levels $k \ge l$. In fact, 
$\Sigma_l^p = \Pi_l^p$ implies that the entire hierarchy collapses to that level; i.e., $\Sigma_k^p = \Sigma_l^p$ for 
all $k \ge l$. Note that P = NP if and only if P = PH, where $PH = \bigcup_{k \ge 0} \Sigma_k^p$. 

As we shall see, the polynomial-time hierarchy has helped to provide insight 
into the structure of several complexity classes. Perhaps the first result along these 
lines is due to Sipser, who used a beautiful “hashing function” technique to show 
that BPP is in the polynomial-time hierarchy, and in fact, can be placed within 
$\Sigma_2^p \cap \Pi_2^p$. 

Furst, Saxe and Sipser (1981) discovered an interesting connection between 
constant-depth circuit lower bounds and separating relativized complexity classes. 
In particular, they showed that there are important consequences of an exponential 
lower bound on the size of a constant-depth circuit for the parity function, the 
sum modulo 2 of the bits of the input. Based on earlier results of Furst, Saxe and 
Sipser and Ajtai, Yao (1985) proved a sufficiently strong lower bound to yield the 
following theorem. 


Theorem 2.10. There exists an oracle A that separates the polynomial-time hierarchy 
from PSPACE; that is, $\bigcup_k \Sigma_k^{p,A} \ne \mathrm{PSPACE}^A$. 


The idea of the proof is as follows. First one shows that it is sufficient to 
consider alternating Turing machines in which every branch of the computation has 
a single oracle question at the end of the branch. The computation tree of such 
an alternating Turing machine with k levels of alternation corresponds to a 
depth-k circuit where the oracle answers are the inputs. For any oracle A, we define 
the language $L(A) = \{1^n \mid A \text{ contains an odd number of strings of length } n\}$. Now, 
L(A) is in $\mathrm{PSPACE}^A$ for any oracle A. Using diagonalization and the result that 
for any constant c, a constant-depth circuit that computes the parity function has 
$\Omega(n^{\log^c n})$ gates, one can construct an oracle A such that L(A) is not in $\bigcup_k \Sigma_k^{p,A}$. 

Another related problem is to separate the finite levels of the polynomial-time 
hierarchy using oracles. Baker and Selman proved the existence of an oracle A 
such that $\Sigma_2^{p,A} \ne \Pi_2^{p,A}$ (and consequently, $\Sigma_2^{p,A} \ne \Sigma_3^{p,A}$). Sipser showed that an 
oracle that separates finite levels of the polynomial-time hierarchy can be obtained 
via a connection to lower bounds on the size of constant-depth circuits, similar to 
the one used in Theorem 2.10. The following theorem is based on this as well as 
the requisite lower bounds for a certain function $F_k$, which is in some sense the 
generic function that is computable in depth k; these lower bounds were obtained 
through a series of results by Sipser, Yao and Håstad. 

Theorem 2.11. For every k, there exists an oracle $A_k$ such that $\Sigma_k^{p,A_k} \ne \Sigma_{k+1}^{p,A_k}$. 

Techniques for proving the circuit-complexity lower bounds used in 
Theorems 2.10 and 2.11 are discussed in chapters 33 and 40. Based on stronger 
circuit-complexity results, Babai and Cai separated PSPACE from the finite levels of the 
hierarchy by random oracles. The question as to whether random oracles 
separate the finite levels of the polynomial-time hierarchy remains open. Another 
open question concerning random oracles is whether $P^A = NP^A \cap \text{co-}NP^A$ for a 
random oracle A. 


2.6. Evidence of intractability: #P-completeness 


Consider the counting problem of computing the number of perfect matchings in 
a bipartite graph. If this is cast as a decision problem, it does not appear to be in 
NP, since it seems that the number of solutions can only be certified by writing 
down possibly exponentially many matchings. However, consider modifying the 
definition of NP to focus on the number of good certificates, rather than the 
existence of one; let #P be the class of problems for which there exists a language 
$L' \in P$ and a polynomial p(n) such that for any input x, the only acceptable output 
is $z = |\{y : |y| = p(|x|) \text{ and } (x, y) \in L'\}|$. Clearly, the problem of counting perfect 
matchings in a bipartite graph is in #P. Any problem in #P can be computed 
using polynomial space, since all possible certificates y may be tried in succession. 
A recent result of Toda gives evidence of the intractability of this class by showing 
that $PH \subseteq P^{\#P}$. 
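
The remark that certificates may be tried in succession translates directly into 
code. In the sketch below, `count_certificates`, the verifier `satisfies`, and the 
signed-integer clause encoding are all hypothetical names and conventions chosen 
for illustration; `cert_len` plays the role of p(|x|). Only a counter and the 
current certificate are kept in memory, while the running time is exponential. 

```python
from itertools import product


def count_certificates(x, verifier, cert_len):
    """Return the number of y in {0,1}^cert_len with verifier(x, y) true."""
    count = 0
    for y in product((0, 1), repeat=cert_len):  # certificates tried in succession
        if verifier(x, y):
            count += 1
    return count


def satisfies(clauses, y):
    """Polynomial-time check that assignment y satisfies a CNF formula.

    A literal +i stands for variable x_i and -i for its negation (1-based).
    """
    return all(
        any((y[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in clauses
    )


# (x1 or x2) and (not x1 or not x2) has exactly two satisfying assignments.
formula = [[1, 2], [-1, -2]]
print(count_certificates(formula, satisfies, 2))  # 2
```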

We can also define a notion of a complete problem for #P. To do this, we use 
a reduction analogous to that used by Cook. Let $P_1$ and $P_2$ be counting problems, 
where the outputs for input x are denoted by $P_1(x)$ and $P_2(x)$, respectively. The 
problem $P_1$ reduces to $P_2$ if there exists a polynomial-time Turing machine that 
can compute $P_1$ if it has access to an oracle that, given x, can compute $P_2(x)$ in 
one step. We define a counting problem to be #P-complete if it is in #P, and all 
problems in #P reduce to it. A stronger notion of reducibility, analogous to the 
notion used by Karp, is called parsimonious: an instance x of $P_1$ is mapped in 
polynomial time to f(x) for $P_2$, such that $P_1(x) = P_2(f(x))$. For either notion of 
reducibility, we see that any polynomial-time algorithm for $P_2$ yields a 
polynomial-time algorithm for $P_1$. It is not hard to see that the proof of Cook’s theorem shows 
that the problem of computing the number of satisfying assignments of a Boolean 
formula in conjunctive normal form is #P-complete. Furthermore, by being only 
slightly more careful, it is easy to give parsimonious modifications of the reductions 
to the clique problem, the hamiltonian circuit problem, and seemingly any 
NP-complete problem, and so the counting versions of NP-complete problems can be 
shown to be #P-complete. 

Surprisingly, not all #P-complete counting problems need be associated with an 
NP-complete problem. Computing the number of perfect matchings in a bipartite 
graph (or equivalently, computing the permanent of a (0,1)-matrix) is a counting 
version of a problem in P, the perfect matching problem, and yet Valiant (1979) 
has shown that this problem is #P-complete. This made it possible to prove that 
a variety of counting problems are #P-complete although they are based on 
polynomially solvable decision problems. 
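
The equivalence between counting perfect matchings and the permanent can be 
checked on small examples by expanding the permanent from its definition. The 
sketch below assumes the bipartite graph is given by its (0,1) biadjacency matrix, 
with rows for one side and columns for the other; the function name is chosen for 
the example. It sums over all n! permutations, so it is useful only for 
illustration. 

```python
from itertools import permutations


def permanent(matrix):
    """Permanent of a square matrix, expanded directly from the definition."""
    n = len(matrix)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= matrix[i][sigma[i]]  # 1 only if row i may be matched to sigma[i]
        total += term
    return total


# Biadjacency matrix of a 6-cycle, which has exactly two perfect matchings.
six_cycle = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
print(permanent(six_cycle))  # 2
```

Each permutation with a nonzero term selects one edge per row and per column, 
which is exactly a perfect matching. 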

There are many problems that are essentially #P-complete, although they do 
not have the appearance of a counting problem. An important example is the 
problem of computing the volume of a convex body. In the special case when the 
body is a polytope given by a system of linear inequalities, this was shown to be 
#P-hard by Khachiyan and by Dyer and Frieze. For some problems, computing a 
good estimate suffices, and this might be much easier. If the convex body is given 
by an oracle that answers whether a given point is feasible, and if not produces a 
separating hyperplane (see chapter 28), then it is most natural to only estimate the 
volume. However, Bárány and Füredi, extending work of Elekes, proved that if U 
and L are upper and lower bounds on the volume of a d-dimensional convex body 
that are computed by an algorithm that makes only a polynomial number of calls to 
the oracle, then $U/L = \Omega((d/\log d)^d)$. Surprisingly, randomization can overcome 
this intractability. Dyer, Frieze and Kannan showed that there is a randomized 
algorithm that, given any $\varepsilon, \delta > 0$, computes upper and lower bounds U and L 
such that $U/L \le 1 + \varepsilon$, uses a number of calls to the oracle that is bounded by a 
polynomial in d, $1/\varepsilon$, and $\log(1/\delta)$, and is correct with probability at least $1 - \delta$. 

Another problem that has been shown to be intimately connected with the 
estimation of the value of counting problems is that of uniformly generating 
combinatorial structures. As an example, suppose that a randomized algorithm requires 
a perfect matching in a bipartite graph G to be chosen uniformly from the set of 
all perfect matchings in G; how can this be done? Jerrum, Valiant and Vazirani 
have shown that a relaxed version of this problem, choosing perfect matchings 
that are selected with probability within an arbitrarily small constant factor of the 
uniform value, is equivalent to the problem of estimating the number of perfect 
matchings (using a randomized algorithm). In fact, their result carries through to 
most counting/generation problems related to problems in NP, since it requires 
only a natural self-reducibility property that says that an instance can be solved 
by handling some number of smaller instances of the same problem. 

Stockmeyer has provided insight into estimating the value of #P problems, by 
trying to place these problems within the polynomial-time hierarchy. Using Sipser’s 
hashing function technique, Stockmeyer showed that for any problem in #P and 
any fixed value of d > 0, there exists a polynomial-time Turing machine with a 
$\Sigma_2^p$-complete oracle that computes a value that is within a factor of $(1 + n^{-d})$ of 
the correct answer. 


2.7. Proof of intractability 


It is, of course, far preferable to prove that a problem is intractable, rather than 
merely giving evidence that supports this belief. Perhaps the first natural question 
is, are there any decidable languages that require more time than T(n)? The 
diagonalization techniques used to show that there are undecidable languages can be 
used to show that for any Turing-computable T(n), there must exist such a 
language; consider the language L of all i such that if i is run on the Turing machine 
$M_i$ that i encodes, either it is rejected, or it runs for more than T(n) steps. It is 
easy to see that L is decidable, and yet no Turing machine that always halts within 
T(n) steps can accept L. Using our stronger assumption about T(n) (full time- 
constructibility) we are able to define such a language that is not only decidable, 
but can be recognized within only slightly more time than T(n). The additional 
logarithmic factor needed in Theorem 2.12, which combines a result of Hartmanis 
and Stearns (1965) with one of Hennie and Stearns, is due to the fact that the 
fastest known simulation of a (multitape) Turing machine by a machine with some 
constant number of tapes slows down the machine by a logarithmic factor. 


Theorem 2.12. If 

$$\liminf_{n \to \infty} \frac{T_1(n) \log T_1(n)}{T_2(n)} = 0,$$

then there exists $L \in \mathrm{DTIME}(T_2) \setminus \mathrm{DTIME}(T_1)$. 
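
As a quick check of how the hypothesis is used, take $T_1(n) = n^2$ and 
$T_2(n) = n^2 \log^2 n$. These two bounds are chosen here purely for illustration, 
and $T_2$ is assumed to be fully time-constructible. 

```latex
\[
\liminf_{n \to \infty} \frac{T_1(n) \log T_1(n)}{T_2(n)}
  = \liminf_{n \to \infty} \frac{n^2 \cdot 2 \log n}{n^2 \log^2 n}
  = \liminf_{n \to \infty} \frac{2}{\log n}
  = 0 .
\]
```

Under these assumptions the theorem yields a language that can be recognized in 
time $n^2 \log^2 n$ but not in time $n^2$. 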


Since any multitape machine can be simulated by a 1-tape machine without using 
any additional space, the corresponding space-hierarchy theorem of Hartmanis, 
Lewis and Stearns does not require the logarithmic factor. Seiferas, Fischer and 
Meyer have proved an analogous, but much more difficult, hierarchy theorem for 
nondeterministic time. 


Theorem 2.13. If 

$$\liminf_{n \to \infty} \frac{T_1(n+1)}{T_2(n)} = 0,$$

then there exists $L \in \mathrm{NTIME}(T_2) \setminus \mathrm{NTIME}(T_1)$. 


Although the proofs of these theorems are non-constructive, Meyer and 
Stockmeyer developed the following strategy that makes use of completeness results in 
order to prove lower bounds on particular problems. Intuitively, a completeness 
result says that a problem is a “hardest” problem for some complexity class, and 
Theorems 2.12 and 2.13 can be used to show that certain complexity classes have 
provably hard problems. Consequently, these two pieces imply that the complete 
problem is provably hard. 

As an example, consider the circuit-based large clique problem, $L_{cc}$, which is the 
problem analogous to the circuit-based directed reachability problem, that tests 
whether the (compactly represented) input graph on $N = 2^n$ nodes has a clique 
of size N/2. This problem is complete for the class NEXPTIME $= \bigcup_k \mathrm{NTIME}(2^{n^k})$ 
with respect to polynomial-time reducibility. This can be seen by introducing the 
circuit-based version of satisfiability; a formula is represented by a 
polynomial-size circuit that outputs 1 for input (i, j) when literal $l_i$ is in the jth clause, where 
i and j are given in binary. The generic reduction of Cook’s theorem translates 
exactly to show that circuit-based satisfiability is NEXPTIME-complete, and the 
completeness of $L_{cc}$ follows by using essentially the same reduction used to show 
the NP-completeness of the ordinary clique problem. Now consider a language 
$L \in \mathrm{NTIME}(2^{n^2}) \setminus \mathrm{NTIME}(2^n)$ specified by Theorem 2.13. Since $L \propto_p L_{cc}$, 
$L_{cc} \in \mathrm{NTIME}(2^{n^c})$ implies that $L \in \mathrm{NTIME}(p(n) + 2^{p(n)^c})$, where $p(n) = n^k$ is the 
time bound for the reduction. By choosing c = 1/k, we obtain the following 
theorem. 


Theorem 2.14. There exists a constant c > 0 such that $L_{cc} \notin \mathrm{NTIME}(2^{n^c})$. 
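
The choice c = 1/k can be checked by direct substitution. The display below 
writes out that step, using p(n) for the polynomial time bound of the reduction 
as in the preceding paragraph; it is a sketch of the arithmetic, not a full proof. 

```latex
\[
p(n) = n^k, \quad c = \tfrac{1}{k}
\;\Longrightarrow\;
p(n)^c = n
\;\Longrightarrow\;
p(n) + 2^{p(n)^c} = n^k + 2^n = O(2^n).
\]
```

Hence a nondeterministic algorithm for the circuit-based large clique problem with 
this running time would place L in NTIME($2^n$), contradicting the choice of L. 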


One interpretation of this result is that there exist graphs specified by circuits 
such that proofs that these graphs have a large clique require an exponential 
number of steps (in terms of the size of the circuit). Observe that we would have 
been able to prove a stronger result if we had had a better bound on the length 
of the string produced by the reduction. Lower bounds proved using this strategy 
are often based on completeness with respect to reducibilities that are further 
restricted to produce an output of length bounded by a linear function of the 
input length. 

This strategy has been applied primarily to problems from logic and formal 
language theory. A result of Fischer and Rabin (1974) that contrasts nicely with 
Theorem 1.3 concerns $L_{PA}$, the language of all provable sentences in the theory 
of arithmetic for natural numbers without multiplication, which was shown to be 
decidable by Presburger. 


Theorem 2.15. There is a constant c > 0 such that $L_{PA} \notin \mathrm{NTIME}(2^{2^{cn}})$. 


A representative sample of problems treated in this fashion is surveyed by 
Stockmeyer (1987). 

There are two other important results that separate complexity classes. Hopcroft, 
Paul and Valiant (1977) showed that $\mathrm{DTIME}(T(n)) \subseteq \mathrm{DSPACE}(T(n)/\log T(n))$, 
and thus time is not the same complexity measure as space. Paul, Pippenger, 
Szemerédi and Trotter (1983) showed that $\mathrm{DTIME}(n) \ne \mathrm{NTIME}(n)$. 


2.8. Extensions of NP: Short proofs via randomization 


In the same way that randomized algorithms give rise to an extended notion of 
efficiently computable, randomized proofs give rise to an extension of NP, the 
class of languages for which membership is efficiently provable. Randomized proofs 
give overwhelming statistical evidence. In creating this branch of complexity theory, 
Babai (1985) and Goldwasser, Micali and Rackoff (1989) have given definitions to 
capture related notions of proof based on statistical evidence. The interested reader 
is directed to the surveys by Goldwasser (1989) and Johnson (1988). 

Suppose that King Arthur has two graphs $G_1$ and $G_2$, and his magician 
Merlin wants to convince him that the two graphs are not isomorphic. The graph 
isomorphism problem is not known to be in co-NP, but Goldreich, Micali and 
Wigderson have given the following way for Merlin to convince Arthur that the 
two graphs are not isomorphic. Merlin asks Arthur to choose one of the two 
graphs, randomly relabel the nodes, and show him this random isomorphic copy 
of the chosen graph; Merlin will tell which graph Arthur chose. If the two graphs 
are isomorphic, then Merlin has only a fifty percent chance of choosing the right 
graph, assuming that he cannot read Arthur’s mind. If Merlin can successfully 
repeat this test several times, then Arthur can fairly safely conclude that Merlin can 
distinguish isomorphic copies of the two graphs; in particular, the two graphs are 
not isomorphic. 
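
The protocol is easy to simulate for small graphs. In the sketch below Merlin’s 
unrestricted power is modeled by a brute-force isomorphism test, and all names 
(`relabel`, `isomorphic`, `run_protocol`) as well as the edge-list representation on 
nodes 0, ..., n-1 are assumptions made for the example. 

```python
import random
from itertools import permutations


def relabel(edges, perm):
    """Apply the node relabeling perm to an edge list; return a set of edges."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in edges)


def isomorphic(n, e1, e2):
    """Brute-force isomorphism test over all n! relabelings (Merlin's power)."""
    target = relabel(e2, list(range(n)))
    return any(relabel(e1, p) == target for p in permutations(range(n)))


def run_protocol(n, g1, g2, rounds, rng):
    """Return True if Merlin names Arthur's hidden choice in every round."""
    graphs = (g1, g2)
    for _ in range(rounds):
        choice = rng.randrange(2)        # Arthur's private coin
        perm = list(range(n))
        rng.shuffle(perm)                # Arthur's private random relabeling
        h = relabel(graphs[choice], perm)
        # Merlin names the graph isomorphic to h; if both are, he can only guess.
        candidates = [i for i in (0, 1) if isomorphic(n, graphs[i], h)]
        answer = candidates[0] if len(candidates) == 1 else rng.randrange(2)
        if answer != choice:
            return False
    return True


path = [(0, 1), (1, 2), (2, 3)]   # a path on four nodes
star = [(0, 1), (0, 2), (0, 3)]   # a star on four nodes, not isomorphic to it
print(run_protocol(4, path, star, 20, random.Random(0)))  # True
```

When the two graphs are isomorphic, Merlin’s answer in each round is a fair coin 
flip that is independent of Arthur’s choice, so he survives k rounds with 
probability only $2^{-k}$. 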

The above randomized proof is a prime example of the interactive proof defined 
by Goldwasser, Micali and Rackoff. An interactive protocol consists of two 
Turing machines: the Prover (Merlin) which has unrestricted power and the Verifier 
(Arthur) which is restricted to be a randomized polynomial-time Turing machine. 
The two machines operate in turns, and communicate only between turns via a 
special tape. The Prover is trying to convince the Verifier that a common input x 
is in a language L. A language L has an interactive proof system if there exists an 
interactive protocol such that if $x \in L$, then the Verifier accepts with probability 
at least 2/3; if $x \notin L$, then for any Turing machine used in place of the Prover, the 
Verifier rejects with probability at least 2/3. Let IP denote the class of languages 
that have an interactive proof system with a polynomial number of turns. Note 
that, once again, the choice of probability 2/3 is arbitrary. 

Babai has proposed a similar, but seemingly weaker, version of a randomized 
proof system, an Arthur-Merlin game, in trying to capture a complexity class just 
above NP. It can be defined in the same way as an interactive proof system with 
the assumption that each machine may access the other’s work and randomizing 
tapes. Note the importance of privacy in the above interactive protocol. 
Goldwasser and Sipser proved, however, that interactive proofs and Arthur-Merlin 
games define the same complexity class. 

One can define randomized proof hierarchies in a way analogous to the 
polynomial-time hierarchy. We consider the class of languages accepted by an 
interactive proof system (or an Arthur-Merlin game) with fewer than k 
alternations of turns. Babai introduced AM to denote the class of languages that have 
an Arthur-Merlin game where a single move of Arthur is followed by a single 
move of Merlin. It would be natural to conjecture that these hierarchies are strict. 
However, Babai proved that if a problem has an Arthur-Merlin game with a finite 
number of turns then it is in AM. Since the equivalence between interactive proofs 
and Arthur-Merlin games increases the number of turns only by a constant, the 
same is true for interactive proofs with a finite number of turns. 

On the other hand, it appears that IP, which allows a polynomially bounded 
number of alternations, is a significantly richer class than AM. Lund, Fortnow, 
Karloff and Nisan showed that $PH \subseteq IP$, and Shamir extended their techniques to 
prove the following theorem. 


Theorem 2.16. IP = PSPACE. 


One way to view this result is that it is possible to convince someone of a 
theorem in polynomial time, if it can be proven using a polynomial-sized blackboard. 
An interesting aspect of these results is that they do not relativize, since, for 
example, Fortnow and Sipser have constructed an oracle A for which co-$NP^A$ is not 
contained in $IP^A$. There is evidence that AM is a more restrictive class. Just as 
$BPP \subseteq P/\text{poly}$, one can show that AM is contained in NP/poly, a non-uniform 
extension of NP. Babai has shown that $AM \subseteq \Pi_2^p$ by extending the proof that 
$BPP \subseteq \Sigma_2^p \cap \Pi_2^p$. It appears that AM does not contain co-NP, but to prove this 
would imply that $NP \ne \text{co-}NP$. Nonetheless, the following theorem, due to 
Boppana, Håstad and Zachos, provides some evidence. 


Theorem 2.17. If co-NP $\subseteq$ AM then $\Sigma_2^p = \Pi_2^p = AM$. 


We have seen that the graph non-isomorphism problem is in AM. Therefore, 
Theorem 2.17 implies that if the graph isomorphism problem is NP-complete, 
then $\Sigma_2^p = \Pi_2^p = AM$. 

Goldwasser, Micali and Rackoff introduced the interactive proof system in order 
to characterize the minimum amount of “knowledge” that needs to be transferred 
in a proof. Interactive proofs make it possible to “prove”, for example, that two 
graphs are isomorphic, without giving any further clue about the isomorphism 
between them. These aspects of interactive proofs will be discussed in chapter 40. 

In the years since this survey was written, there have been quite a number of 
very significant developments beyond the results mentioned here. We have already 
mentioned the result that IP = PSPACE; this result was proved just in time to be 
included in the final revised version sent to the publisher, but it has turned out 
that this was more a beginning than a conclusion. Ben-Or, Goldwasser, Kilian, 
and Wigderson introduced an analogous notion of multiprover interactive proofs 
in which the Verifier can be convinced by several independent Provers that cannot 
communicate with each other during the protocol. Fortnow, Rompel, and Sipser gave 
an alternate characterization of this class MIP, which was used by Babai, Fortnow, 
and Lund to prove that MIP = NEXPTIME. Feige, Goldwasser, Lovász, Safra, and 
Szegedy showed, using ideas from the proof that MIP = NEXPTIME, that there is 
a fundamental connection between randomized complexity classes and proving that 
certain optimization problems are hard to solve even approximately. Extensions of 
this result by Arora and Safra and subsequently Arora, Lund, Motwani, Sudan, 
and Szegedy, led to the ultimate result along these lines: NP is exactly the class 
of languages L for which there is the following type of probabilistically checkable 
proof: for any input x, the Verifier is given a certificate of polynomial length of 
which it may query O(1) bits based on O(log |x|) random coin flips; for each $x \in L$, 
there exists a certificate such that the Verifier always accepts; for each $x \notin L$, given 
any certificate the Verifier rejects with probability at least 1/2. For a survey of these 
results, the reader is referred to Johnson (1992). 


3. Living with intractability 


The knowledge that a problem is NP-complete is little consolation for the 
algorithm designer who needs to solve it. Despite their theoretical equivalence, not all 
NP-complete problems are equally hard from a practical perspective. In this 
section, we will examine two approaches to these intractable problems that, while 
not overcoming their inherent difficulty, make them appear more manageable. In 
this process, finer theoretical distinctions will appear between these problems and 
will help to explain the empirical evidence. 


3.1. The complexity of approximate solutions 


Most of the 21 NP-complete problems in Karp’s original paper are decision 
versions of optimization problems; this is also true for a great majority of the problems 
catalogued by Garey and Johnson (1979). Although the combinatorial nature of 
these problems makes it natural to focus on optimal solutions, for most practical 
settings in which these problems arise it is nearly as satisfying to obtain 
solutions that are guaranteed to be nearly optimal. In this subsection we will briefly 
survey the sorts of performance guarantees that can and cannot be obtained for 
particular combinatorial problems. For further discussion of the algorithmic 
techniques used in obtaining near-optimal solutions, the reader is referred to chapter 
28. Throughout this section, OPT(I) will denote the optimal value of an instance 
I of a particular combinatorial optimization problem. 

It is possible that deciding if $\mathrm{OPT}(I) \le k$ is NP-complete, and yet a solution of 
value no more than OPT(I) + 1 can be computed in polynomial time. In fact, this is 
true for the edge coloring problem, where we are given an undirected simple graph 
G and an integer k, and we wish to color the edges with as few colors as possible 
so that no two edges incident to the same node receive the same color. Holyer 
showed that it is NP-complete to decide if a cubic graph can be 3-edge-colored. 
On the other hand, Vizing proved that the minimum number of colors needed is 
at most one more than the maximum degree of G, and his proof immediately yields a 
polynomial-time algorithm that uses at most OPT(I) + 1 colors. Since edge-coloring 
graphs with chromatic index 2 is trivial, this also yields an algorithm that always 
uses no more than $\frac{4}{3} \cdot \mathrm{OPT}(I)$ colors; a polynomial-time algorithm with such an absolute 
performance guarantee is often called a $\frac{4}{3}$-approximation algorithm. On the other 
hand, it is easy to see that this is the best possible performance guarantee, unless 
P = NP. Suppose that there exists a $\rho$-approximation algorithm for edge coloring 
with $\rho < 4/3$. If this algorithm is run on a cubic graph that can be colored with 
three colors, then the algorithm must return a coloring that uses fewer than four 
colors; that is, it returns an optimal coloring. 

One type of strong approximation result is called a fully polynomial
approximation scheme; this is a family of algorithms {A_ε}, where A_ε is a
(1 + ε)-approximation algorithm and the dependence of the running time on ε is
bounded by a polynomial in 1/ε. By solving rounded instances with only a limited
number of significant digits, Ibarra and Kim gave such a scheme for the knapsack
problem: given n pieces to be packed into a knapsack of size B, where piece j has
size s_j and value v_j, pack a subset of pieces of total size ≤ B with maximum
total value. Note that, unless P = NP, it is impossible to improve the dependence
of the running time on ε to a polynomial in log(1/ε), since it is always possible
to choose ε of polynomial length such that |A_ε(I) − OPT(I)| < 1, and thus by
integrality, A_ε(I) = OPT(I). The same argument implies an important result of
Garey and Johnson: if a problem is strongly NP-complete, then there is no fully
polynomial approximation scheme for it unless P = NP. If the running time of A_ε
may depend arbitrarily on 1/ε, it is sometimes possible to obtain such a
polynomial approximation scheme for strongly NP-complete problems. Hochbaum and
Shmoys showed that this is the case for the following machine-scheduling problem:
each of n jobs is to be scheduled on one of m machines, where job j takes time
p_j on any machine, and the aim is to minimize the time by which all jobs are
completed. In fact, the idea of studying the performance guarantees of heuristics
for optimization problems was first proposed in the context of this problem by
Graham (1966), who gave a 2-approximation algorithm.
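
As an illustration of a heuristic with a performance guarantee for the
machine-scheduling problem, the following is a minimal sketch of greedy list
scheduling, the rule usually associated with Graham's analysis: each job, in the
given order, goes to the machine that currently finishes earliest. The function
name and the sample data are assumptions made for this example only.

```python
import heapq

def list_schedule(processing_times, m):
    """Assign each job, in list order, to the currently least-loaded machine.

    Returns (makespan, assignment), where assignment[j] is the machine of job j.
    """
    # Heap of (current load, machine index); the least-loaded machine is on top.
    loads = [(0, i) for i in range(m)]
    heapq.heapify(loads)
    assignment = []
    for p in processing_times:
        load, i = heapq.heappop(loads)
        assignment.append(i)
        heapq.heappush(loads, (load + p, i))
    makespan = max(load for load, _ in loads)
    return makespan, assignment

jobs = [3, 3, 2, 2, 2]
makespan, assignment = list_schedule(jobs, m=2)
# A simple lower bound on OPT: the average load and the longest job.
lower_bound = max(sum(jobs) / 2, max(jobs))
print(makespan, lower_bound)
```

On this instance the greedy schedule finishes at time 7, while the schedule
{3, 3} and {2, 2, 2} finishes at time 6, so the heuristic is not optimal but
stays well within a factor of 2.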

It is sometimes too restrictive to focus on ρ-approximation algorithms. A good
illustration of this is the bin-packing problem: given a collection of n pieces,
where piece j has size s_j, how many bins of size B are needed to pack all of the
pieces? Since it is easy to formulate the subset sum problem as a question of
whether 2 bins suffice, we see that a ρ-approximation algorithm with ρ < 3/2 would
imply that P = NP. However, Johnson showed that a simple heuristic uses at most
(11/9) · OPT(I) + 4 bins. This suggests that an asymptotic performance guarantee
may also be interesting, where we consider the infimum of the absolute performance
guarantee for instances with OPT(I) ≥ k (as k → ∞). It was a great surprise when
Fernandez de la Vega and Lueker not only substantially improved this 11/9 bound,
but gave a polynomial approximation scheme with respect to asymptotic guarantees.
Perhaps even more surprisingly, Karmarkar and Karp extended this to give such a
scheme where the dependence of the running time on 1/ε was bounded by a
polynomial.
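
A simple bin-packing heuristic can be sketched as follows. The sketch assumes
that the heuristic in question is first-fit decreasing (sort the pieces by
decreasing size, and put each piece into the first bin in which it fits); the
function name and the sample instance are illustrative only.

```python
def first_fit_decreasing(sizes, B):
    """Pack pieces into bins of capacity B; returns the list of bins."""
    bins = []   # bins[i] is the list of piece sizes placed in bin i
    free = []   # free[i] is the remaining capacity of bin i
    for s in sorted(sizes, reverse=True):
        if s > B:
            raise ValueError("a piece is larger than the bin size")
        for i in range(len(bins)):
            if s <= free[i]:          # first bin with enough room
                bins[i].append(s)
                free[i] -= s
                break
        else:                          # no existing bin fits: open a new one
            bins.append([s])
            free.append(B - s)
    return bins

pieces = [5, 4, 4, 3, 2, 2]
packing = first_fit_decreasing(pieces, B=10)
print(len(packing), packing)
```

For this instance the heuristic opens three bins, although two suffice
({5, 3, 2} and {4, 4, 2}); the additive constant in an asymptotic guarantee
absorbs exactly this kind of small-instance loss.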

If it is possible to scale up the data to generate an equivalent problem, such as
using processing times M · p_j in the machine-scheduling problem, any distinction
between the absolute and asymptotic performance guarantees disappears. For node
coloring, the following “graph composition” accomplishes this scaling: take M
copies of the graph, and make each pair of nodes in different copies adjacent. The
NP-completeness of 3-colorability again implies that an absolute performance
guarantee better than 4/3 is unlikely. In fact, Garey and Johnson (1976) use a more
intricate composition technique to increase this ratio, and thereby prove that an
asymptotic performance guarantee less than 2 would imply that P = NP.
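
The simple composition described above (M copies, with all pairs of nodes in
different copies joined) can be checked on a tiny example. The following sketch
builds the composed graph and computes chromatic numbers by brute force; the
helper names are hypothetical and the brute-force search is only meant for very
small graphs.

```python
from itertools import combinations, product

def compose(n, edges, M):
    """M copies of a graph on nodes 0..n-1; node (c, v) is copy c of node v.
    Nodes in different copies are always adjacent."""
    nodes = [(c, v) for c in range(M) for v in range(n)]
    new_edges = set()
    for a, b in combinations(nodes, 2):
        in_same_copy = a[0] == b[0]
        original_edge = (a[1], b[1]) in edges or (b[1], a[1]) in edges
        if not in_same_copy or original_edge:
            new_edges.add((a, b))
    return nodes, new_edges

def chromatic_number(nodes, edges):
    """Smallest k admitting a proper k-coloring (exhaustive search)."""
    for k in range(1, len(nodes) + 1):
        for colors in product(range(k), repeat=len(nodes)):
            col = dict(zip(nodes, colors))
            if all(col[a] != col[b] for a, b in edges):
                return k

path_nodes = [0, 1, 2]
path_edges = {(0, 1), (1, 2)}
chi_before = chromatic_number(path_nodes, path_edges)
chi_after = chromatic_number(*compose(3, path_edges, M=2))
print(chi_before, chi_after)
```

Since no color can be shared between copies, the optimum is multiplied by M,
which is what turns an additive error into a multiplicative one.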

For some problems, such as the traveling salesman problem, Gonzalez and Sahni
observed that no constant performance guarantee is possible unless P = NP. In
this example, where the aim is to find a minimum-length hamiltonian circuit in a
complete graph where edge e has length c_e, one can use a ρ-approximation
algorithm to decide the hamiltonian circuit problem for a graph G = (V, E): set
c_e = 1 for e ∈ E and c_e = ρ|V| + 1 otherwise. If G has a hamiltonian circuit,
the optimal tour has length |V| and the algorithm must return a tour of length at
most ρ|V|; otherwise every tour contains an edge of length ρ|V| + 1. However,
Christofides has given a 3/2-approximation algorithm for instances that satisfy
the triangle inequality.
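
The reduction can be made concrete with a small sketch. Here a brute-force
optimal tour stands in for the output of a ρ-approximation algorithm (an exact
solution is trivially within a factor ρ); all function names are assumptions
for this illustration.

```python
from itertools import permutations

def tsp_instance(n, edges, rho):
    """Edge lengths: 1 on edges of G, rho*n + 1 on non-edges."""
    E = {frozenset(e) for e in edges}
    big = rho * n + 1
    return [[0 if u == v else (1 if frozenset((u, v)) in E else big)
             for v in range(n)] for u in range(n)]

def optimal_tour_length(c):
    """Exhaustive search over tours through node 0; tiny instances only."""
    n = len(c)
    best = None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = sum(c[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if best is None or length < best:
            best = length
    return best

def decide_from_tour(n, rho, tour_length):
    """G has a hamiltonian circuit iff the returned tour has length <= rho*n."""
    return tour_length <= rho * n

rho = 2
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]   # has a hamiltonian circuit
star = [(0, 1), (0, 2), (0, 3)]            # has none
len_cycle = optimal_tour_length(tsp_instance(4, cycle, rho))
len_star = optimal_tour_length(tsp_instance(4, star, rho))
print(len_cycle, len_star)
```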

For the great majority of problems, such as node coloring, there is neither a
constant performance guarantee nor any evidence that such an algorithm does not
exist. In the case of the maximum stable set problem, for which the best-known
algorithm has performance guarantee little better than O(n/log n), there is some
evidence that no polynomial approximation scheme exists. Garey and Johnson have
given another graph composition technique to show that if performance guarantee
ρ² is obtained, then this can be used to obtain a ρ-approximation algorithm. By
repeatedly applying this technique, it is possible to convert any ρ-approximation
algorithm, where ρ is a constant, into a polynomial approximation scheme. There
are few other techniques that provide evidence for the intractability of computing
near-optimal solutions. Recently, Papadimitriou and Yannakakis have proposed a
complexity class, along with a notion of completeness, that attempts to
characterize those problems that have a constant performance guarantee, but do not
have a polynomial approximation scheme.

In the years since this survey was written, there have been dramatic advances
in proving that certain problems are also hard to approximate. Feige, Goldwasser,
Lovász, Safra, and Szegedy showed that unless NP ⊆ DTIME(n^{O(log log n)}), there
does not exist a ρ-approximation algorithm for the maximum stable set problem for
any constant ρ > 1. Arora and Safra strengthened this to show that achieving such
an approximation is NP-hard, and this was strengthened even further by Arora,
Lund, Motwani, Sudan, and Szegedy, who proved that there exists an ε > 0 such that
there does not exist an n^ε-approximation algorithm unless P = NP. These
techniques have yielded significant results for other problems as well. Lund and
Yannakakis proved a hardness-of-approximation result for the minimum node coloring
problem identical to the one stated above for the maximum stable set problem.
Arora, Lund, Motwani, Sudan, and Szegedy also showed that, unless P = NP, there
does not exist a polynomial approximation scheme for any complete problem in the
class MAX SNP proposed by Papadimitriou and Yannakakis. For example, a corollary
of this result is that there does not exist a polynomial approximation scheme for
the traveling salesman problem with the triangle inequality, unless P = NP. For a
survey of this research, the reader is referred to Johnson (1992) and Shmoys (1995).


3.2. Probabilistic analysis of algorithms 


One justified criticism of complexity theory is that it focuses on worst-case possi- 
bilities, and this may in fact be unrepresentative of practical reality. In this section, 
we will briefly indicate some of the results that are concerned with the proba- 
bilistic analysis of algorithms, where inputs are selected according to a specified 
probability distribution, and the average behavior is studied. Many related results 
are presented in chapter 6, and the reader is referred there, as well as to the sur- 
veys of Karp, Lenstra, McDiarmid and Rinnooy Kan (1985) and Coffman, Lueker 
and Rinnooy Kan (1988). We shall also sketch the main ideas of an analog of 
NP-completeness, recently proposed by Levin (1986), to provide evidence that a
problem is hard to solve in even a probabilistic sense. 

For all of the problems mentioned in the subsection on the worst-case analysis of
heuristics, it is possible to obtain much more optimistic results for the
average-case analysis under the assumption that the input is drawn from a
specified probability distribution. Unfortunately, these results rely heavily on
the particular distribution selected, and an approach to the average-case analysis
of algorithms that is insensitive to this would be an important advance. As an
example, for the traveling salesman problem with edge lengths that are
independently and identically distributed (i.i.d.) uniformly over the interval
[0,1], Karp has given a heuristic where the expected value of its relative error
is O(n^{-1/2}). On the other hand, the nodes may be selected i.i.d. uniformly in
the unit square [0,1]^2, and the length of an edge is given by the Euclidean
distance between its endpoints. In this case, Karp (1976) has given a different
algorithm that has the stronger property that the relative error converges to 0
almost surely as n → ∞. This result, which stimulated much work in the area of
probabilistic analysis, is based in part on a result of Beardwood, Halton and
Hammersley, which proves that, for instances selected as above, there exists a
constant c such that OPT(I)/√n → c almost surely.

Similar results are also known for such problems as node coloring and the
hamiltonian circuit problem. This work grew out of the theory of random graphs of
Erdős and Rényi, which is treated in chapter 6. A common way to choose a random
graph is to include each possible edge independently with probability 1/2. For
the first problem, it is possible to prove that a simple greedy method is, in
probability, a factor of 2 more than optimal. For the latter, it is possible to
give an algorithm that always gives a correct answer and runs in expected
polynomial time.

Although the probabilistic analysis of algorithms has focused mainly on
NP-complete problems, it has often served as a useful tool to show that the
average-case running time of certain algorithms is much better than the worst-case
running time. The most important example of this is the simplex method for linear
programming, which is a practically efficient algorithm that was shown to have
exponential worst-case running time by Klee and Minty. Borgwardt and Smale
independently showed that variants of the simplex method take polynomial expected
time under certain probabilistic assumptions. For a thorough survey of results in
this area, the reader is referred to Shamir (1987).

Not all problems in NP have been solved efficiently even with probabilistic
assumptions, and only the simplest sorts of distributions have been analyzed. This
raises the specter of intractability: are there distributions for which certain
NP-complete problems remain hard to solve? Levin (1986) has proposed a notion of
completeness in a probabilistic setting. Once again, evidence for hardness is given
by showing that if a particular problem in NP with a specified input distribution
can be solved in expected polynomial time, then every such problem and
distribution pair can also be solved so efficiently. For a more complete
discussion, the reader is encouraged to read the column of Johnson (1984). One of
the motivations for studying such truly intractable problems comes from the area
of cryptography, which attempts to use the intractability of a problem to the
algorithm designer’s advantage (see chapter 40).

A bit of care must be given to formulating the precise notion of polynomial
expected time, so that it is insensitive to both the particular choice of machine
and to the encoding. If μ(x) denotes the probability that a randomly selected
instance of size n is x, and T(x) is the running time on x, then one would
typically define expected polynomial time to require that the sum of μ(x)T(x) over
all instances of size n is O(n^k) for some constant k. Instead, we consider μ to
be the density function over the set I of all instances, and require that
Σ_{x ∈ I} μ(x) T(x)^{1/k} / |x| < ∞ for some constant k. Levin’s notion of random
NP requires that the distribution function M(x) = Σ_{i=1}^{x} μ(i) can be computed
in P, where each instance is viewed as a natural number. This notion does not seem
to be too restrictive, and includes all of the probability distributions discussed
here. It only remains to define the notion of reducibility. As usual, “yes”
instances must be mapped by a polynomial-time function f to “yes” instances, and
analogously for “no” instances, but one must consider the density functions as
well. The pair (L_1, μ_1) reduces to (L_2, μ_2) if in addition we require that
μ_2(x) is at least a polynomial fraction of the total probability of elements that
are mapped to x by f.

Levin showed that all of random NP reduces to a certain random tiling problem.
Instances consist of integers k < N, a set of tile types, each of which is a unit
square labeled with letters in its four corners, and a side-by-side sequence of k
such tiles where consecutive tiles have matching labels in both adjoining corners.
We wish to decide if there is a way of extending the sequence to fill out an N × N
square, where adjacent tiles have matching labels in their corresponding corners,
and the top row starts with the specified sequence of tiles. The instances are
selected by first randomly choosing a value for N, where N is set equal to n with
probability proportional to n^{-2}, n = 1, 2, ...; k is chosen uniformly between 1
and N, the tile types are chosen uniformly, and then the tiles in the sequence are
selected in order uniformly among all possible extensions of the current sequence.
More recently, Venkatesan and Levin have shown that a generalization of the
problem of edge-coloring digraphs (where for certain subgraphs, the number of
edges given each color is specified) is also random NP-complete.


4. Inside P


In this section we shall focus on the complexity of problems in P. After proving
that a problem is in P, the most important next step is undoubtedly to find an
algorithm that is truly efficient, and not merely efficient in this theoretical
sense. However, we will not address that issue, since it is best deferred to the
individual chapters that discuss polynomial-time algorithms for particular
problems. In this section, we shall address questions that relate to
machine-independent complexity classes inside P.

From a practical viewpoint, the most appealing complexity class inside P is the
class of languages for which some polynomial-time algorithms can be sped up
significantly if several processors work simultaneously. We shall discuss parallel
computation, and focus on the complexity class NC, which serves as a theoretical
model of efficient parallel computability, much as P serves as a theoretical model
of efficient “sequential” computation.

We shall also consider the space complexity of problems in P. Recall that the
parallel computation thesis suggests that there is a close relationship between
sequential space complexity and parallel time complexity. We will show another
proven case of this thesis: every problem in ℒ, the class of problems solvable
using logarithmic space, can be solved extremely efficiently in parallel. This is
one source of interest in the complexity class ℒ. Another source is that this
complexity class is the basis for the natural reduction that helps to distinguish
between the problems in P in order to provide a notion of a hardest problem in P.


4.1. Logarithmic space 


The most general restriction on the space complexity of a language L that is known
to imply that L ∈ P is logarithmic space. Observe that ℒ ⊆ NL ⊆ P, where the last
inclusion follows, for example, from Theorem 2.9. The typical use of logarithmic
space is to store a constant number of pointers, e.g., the names of a constant
number of nodes in the input graph, and in some sense, this restriction attempts
to characterize such algorithms. Although ℒ contains many interesting examples,
we see the role of logarithmic-space computation more as a natural means of
reduction rather than providing interesting algorithms. Instead, we will focus on
the nondeterministic and randomized analogs, for which there are languages that
appear to be best characterized in terms of their space complexity.

To define the notion of a logarithmic-space reduction, we introduce a variant of
logarithmic-space computation that can produce output of superlogarithmic size.
We say that a function f can be computed in logarithmic space if there exists a
Turing machine with a read-only input tape and a write-only output tape that, on
input x, halts with f(x) written on its output tape and uses at most logarithmic
space on its work tapes. A problem L_1 reduces in logarithmic space to L_2 if there
exists a function f computable in logarithmic space that maps instances of L_1
into instances of L_2 such that x is a “yes” instance of L_1 if and only if f(x) is
a “yes” instance of L_2. Let L_1 ∝_log L_2 denote such a logarithmic-space
reduction (or log-space reduction, for short). It is fairly easy to show that the
∝_log relation is transitive. A problem L_1 is NL-complete if L_1 ∈ NL and for all
L ∈ NL, L ∝_log L_1. The transitivity of ∝_log yields the following result.


Theorem 4.1. For any NL-complete problem L, L ∈ ℒ if and only if ℒ = NL.


Savitch provided the most natural example of an NL-complete language: the
directed graph reachability problem. The problem is clearly in NL and can be
shown to be NL-complete along the same lines as the PSPACE-completeness of the
circuit-based directed reachability problem.


Theorem 4.2. The directed-graph reachability problem is NL-complete.


In view of the above result, it was very surprising when Aleliunas, Karp, Lipton,
Lovász and Rackoff (1979) showed that the undirected-graph reachability problem,
the analog of the directed-graph reachability problem for undirected graphs, can
be solved by a randomized Turing machine using logarithmic space. The algorithm
attempts to find the required path by following a random walk in the graph,
starting from the node s. It can be shown that a random walk is expected to use
every edge with the same frequency, and if s and t are in the same connected
component then the walk is expected to reach t in at most O(nm) steps, where n and
m denote the numbers of nodes and edges. We define the class RL to be the log-space
analog of RP. A language L is in RL if there exists a randomized Turing machine RM
that works in logarithmic space, such that each input x that RM accepts along any
computation path is in L, and for every x ∈ L, the probability that RM accepts x
is at least 1/2.


Theorem 4.3. Undirected-graph reachability is in RL.
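
The random-walk idea can be sketched as follows. The walk stores only the
current node and a step counter, which is the source of the logarithmic space
bound. The step budget 8nm used in the example is an assumed constant chosen
for illustration, not a value taken from the text, and the function name is
hypothetical.

```python
import random

def random_walk_reachable(adj, s, t, steps, rng=random):
    """One-sided-error test for undirected s-t reachability.

    An answer of True is always correct; an answer of False can be wrong
    (with small probability) when t is in fact reachable from s.
    """
    v = s
    for _ in range(steps):
        if v == t:
            return True
        if not adj[v]:
            return False          # isolated node: the walk cannot move
        v = rng.choice(adj[v])    # move to a uniformly random neighbor
    return v == t

# Two components: the path 0 - 1 - 2 and the edge 3 - 4.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
n, m = 5, 3
rng = random.Random(0)
print(random_walk_reachable(adj, 0, 2, 8 * n * m, rng))
print(random_walk_reachable(adj, 0, 3, 8 * n * m, rng))
```

Repeating the walk independently drives the error probability down
geometrically, without affecting the guarantee that acceptance is never wrong.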


Recently, another piece of evidence has been discovered that suggests that
undirected graph reachability is easier than its directed analog. Nisan,
Szemerédi, and Wigderson proved that the undirected graph reachability problem can
be solved in DSPACE(log^{1.5} n).

One can think of the classes ℒ and NL as lower-level analogs of P and NP.
By studying the relationships of ℒ, NL and co-NL, one hopes to better understand
the relationship of deterministic and nondeterministic computation. It is this
point of view that makes the following theorem of Immerman (1988) and Szelepcsényi
(1987) one of the biggest surprises in recent developments in complexity theory.


Theorem 4.4. NL = co-NL.


The proof uses a definition of nondeterministically computing a function. We
say that a function f(x) can be computed in nondeterministic logarithmic space if
there is a nondeterministic log-space Turing machine that, on input x, outputs the
value f(x) on at least one branch of the computation and on every other branch
either stops without an output or also outputs f(x). If f(x) is a Boolean function,
then we say that the language L defined by f(x) = 1 is decided in nondeterministic
logarithmic space, which is equivalent to L being in NL ∩ co-NL.

We will prove Theorem 4.4 by showing that the NL-complete directed-graph
reachability problem can be decided in nondeterministic logarithmic space. Given
a directed graph G = (V, A), a source node s and an integer k, let f(G,s,k) denote
the number of nodes reachable from the node s along paths of length at most k.


Lemma 4.5. The directed-graph reachability problem is decidable in
nondeterministic logarithmic space if and only if the function f(G,s,k) can be
computed in nondeterministic logarithmic space.


Proof. To prove the only if direction, we use the fact that the directed graph
reachability problem is NL-complete. If it is decidable in nondeterministic
logarithmic space, then so is the problem of recognizing if there is a path of
length at most k. To compute f(G,s,k), we use the assumed nondeterministic machine
for each node v, to decide if v is reachable from s by a path of length at most k,
and count the number of reachable nodes.

To prove the opposite direction, we use the following nondeterministic log-space
computation. First compute f(G,s,n). Then for each node v, (nondeterministically)
try to guess a path from s to v. Count the number of nodes for which a path has
been found. If a path has been found to t, we accept the input. If f(G,s,n) nodes
have been reached without finding a path to t, this proves that t is not reachable
from s, so we reject. Finally, if the number of nodes that have been reached is
less than f(G,s,n), then this is an incorrect branch of the computation, and the
computation stops without producing an output. □


To finish the proof of Theorem 4.4 we have to argue that the function f(G,s,k)
can be computed in nondeterministic logarithmic space. This is done by induction
on k. Given f(G,s,k) we can decide if there exists a path of length k + 1 from s
to a particular node v by checking if there is a path of length at most k to any
of the predecessors of v using a variant of the algorithm given in the if part of
Lemma 4.5. Counting these nodes gives f(G,s,k + 1).
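
The recurrence behind this induction can be illustrated by a deterministic
sketch. Note the assumption: the code below stores the whole set of reached
nodes, so it uses linear rather than logarithmic space; it only shows how
f(G,s,k + 1) is determined by reachability within k steps, and does not
reproduce the nondeterministic log-space machine. The function name is
hypothetical.

```python
def reach_counts(n, arcs, s):
    """Return [f(G,s,0), ..., f(G,s,n-1)], where f(G,s,k) is the number of
    nodes reachable from s along paths of length at most k."""
    preds = {v: [] for v in range(n)}
    for u, v in arcs:
        preds[v].append(u)
    within = {s}                     # nodes within distance 0 of s
    counts = [len(within)]
    for _ in range(1, n):
        # v is within k+1 steps iff it is within k steps already or
        # one of its predecessors is within k steps.
        within = within | {v for v in range(n)
                           if any(u in within for u in preds[v])}
        counts.append(len(within))
    return counts

arcs = [(0, 1), (1, 2), (3, 0)]
counts = reach_counts(4, arcs, s=0)
print(counts)
```

In this example the final count is 3 and the three reachable nodes are 0, 1
and 2, so exhibiting paths to these three nodes certifies that node 3 is not
reachable from node 0. This is the counting argument used in the proof of
Lemma 4.5.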


4.2. The hardest problems in P 


One important application of log-space computation was introduced by Cook,
who used log-space reducibility to introduce a notion of a hardest problem in P.
A problem L_1 is P-complete if L_1 ∈ P and for all L ∈ P, L ∝_log L_1. The
transitivity of the log-space reduction gives the following theorem.


Theorem 4.6. For any P-complete problem L, L ∈ ℒ if and only if ℒ = P.


Later in this section we shall see that P-completeness also provides evidence that
a problem cannot be efficiently solved in parallel. This fact has greatly increased
the interest in P-completeness and a variety of problems have been shown to be
P-complete. Perhaps the most natural example is the circuit value problem, which
was proved P-complete by Ladner. Given a circuit with truth values assigned to
its input gates, the circuit value problem is to compute the output of the circuit.
This problem is clearly in P. It can be proved P-complete by building a circuit
that simulates the computation of a Turing machine.


Theorem 4.7. The circuit value problem is P-complete.
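
Membership of the circuit value problem in P amounts to evaluating the gates
one at a time. The sketch below assumes a hypothetical encoding in which gates
are listed in topological order as (name, operation, operands) and the last
gate is the output; the gate set AND/OR/NOT is likewise an assumption.

```python
def circuit_value(gates, inputs):
    """Evaluate a Boolean circuit given in topological order."""
    value = dict(inputs)                 # known truth values so far
    for name, op, args in gates:
        vals = [value[a] for a in args]  # operands are already evaluated
        if op == "AND":
            value[name] = all(vals)
        elif op == "OR":
            value[name] = any(vals)
        elif op == "NOT":
            value[name] = not vals[0]
        else:
            raise ValueError(f"unknown gate type {op!r}")
    return value[gates[-1][0]]           # the last gate is the output

# (x AND y) OR (NOT z)
gates = [("g1", "AND", ["x", "y"]),
         ("g2", "NOT", ["z"]),
         ("out", "OR", ["g1", "g2"])]
print(circuit_value(gates, {"x": True, "y": False, "z": False}))
```

The evaluation visits the gates strictly one after another, each depending on
earlier results; P-completeness suggests that this sequential dependence
cannot be avoided in general.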


Dobkin, Lipton and Reiss proved that each problem in P log-space reduces to
the linear programming problem, and the celebrated result of Khachiyan showed
that it is in P. Valiant gave a straightforward reduction from a restricted circuit
value problem that uses linear constraints to trace the value computed by the
circuit.

Goldschlager, Shaw and Staples showed that the maximum flow problem, an
important special case of the linear programming problem, is also P-complete. In
this problem, we are given a directed graph G = (V, A) with two specified nodes,
the source s and the sink t, and a non-negative capacity u(a) assigned to each
arc a ∈ A. A feasible flow is a vector f ∈ R^A that satisfies the capacity
constraints, i.e., 0 ≤ f(a) ≤ u(a) for each arc a ∈ A, and the flow-conservation
constraints, i.e., the sum of the flow values on the arcs incident to a node
v ≠ s, t is the same as the sum of the flow values on the arcs incident from v.
The value of a flow is Σ_{a=(s,v) ∈ A} f(a) − Σ_{a=(v,s) ∈ A} f(a). The decision
problem that is proved to be P-complete is that of deciding the parity of the
maximum flow value.

There is a collection of P-complete problems that are related to simple
polynomial-time algorithms. The maximal stable set problem, in which the
objective is to find a maximal (not maximum) stable set in an undirected graph, can
clearly be solved by a simple greedy algorithm. When using this procedure, we
usually select the first available node in each step, and so we find a specific
solution, the lexicographically first one. Cook showed that finding the
lexicographically first maximal stable set is P-complete. This result might be
surprising since this problem is easy to solve in polynomial time. However,
P-completeness also provides evidence that the problem is not solvable efficiently
in parallel. Consequently, this completeness result supports the intuition that
the greedy algorithm is inherently sequential.
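
The greedy procedure in question is short enough to write out. In this sketch
the nodes are assumed to be numbered 0, ..., n − 1 and scanned in that order;
the function name is illustrative.

```python
def lex_first_maximal_stable_set(n, edges):
    """Scan nodes in increasing order; take a node unless a neighbor
    has already been taken."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    chosen = []
    blocked = set()          # neighbors of nodes chosen so far
    for v in range(n):
        if v not in blocked:
            chosen.append(v)
            blocked |= adj[v]
    return chosen

path = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(lex_first_maximal_stable_set(5, path))
```

Whether node v is taken depends on the decisions made for all smaller-numbered
nodes, which is the sequential dependence that the completeness result
formalizes.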


4.3. Parallel computation 


Parallel computation gives us the potential of substantially increasing the size of
the instances for which certain problems are manageable by solving them with a
large number of processors simultaneously. In studying parallel algorithms, we shall
not be concerned with the precise number of parallel processors used, but rather
their order as a function of the input size. We say that a parallel algorithm
using O(p(n)) processors achieves optimal speedup, if it runs in O(t(n)) time and
the best sequential algorithm known for solving the same problem runs in
O(t(n)p(n)) time. Efficient algorithms that reach optimal (or near-optimal)
speedup with a significant number of processors have been found for many of the
basic combinatorial problems. Another aspect of parallel computation is concerned
with the inherent limitations of using many processors to speed up a computation.
To be somewhat realistic, we shall only be interested in algorithms that use a
polynomial number of processors. Consequently, we will focus on the possible
speedup of polynomial-time sequential computation by parallel processing.

First we define a model of parallel computation. Although many such models
have been proposed, one that seems to be the most convenient for designing
algorithms is the parallel random access machine (PRAM). The PRAM is the parallel
analog of the random access machine; it consists of a sequence of random access
machines called processors, each with its own infinite local random access memory,
in addition to an infinite shared random access memory where each memory cell
can store any integer, and the input is stored in the shared memory. Each
processor knows the input size and its identification number, although otherwise
the processors are identical (i.e., they run the same program). Different variants
of the basic PRAM model are distinguished by the manner in which they handle read
and write conflicts. In an EREW PRAM (exclusive-read, exclusive-write PRAM),
for example, it is assumed that each cell of the shared memory is only read from
and written into by at most one processor at a time. At the other extreme, in a
CRCW PRAM (concurrent-read, concurrent-write PRAM), each cell of the memory can be
read from and written into by more than one processor at a time. If different
processors attempt to write different things in the same cell, then the
lowest-numbered processor succeeds.

To illustrate the power of parallel computation, we give parallel algorithms for a 
problem that we have already discussed. Although finding the lexicographically first 
maximal stable set is P-complete, Karp and Wigderson have proved, surprisingly, 
that a maximal stable set can be found efficiently in parallel. Similar, much simpler 
and more efficient randomized algorithms have subsequently been independently 
discovered by Luby and by Alon, Babai and Itai. 

Consider the most natural sequential algorithm for this problem: select
a node v and include it in the stable set, delete v and all of its neighbors from
the graph; repeat this procedure until all nodes have been deleted. Note that this
algorithm requires n iterations for a path of length 2n. A similar approach can still
be used for a parallel algorithm. To make the algorithm fast, one selects a stable
set in each iteration (rather than a single node), where the set is deleted along
with its neighborhood. The following simple way to choose this stable set is due
to Luby. A processor is assigned to each node and each edge of the graph. For
a graph of order n, the processor assigned to node v picks a random integer c(v)
from 1 to n^4. Next, each processor assigned to an edge compares the values at the
two nodes of the edge. The stable set selected consists of those nodes v for which
c(v) is strictly larger than the values assigned to any of its neighbors.

This algorithm clearly finds a maximal stable set, but it is less clear that few
iterations are needed. It can be shown that each iteration is expected to remove a
constant fraction of the edges, and consequently, the expected number of iterations
is O(log n). The algorithm can be implemented on a randomized CRCW PRAM
in O(log n) time (if we assume that a processor can choose a random number of
size O(log n) in one step).
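
The iteration structure can be seen in a sequential simulation. In this sketch
each round performs, one node at a time, the work that the PRAM processors
would do simultaneously; the range 1, ..., n^4 for the random values follows
the description above, while the function name and the round counter are
additions for illustration.

```python
import random

def luby_maximal_stable_set(n, edges, rng=random):
    """Simulate the randomized rounds; returns (stable set, number of rounds)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    stable = set()
    rounds = 0
    while alive:
        rounds += 1
        # Every surviving node draws its value c(v) independently.
        c = {v: rng.randint(1, n ** 4) for v in alive}
        # Select v if c(v) strictly exceeds the value of each surviving neighbor.
        selected = {v for v in alive
                    if all(c[v] > c[u] for u in adj[v] if u in alive)}
        stable |= selected
        # Delete the selected nodes together with their neighborhoods.
        removed = set(selected)
        for v in selected:
            removed |= adj[v]
        alive -= removed
    return stable, rounds

cycle = [(i, (i + 1) % 6) for i in range(6)]
stable, rounds = luby_maximal_stable_set(6, cycle, random.Random(1))
print(sorted(stable), rounds)
```

Two adjacent nodes cannot both be strict local maxima, so the selected set is
stable in every round; a round in which ties prevent any selection is simply
repeated.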

Karp and Wigderson introduced a technique that can be used to convert certain 
randomized algorithms into deterministic ones. The technique can be used if, in 
the analysis of the randomized algorithm, it is not necessary to assume mutual 
independence, but, for example, d-wise independent choices suffice for some con- 
stant d. One can appeal to known constructions to show that such variables can 
be chosen from a sample space of polynomial size. Each iteration can then be run
for each point in the sample space simultaneously, and this ensures that a good 
sample point is used. This method can be used to convert the above randomized 
algorithm into a deterministic one. 

When discussing parallel algorithms we shall assume that all arithmetic
operations are restricted to polynomial-size numbers, and the number of processors
used is polynomially bounded. We define the class NC to consist of all languages
L for which there exists a parallel algorithm that runs in time bounded by a
polynomial in log n. Note that in this definition the distinction between the
different versions of the basic PRAM model is not relevant. If a problem can be
solved by a CRCW PRAM using p(n) processors in O(log^i n) time, then it can be
solved by an EREW PRAM using p(n) processors and O(log^i n log p(n)) time. The
maximal stable set algorithm of Luby discussed earlier uses a randomized version of
the CRCW PRAM. We define the complexity class RNC to be the NC analog of RP.

It is straightforward to see that the Boolean product of two n × n (0,1)-matrices
can be computed in constant time on a CRCW PRAM using O(n^3) processors.
By repeatedly squaring the adjacency matrix of a graph, the directed reachability
problem can be solved in O(log n) time. This is, in some sense, the generic problem
in NL, and more generally, any problem in NL can be solved by a CRCW PRAM in
O(log n) time. As a consequence, a log-space reduction can be simulated efficiently
in parallel, and therefore P-completeness provides evidence that a problem is not
efficiently solvable in parallel.
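
The repeated-squaring idea can be sketched sequentially as follows. Each
squaring below is one Boolean matrix product, i.e., the step that a CRCW PRAM
would carry out in constant time; adding the diagonal (every node reaches
itself) is an implementation choice made here so that squaring doubles the
path length covered. The function name is hypothetical.

```python
def reachability_by_squaring(n, arcs):
    """Return R with R[u][v] True iff v is reachable from u."""
    # Start with paths of length at most 1 (including the empty path).
    R = [[u == v for v in range(n)] for u in range(n)]
    for u, v in arcs:
        R[u][v] = True
    covered = 1                      # R covers paths of length <= covered
    while covered < n - 1:           # simple paths have at most n - 1 arcs
        # Boolean matrix product R * R.
        R = [[any(R[u][w] and R[w][v] for w in range(n))
              for v in range(n)] for u in range(n)]
        covered *= 2
    return R

arcs = [(0, 1), (1, 2), (2, 3), (3, 4)]
R = reachability_by_squaring(5, arcs)
print(R[0][4], R[4][0])
```

The loop performs about log_2 n squarings, which is where the O(log n)
parallel time bound comes from.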


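Theorem 4.8. For any P-complete problem L, $L \in \mathrm{NC}$ if and only if NC = P.

We get the following chain of inclusions:


$$\mathrm{L} \subseteq \mathrm{NL} \subseteq \mathrm{NC} \subseteq \mathrm{P} \subseteq \mathrm{NP} \subseteq \mathrm{PSPACE}.$$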


On the other hand, the computation of a CRCW PRAM that runs in $O(\log^i n)$
time can be simulated by a Turing machine in $O(\log^{i+1} n)$ space. This proves that
NC is contained in $\bigcup_{i \ge 1} \mathrm{DSPACE}(\log^i n)$. By the analog of Theorem 2.12 for space
complexity, this implies that $\mathrm{NC} \ne \mathrm{PSPACE}$.

We have already seen that a simple parallel algorithm for the directed reacha- 
bility problem is based on matrix multiplication, and in fact, many simple parallel 
graph algorithms are based on matrix operations. Csanky has given an NC algorithm
to compute the rank and the determinant of a matrix over the reals in $O(\log^2 n)$
time. As a corollary, we get a parallel algorithm to solve systems of linear
equations. Berkowitz, Chistov and Mulmuley have extended these results to matrices
over arbitrary fields. One of the most beautiful connections between matrix op- 
erations and graph algorithms is that a perfect matching in a graph can be found 
by an efficient randomized parallel algorithm using only a single matrix inversion 
(see chapter 3). 

There has been substantial work over the past several years in finding effi- 
cient parallel algorithms for combinatorial problems. Some of these algorithms 
are mentioned elsewhere in this Handbook. For further results and more details 
the interested reader is referred to the survey of Karp and Ramachandran (1990).


1. Introduction


Polyhedral combinatorics studies combinatorial problems with the help of 
polyhedra. Let us first give a simple, illustrative example. Let $G = (V, E)$ be a
graph, and let $c: E \to \mathbb{R}_+$ be a weight function on the edges of G. Suppose we
want to find a matching M in G with “weight”


$$c(M) = \sum_{e \in M} c(e) \tag{1.1}$$


as large as possible. Thus we want to “solve”

$$\max\{c(M) \mid M \text{ matching in } G\}. \tag{1.2}$$


Denote, for any matching M, the incidence vector of M in $\mathbb{R}^E$ by $\chi^M$, i.e.,
$\chi^M(e) := 1$ if $e \in M$ and $\chi^M(e) := 0$ if $e \notin M$. Considering the weight function
$c: E \to \mathbb{R}$ as a vector in $\mathbb{R}^E$, we can write problem (1.2) as


$$\max\{c^T \chi^M \mid M \text{ matching in } G\}. \tag{1.3}$$


This amounts to maximizing a linear function over a finite set of vectors. Hence
we can equally well maximize over the convex hull of these vectors:


$$\max\{c^T x \mid x \in \operatorname{conv}\{\chi^M \mid M \text{ matching in } G\}\}. \tag{1.4}$$

The set

$$\operatorname{conv}\{\chi^M \mid M \text{ matching in } G\} \tag{1.5}$$


is a polytope in $\mathbb{R}^E$, called the matching polytope of G. It follows that there exist
a matrix A and a vector b such that


$$\operatorname{conv}\{\chi^M \mid M \text{ matching in } G\} = \{x \in \mathbb{R}^E \mid x \ge 0,\ Ax \le b\}. \tag{1.6}$$

Then problem (1.4) is equal to

$$\max\{c^T x \mid x \ge 0,\ Ax \le b\}. \tag{1.7}$$


In this way we have formulated the original combinatorial problem (1.2) as a 
linear programming problem. This enables us to apply linear programming 
methods to study the original problem. 

The problem at this point is, however, how to find the matrix A and the vector 
b. We know that A and b exist, but we must know them in order to apply linear 
programming methods. 

If G is bipartite, it turns out that the matching polytope of G is equal to the set
of all vectors $x \in \mathbb{R}^E$ satisfying


$$\begin{aligned} x(e) &\ge 0, && e \in E, \\ \sum_{e \ni v} x(e) &\le 1, && v \in V. \end{aligned} \tag{1.8}$$


That is, for A we can take the $V \times E$ incidence matrix of G and for b the all-one
vector $\mathbf{1}$ in $\mathbb{R}^V$.

It is not difficult to show that the matching polytope for bipartite graphs is
indeed completely determined by (1.8). First note that the matching polytope is
contained in the polytope defined by (1.8), since $\chi^M$ satisfies (1.8) for each
matching M. To see the converse inclusion, we note that if G is bipartite, then the
matrix A is totally unimodular, i.e., each square submatrix of A has determinant
belonging to $\{0, +1, -1\}$. This may be seen to imply that the vertices of the
polytope determined by (1.8) are integral vectors, i.e., they belong to $\mathbb{Z}^E$. Now
each integral vector satisfying (1.8) must trivially be equal to $\chi^M$ for some
matching M. Hence the polytope determined by (1.8) is equal to the matching
polytope of G.
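
For very small graphs the total unimodularity claim can be checked directly from
the definition. The following brute-force sketch (our own helper names; exponential
in the matrix size, so only meant for toy instances) enumerates all square
submatrices of a vertex-edge incidence matrix and tests their determinants.

```python
from itertools import combinations


def det(m):
    """Integer determinant by cofactor expansion (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def is_totally_unimodular(a):
    """Check that every square submatrix has determinant in {0, +1, -1}."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                if det([[a[i][j] for j in c] for i in r]) not in (-1, 0, 1):
                    return False
    return True


def incidence_matrix(vertices, edges):
    """V x E incidence matrix: entry 1 iff the vertex is an end of the edge."""
    return [[1 if v in e else 0 for e in edges] for v in vertices]


# A bipartite 4-circuit (classes {0, 1} and {2, 3}) and a triangle, as toy inputs.
bipartite = incidence_matrix([0, 1, 2, 3], [(0, 2), (0, 3), (1, 2), (1, 3)])
triangle = incidence_matrix([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
```

The triangle fails the test because its full $3 \times 3$ incidence matrix has
determinant 2, which is consistent with the failure of (1.8) for nonbipartite graphs.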

For each nonbipartite graph, the matching polytope is not completely
determined by (1.8). Indeed, if C is an odd circuit in G, then the vector $x \in \mathbb{R}^E$
defined by $x(e) = \tfrac{1}{2}$ if $e \in C$ and $x(e) = 0$ if $e \notin C$ satisfies (1.8) but does not
belong to the matching polytope.
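
A minimal check of this counterexample on the triangle, assuming our own encoding
of edges as vertex pairs: every point of the matching polytope is a convex
combination of incidence vectors of matchings, so its coordinate sum cannot exceed
the maximum size of a matching, whereas the half-integral vector has coordinate sum 3/2.

```python
from fractions import Fraction
from itertools import combinations


def matchings(edges):
    """Yield every matching (tuple of pairwise disjoint edges) of the edge list."""
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            ends = [v for e in subset for v in e]
            if len(ends) == len(set(ends)):
                yield subset


def satisfies_1_8(x, vertices):
    """Check nonnegativity and the degree constraints of system (1.8)."""
    nonnegative = all(value >= 0 for value in x.values())
    degrees_ok = all(
        sum(value for e, value in x.items() if v in e) <= 1 for v in vertices)
    return nonnegative and degrees_ok


triangle_edges = [(0, 1), (1, 2), (0, 2)]
half = {e: Fraction(1, 2) for e in triangle_edges}
largest_matching = max(len(m) for m in matchings(triangle_edges))
```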

In fact, it is a pioneering theorem in polyhedral combinatorics due to J. 
Edmonds that gives a complete description of the inequalities needed to describe 
the matching polytope for arbitrary graphs. 

When we have formulated the matching problem as LP problem (1.7), we can 
apply LP techniques to study the matching problem. Thus we can find a maximum 
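weighted matching in a bipartite graph, e.g., with the simplex method. Moreover,
the Duality Theorem of Linear Programming gives


$$\begin{aligned} \max\{c(M) \mid M \text{ matching in } G\} &= \max\{c^T x \mid x \ge 0,\ Ax \le \mathbf{1}\} \\ &= \min\{y^T \mathbf{1} \mid y \ge 0,\ y^T A \ge c^T\}. \end{aligned} \tag{1.9}$$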


In the special case of G bipartite and c being the all-one vector in $\mathbb{R}^E$, we can
derive from this the König–Egerváry Theorem. The left-most expression in (1.9)
is equal to the maximum size of a matching. The minimum can be seen to be
attained by an integral vector y, again by the total unimodularity of A. This y is a
{0, 1}-vector in $\mathbb{R}^V$, and is the incidence vector of some subset W of V intersecting
every edge of G. Thus (1.9) implies that the maximum size of a matching is equal
to the minimum size of a set of vertices intersecting all edges of G.
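
The min-max relation can be verified by exhaustive search on a toy instance. The
sketch below uses hypothetical helper names and a small bipartite graph of our own
choosing; it also shows that equality may fail for a nonbipartite graph (the triangle).

```python
from itertools import combinations


def max_matching_size(edges):
    """Largest number of pairwise disjoint edges, by exhaustive search."""
    best = 0
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            ends = [v for e in subset for v in e]
            if len(ends) == len(set(ends)):
                best = max(best, k)
    return best


def min_vertex_cover_size(vertices, edges):
    """Smallest number of vertices meeting every edge, by exhaustive search."""
    for k in range(len(vertices) + 1):
        for cover in combinations(vertices, k):
            if all(u in cover or v in cover for u, v in edges):
                return k


# Bipartite example with colour classes {a1, a2, a3} and {b1, b2, b3}.
bip_vertices = ["a1", "a2", "a3", "b1", "b2", "b3"]
bip_edges = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a3", "b1")]

# Nonbipartite example: a triangle.
tri_vertices = [0, 1, 2]
tri_edges = [(0, 1), (1, 2), (0, 2)]
```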

As an extension, one can derive the Tutte—Berge Formula from the inequality 
system given by Edmonds for arbitrary graphs. 

Bipartite matching forms an easy example in polyhedral combinatorics. We now 
discuss the central idea of polyhedral combinatorics — taking convex hulls — in a 
more general framework. 

Let $\mathcal{F}$ be a collection of subsets of a finite set S, let $c: S \to \mathbb{R}$, and suppose we
are interested in


$$\max\Bigl\{ \sum_{s \in U} c(s) \Bigm| U \in \mathcal{F} \Bigr\}. \tag{1.10}$$


(For example, S is the set of edges of a graph, and $\mathcal{F}$ is the collection of
matchings, in which case (1.10) is the maximum “weight” of a matching.)
Usually, $\mathcal{F}$ is too large to evaluate each set U in $\mathcal{F}$ in order to determine the
maximum (1.10). (For example, the collection of matchings is exponentially large
in the size of the graph.) Now (1.10) is equal to


$$\max\{c^T \chi^U \mid U \in \mathcal{F}\}, \tag{1.11}$$


where $\chi^U$ denotes the incidence vector of U in $\mathbb{R}^S$, i.e., $\chi^U(s) = 1$ if $s \in U$ and 0
otherwise. [Here we identify functions $c: S \to \mathbb{R}$ with vectors in the linear space
$\mathbb{R}^S$, and accordingly we shall sometimes denote c(s) by $c_s$.] Since (1.11) means
maximizing a linear function over a finite set of vectors, we could equally well
maximize over the convex hull of these vectors:


$$\max\{c^T x \mid x \in \operatorname{conv}\{\chi^U \mid U \in \mathcal{F}\}\}. \tag{1.12}$$


Since this convex hull is a polytope, there exist a matrix A and a column vector b 
such that 


$$\operatorname{conv}\{\chi^U \mid U \in \mathcal{F}\} = \{x \in \mathbb{R}^S \mid Ax \le b\}. \tag{1.13}$$

Hence (1.12) is equal to

$$\max\{c^T x \mid Ax \le b\}. \tag{1.14}$$


Thus we have formulated the original combinatorial problem as a linear
programming problem, and we can appeal to linear programming methods to study the
combinatorial problem.

In order to determine the maximum (1.10) algorithmically, we could use LP 
algorithms like the simplex method or the primal—dual method. Sometimes, with 
the ellipsoid method the polynomial-time solvability of (1.10) can be shown. 
Moreover, by the Duality Theorem of Linear Programming, problem (1.14), and 
hence problem (1.10), is equal to 


$$\min\{y^T b \mid y \ge 0,\ y^T A = c^T\}, \tag{1.15}$$


which gives us a min-max equation for the combinatorial maximum. Often this
provides us with a “good characterization” [i.e., problem (1.10) belongs to
$\mathrm{NP} \cap \text{co-NP}$], and it enables us to carry out a “sensitivity analysis” of the
combinatorial problem, etc.

However, in order to apply LP techniques, we should be able to find matrix A 
and vector b satisfying (1.13). This is one of the main theoretical problems in 
polyhedral combinatorics. 

Often, one first “guesses” a system $Ax \le b$, and next, one tries to prove that
$Ax \le b$ forms a complete description of the polytope. Sometimes, as in bipartite
matching, this can be shown with the help of the total unimodularity of A.
However, in general A is not totally unimodular, and one has to try more
complicated techniques to show that $Ax \le b$ completely describes the polytope. In
this survey, we mention the techniques of “total dual integrality”, “blocking
polyhedra”, “anti-blocking polyhedra”, and “cutting planes”.

In several cases, the guessed system $Ax \le b$ turns out not to be a complete
description, but just gives an approximation of the polytope. This can still be
useful, since in that case the linear programming problem $\max\{c^T x \mid Ax \le b\}$ gives
a (hopefully good) upper bound for the combinatorial maximum. Such a bound is
particularly valuable in a so-called branch-and-bound algorithm for the
combinatorial problem.

Historically, applying LP techniques to combinatorial problems came along 
with the introduction of linear programming in the 1940s and 1950s. Dantzig, 
Ford, Fulkerson, Hoffman, Johnson and Kruskal studied problems like the 
transportation, flow, and assignment problems, which can be reduced to linear 
programming (by the total unimodularity of the constraint matrix), and the 
traveling salesman problem, using a rudimentary version of a cutting plane 
technique (extended by Gomory to general integer linear programming). 

The field of polyhedral combinatorics was extended and deepened considerably 
by the work of Edmonds in the 1960s and 1970s. He characterized basic polytopes 
like the matching polytope, the arborescence polytope, and the matroid intersec- 
tion polytope; he introduced (with Giles) the important concept of total dual 
integrality; and he advocated the link between polyhedra, min-max relations, 
good characterizations, and polynomial-time solvability. Fulkerson designed the 
clarifying framework of blocking and anti-blocking polyhedra, enabling the 
deduction of one polyhedral characterization or min—max relation from another. 

In this chapter we describe the basic techniques in polyhedral combinatorics, 
and we derive as illustrations polyhedral characterizations for some concrete 
combinatorial problems. First, in sections 2 and 3, we give some background 
information on polyhedra and linear programming methods. 

For background and related literature we refer to Grötschel et al. (1988),
Grötschel and Padberg (1985), Grünbaum (1967), Lovász (1977, 1979),
Pulleyblank (1983), Schrijver (1983b, 1986), and Stoer and Witzgall (1970).


2. Background information on polyhedra 


For an in-depth survey on polyhedra (focusing on the combinatorial properties) 
we refer the reader to chapter 18. In this section, we give a brief review of
polyhedra, covering those parts of polyhedral theory required for the present
chapter.

A set $P \subseteq \mathbb{R}^n$ is called a polyhedron if there exist a matrix A and a column
vector b such that


$$P = \{x \mid Ax \le b\}. \tag{2.1}$$


If (2.1) holds, we say that $Ax \le b$ determines P. A set $P \subseteq \mathbb{R}^n$ is called a polytope
if there exist $x_1, \ldots, x_t$ in $\mathbb{R}^n$ such that $P = \operatorname{conv}\{x_1, \ldots, x_t\}$. The following
theorem is intuitively clear, but is not trivial to prove, and is usually attributed to
Minkowski (1896), Steinitz (1916), and Weyl (1935).


Finite Basis Theorem for Polytopes 2.2. A set P is a polytope if and only if P is a
bounded polyhedron.


Motzkin, in 1936, extended this to: 


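Decomposition Theorem for Polyhedra 2.3. $P \subseteq \mathbb{R}^n$ is a polyhedron if and only if
there exist $x_1, \ldots, x_m, y_1, \ldots, y_t \in \mathbb{R}^n$ such that


$$P = \{\lambda_1 x_1 + \cdots + \lambda_m x_m + \mu_1 y_1 + \cdots + \mu_t y_t \mid \lambda_1, \ldots, \lambda_m, \mu_1, \ldots, \mu_t \ge 0;\ \lambda_1 + \cdots + \lambda_m = 1\}.$$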


Now let $P = \{x \mid Ax \le b\}$ be a nonempty polyhedron, where A has order $m \times n$.
If $c \in \mathbb{R}^n$ with $c \ne 0$ and $\delta = \max\{c^T x \mid x \in P\}$, then the set $\{x \mid c^T x = \delta\}$ is called a
supporting hyperplane of P. A subset F of P is called a face of P if F = P or if
$F = P \cap H$ for some supporting hyperplane H of P. Clearly, a face of P is a
polyhedron again. It can be shown that for any face F of P there exists a
subsystem $A'x \le b'$ of $Ax \le b$ such that $F = \{x \in P \mid A'x = b'\}$. Hence P has only
finitely many faces. They are ordered by inclusion. Minimal faces are the faces
minimal with respect to inclusion. The following theorem is due to Hoffman and
Kruskal (1956).


Theorem 2.4. A set F is a minimal face of P if and only if $\emptyset \ne F \subseteq P$ and

$$F = \{x \mid A'x = b'\}$$

for some subsystem $A'x \le b'$ of $Ax \le b$.


All minimal faces have the same dimension, viz. $n - \operatorname{rank}(A)$. If this is 0, minimal
faces correspond to vertices: a vertex of P is an element of P which is not a 
convex combination of two other elements of P. Only if rank(A) =n, does P have 
vertices, and then those vertices are exactly the minimal faces. Hence: 


Theorem 2.5. Vector z in P is a vertex of P if and only if $A'z = b'$ for some
subsystem $A'x \le b'$ of $Ax \le b$, with A' nonsingular of order n.


The matrix A' (or subsystem $A'x \le b'$) is sometimes called a basis for z.
Generally, such a basis is not unique. P is called pointed if it has vertices. A 
polytope is always pointed, and is the convex hull of its vertices. 
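
Theorem 2.5 suggests a naive way to list the vertices of a small polyhedron: try
every choice of n constraints, solve the corresponding equations if they are
nonsingular, and keep the solutions that lie in P. The sketch below does this in
exact rational arithmetic; the function names are ours, and the example system is
(1.8) for a triangle with edge order (01, 12, 02).

```python
from fractions import Fraction
from itertools import combinations


def solve(a, b):
    """Solve the square system a z = b exactly; return None if a is singular."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def vertices(a, b):
    """Vertices of {x | a x <= b}: feasible solutions of nonsingular n-row subsystems."""
    n = len(a[0])
    found = set()
    for rows in combinations(range(len(a)), n):
        z = solve([a[i] for i in rows], [b[i] for i in rows])
        if z is None:
            continue
        if all(sum(p * q for p, q in zip(a[i], z)) <= b[i] for i in range(len(a))):
            found.add(tuple(z))
    return found


# System (1.8) for the triangle: three nonnegativity rows, three degree rows.
A_tri = [[-1, 0, 0], [0, -1, 0], [0, 0, -1], [1, 0, 1], [1, 1, 0], [0, 1, 1]]
b_tri = [0, 0, 0, 1, 1, 1]
tri_vertices = vertices(A_tri, b_tri)
```

For this system the enumeration returns the zero vector, the three unit vectors,
and the half-integral vector (1/2, 1/2, 1/2). Several bases may yield the same
vertex, which is why the results are collected in a set.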

Two vertices x and y of P are adjacent if conv{x, y} is a face of P. It can be
shown that if P is a polytope, then two vertices x and y are adjacent if and only if
the vector $\tfrac{1}{2}(x + y)$ is not a convex combination of other vertices of P. Moreover,
one can show:


Theorem 2.6. Vertices z' and z'' of the polyhedron P are adjacent if and only if z'
and z'' have bases $A'x \le b'$ and $A''x \le b''$, respectively, so that they have exactly
$n - 1$ constraints in common.


The polyhedron P gives rise to a graph, whose nodes are the vertices of P, two 
of them being adjacent in the graph if and only if they are adjacent on P. The 
diameter of P is the diameter of this graph. The following conjecture is due to W. 
M. Hirsch (cf. Dantzig 1963). 


Hirsch's Conjecture 2.7. A polytope in $\mathbb{R}^n$ determined by m inequalities has
diameter at most $m - n$.


This conjecture is related to the number of iterations in the simplex method 
(see section 3). See also Klee and Walkup (1967) and Larman (1970). [The 
Hirsch conjecture was proved by Naddef (1989) for polytopes all of whose 
vertices are {0, 1}-vectors.]

A facet of P is an inclusion-wise maximal face F of P with $F \ne P$. A face F of P
is a facet if and only if $\dim(F) = \dim(P) - 1$. An inequality $c^T x \le \delta$ is called a
facet-inducing inequality if $P \subseteq \{x \mid c^T x \le \delta\}$ and $P \cap \{x \mid c^T x = \delta\}$ is a facet of P.

Suppose $Ax \le b$ is an irredundant (or minimal) system determining P, i.e., no
inequality in $Ax \le b$ is implied by the others. Let $A^+ x \le b^+$ be those inequalities
$a^T x \le \beta$ from $Ax \le b$ for which $a^T z < \beta$ for at least one z in P. Then each
inequality in $A^+ x \le b^+$ is a facet-inducing inequality. Moreover, this defines a
one-to-one relation between facets and inequalities in $A^+ x \le b^+$. If P is
full-dimensional, then the irredundant system $Ax \le b$ is unique up to multiplication of
inequalities by positive scalars. The following characterization holds.


Theorem 2.8. If $P = \{x \mid Ax \le b\}$ is full-dimensional, then $Ax \le b$ is irredundant if
and only if for each pair $a_i^T x \le b_i$ and $a_j^T x \le b_j$ of constraints from $Ax \le b$ there is a
vector x' in P satisfying $a_i^T x' = b_i$ and $a_j^T x' < b_j$.

The polyhedron P is called rational if we can take A and b in (2.1) rational-valued
(and hence we can take them integer-valued). P is rational if and only if
the vectors $x_1, \ldots, x_m$ and $y_1, \ldots, y_t$ in Theorem 2.3 can be taken to be
rational. P is called integral if we can take $x_1, \ldots, x_m$ and $y_1, \ldots, y_t$ in Theorem
2.3 integer-valued. Hence P is integral if and only if P is the convex hull of the
integer vectors in P or, equivalently, if and only if every minimal face of P
contains integer vectors.


3. Background information on linear programming 


Linear programming, abbreviated by LP, studies the problem of maximizing or
minimizing a linear function $c^T x$ over a polyhedron P. Examples of such a
problem are:

$$\begin{aligned} &\text{(i)} && \max\{c^T x \mid Ax \le b\}, \\ &\text{(ii)} && \max\{c^T x \mid x \ge 0,\ Ax \le b\}, \\ &\text{(iii)} && \max\{c^T x \mid x \ge 0,\ Ax = b\}, \\ &\text{(iv)} && \min\{c^T x \mid x \ge 0,\ Ax = b\}. \end{aligned} \tag{3.1}$$


It can be shown, for each of the problems (i)–(iv), that if the set involved is a
polyhedron with vertices [for (ii)—(iv) this follows if it is nonempty], and if the 
optimum value is finite, then it is attained by a vertex of the polyhedron. 

Each of the optima (3.1) is equal to the optimum value in some other LP 
problem, called the dual problem. 


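Duality Theorem of Linear Programming 3.2. Let A be an $m \times n$ matrix and let
$b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$. Then


$$\begin{aligned} &\text{(i)} && \max\{c^T x \mid Ax \le b\} = \min\{y^T b \mid y \ge 0,\ y^T A = c^T\}; \\ &\text{(ii)} && \max\{c^T x \mid x \ge 0,\ Ax \le b\} = \min\{y^T b \mid y \ge 0,\ y^T A \ge c^T\}; \\ &\text{(iii)} && \max\{c^T x \mid x \ge 0,\ Ax = b\} = \min\{y^T b \mid y^T A \ge c^T\}; \\ &\text{(iv)} && \min\{c^T x \mid x \ge 0,\ Ax = b\} = \max\{y^T b \mid y^T A \le c^T\}; \end{aligned} \tag{3.3}$$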


provided that these sets are nonempty. 
It is not difficult to derive this from: 


Farkas's Lemma 3.4. Let A be an $m \times n$ matrix and let $b \in \mathbb{R}^m$. Then $Ax = b$ has
a solution $x \ge 0$ if and only if $y^T b \ge 0$ holds for each vector $y \in \mathbb{R}^m$ with $y^T A \ge 0$.


The principle of complementary slackness says: let x and y satisfy $Ax \le b$,
$y \ge 0$, $y^T A = c^T$; then x and y are optimum solutions in Theorem 3.2(i) if and only
if $y_i = 0$ or $a_i^T x = b_i$ for each $i = 1, \ldots, m$ (where $a_i^T x \le b_i$ denotes the ith line in
the system $Ax \le b$). Similar statements hold for Theorem 3.2(ii)–(iv).
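
As a small illustration of Theorem 3.2(i) and complementary slackness, the
following sketch checks a primal-dual pair on a hand-made instance. The data A, b, c
and the candidate solutions are our own toy example, not taken from the text.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def primal_feasible(a, b, x):
    """Check A x <= b."""
    return all(dot(row, x) <= bi for row, bi in zip(a, b))


def dual_feasible(a, c, y):
    """Check y >= 0 and y^T A = c^T."""
    columns = list(zip(*a))
    return all(yi >= 0 for yi in y) and all(
        dot(y, col) == cj for col, cj in zip(columns, c))


def complementary(a, b, x, y):
    """Check that y_i = 0 or a_i^T x = b_i for every row i."""
    return all(yi == 0 or dot(row, x) == bi for row, bi, yi in zip(a, b, y))


# max{x1 + 2 x2 | x1 <= 2, x2 <= 2, x1 + x2 <= 3}
A = [[1, 0], [0, 1], [1, 1]]
b = [2, 2, 3]
c = [1, 2]
x_opt = [1, 2]        # candidate primal optimum
y_opt = [0, 1, 1]     # candidate dual optimum
y_other = [1, 2, 0]   # dual feasible but not optimal
```

Here `x_opt` and `y_opt` are complementary and have equal objective values, while
`y_other` violates complementary slackness and gives a strictly larger dual value.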

We now describe briefly three of the methods for solving LP problems. The first 
two methods, the famous simplex method and the primal—dual method, can be 
considered also, when applied to combinatorial problems, as a guideline to 
deriving a “combinatorial” algorithm from a polyhedral characterization. The 
third method, the ellipsoid method, is more of theoretical value: it is a tool 
sometimes used to derive the polynomial-time solvability of a combinatorial 
problem. 


3.1. The simplex method 


The simplex method, due to Dantzig (1951a), is the method used most often for
linear programming. Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$. Suppose we wish to
solve $\max\{c^T x \mid Ax \le b\}$, where $P := \{x \mid Ax \le b\}$ is a polyhedron
with vertices, i.e., $\operatorname{rank}(A) = n$.

The idea of the simplex method is to make a trip, going from a vertex to a 
better adjacent vertex, until an optimal vertex is reached. By Theorem 2.5, 
vertices can be described by bases, while by Theorem 2.6 adjacency can be 
described by bases differing in exactly one constraint. Thus the process can be 
described by a series 


$$A_0 x \le b_0,\quad A_1 x \le b_1,\quad A_2 x \le b_2,\ \ldots \tag{3.5}$$


of bases, where each $x_k := A_k^{-1} b_k$ is a vertex of P, where $A_{k+1} x \le b_{k+1}$ differs by
one constraint from $A_k x \le b_k$, and where $c^T x_{k+1} \ge c^T x_k$.

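The series can be found as follows. Suppose $A_k x \le b_k$ has been found. If
$c^T A_k^{-1} \ge 0$, then $x_k$ is an optimal solution of $\max\{c^T x \mid Ax \le b\}$, since for each x
satisfying $Ax \le b$ one has $A_k x \le b_k$ and hence
$c^T x = (c^T A_k^{-1}) A_k x \le (c^T A_k^{-1}) b_k = c^T x_k$.

If $c^T A_k^{-1} \not\ge 0$, choose an index i so that $(c^T A_k^{-1})_i < 0$, and let $z := -A_k^{-1} e_i$
(where $e_i$ denotes the ith unit basis vector in $\mathbb{R}^n$). Note that for $\lambda \ge 0$, $x_k + \lambda z$
traverses an edge or ray of P (i.e., face of dimension 1), or it is outside of P for
all $\lambda > 0$. Moreover, $c^T z = -c^T A_k^{-1} e_i > 0$. Now if $Az \le 0$, then $x_k + \lambda z \in P$ for all
$\lambda \ge 0$, whence $\max\{c^T x \mid Ax \le b\} = \infty$. If $Az \not\le 0$, let $\lambda_0$ be the largest $\lambda$ such that
$x_k + \lambda z$ belongs to P, i.e.,


$$\lambda_0 := \min\Bigl\{ \frac{b_j - a_j^T x_k}{a_j^T z} \Bigm| j = 1, \ldots, m;\ a_j^T z > 0 \Bigr\}. \tag{3.6}$$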


Choose an index j attaining this minimum. Replacing the ith inequality in
$A_k x \le b_k$ by inequality $a_j^T x \le b_j$ then gives us the next system $A_{k+1} x \le b_{k+1}$.

Note that $x_{k+1} = x_k + \lambda_0 z$, implying that if $x_{k+1} \ne x_k$ then $c^T x_{k+1} > c^T x_k$.
Clearly, the above process stops if $c^T x_{k+1} > c^T x_k$ for each k (since P has only
finitely many vertices). This is the case if each vertex has exactly one basis (the
nondegenerate case). However, in general it can happen that $x_{k+1} = x_k$ for certain
k. Several “pivot selection rules”, prescribing the choice of i and j above, have
been found which could be proved to yield termination of the simplex method.
None of these rules could be proved to give a polynomial-time method; in fact,
most of them could be shown to require an exponential number of iterations in
the worst case.
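
The iteration just described can be sketched in exact arithmetic as follows. This
is a minimal illustration, not a practical implementation: the function names are
ours, a feasible starting basis is assumed to be given, and the tie-breaking
(smallest constraint index for the leaving and the entering constraint) is one
particular pivot selection rule of the smallest-index type, chosen here as an
assumption.

```python
from fractions import Fraction


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def inverse(mat):
    """Exact inverse of a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(mat)
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(mat)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def simplex(a, b, c, basis):
    """Maximize c^T x over {x | a x <= b} from a feasible basis (n row indices).

    Returns (x, basis) at an optimum, or None if the maximum is unbounded."""
    n = len(c)
    basis = list(basis)
    while True:
        inv = inverse([a[i] for i in basis])                    # A_k^{-1}
        x = [dot(row, [b[i] for i in basis]) for row in inv]    # x_k = A_k^{-1} b_k
        u = [dot(c, [inv[r][col] for r in range(n)]) for col in range(n)]  # c^T A_k^{-1}
        negative = [pos for pos in range(n) if u[pos] < 0]
        if not negative:
            return x, basis                                     # c^T A_k^{-1} >= 0
        pos = min(negative, key=lambda p: basis[p])             # leaving constraint
        z = [-inv[r][pos] for r in range(n)]                    # z = -A_k^{-1} e_i
        ratios = []
        for j in range(len(a)):
            step = dot(a[j], z)
            if step > 0:
                ratios.append(((b[j] - dot(a[j], x)) / step, j))
        if not ratios:
            return None                                         # A z <= 0: unbounded
        _, entering = min(ratios)                               # attains (3.6)
        basis[pos] = entering


# max{x1 + 2 x2 | x >= 0, x1 <= 2, x2 <= 2, x1 + x2 <= 3}, starting at the vertex (0, 0).
A = [[-1, 0], [0, -1], [1, 0], [0, 1], [1, 1]]
b = [0, 0, 2, 2, 3]
c = [1, 2]
result = simplex(A, b, c, basis=[0, 1])
```

On this instance the bases visited are {0, 1}, {2, 1}, {2, 4}, {3, 4}, corresponding
to the vertices (0, 0), (2, 0), (2, 1), (1, 2).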

The number of iterations in the simplex method is related to the diameter of 
the underlying polyhedron P. Suppose P is a polytope. If there is a pivot selection 
rule such that for each $c \in \mathbb{R}^n$ the problem $\max\{c^T x \mid Ax \le b\}$ can be solved
within t iterations of the simplex method (starting with an arbitrary first basis
$A_0 x \le b_0$ corresponding to a vertex), then clearly P has diameter at most t.
However, as Padberg and Rao (1974) showed, the “traveling-salesman poly- 
topes” (see section 10) form a class of polytopes of diameter at most 2, while 
maximizing a linear function over these polytopes is NP-complete. 

A main problem seems to be that we do not have a better criterion for adjacency
than Theorem 2.6. Note that a vertex of P can be adjacent to an exponential
number of vertices (in the sizes of A and b), whereas for any basis A' there are at
most $n(m - n)$ bases differing from A' in exactly one row. In the degenerate case,
there can be several bases corresponding to one and the same vertex. Just this
phenomenon shows up frequently in polytopes occurring in combinatorial
optimization, and one of the main objectives is to find pivoting rules preventing us
from going through many bases corresponding to the same vertex (cf. Cunningham
1979).


3.2. Primal—dual method 


As a generalization of similar methods for network flow and transportation 
problems, Dantzig et al. (1956) designed the “primal—dual method” for LP. The 
general idea is as follows. Starting with a dual feasible solution y, the method 
searches for a primal feasible solution x satisfying the complementary slackness 
condition with respect to y. If such a primal feasible solution is found, x and y 
form a pair of optimal (primal and dual) solutions. If no such primal solution is 
found, the method prescribes a modification of y, after which we start anew. 

The problem now is how to find a primal feasible solution x satisfying the 
complementary slackness condition, and how to modify the dual solution y if no 
such primal solution is found. For general LP problems this problem can be seen
to amount to another LP problem, generally simpler than the original LP 
problem. To solve the simpler problem we could use any LP method, e.g., the 
simplex method. In many combinatorial applications, however, this simpler LP 
problem is a simpler combinatorial optimization problem, for which direct 
methods are available (see Papadimitriou and Steiglitz 1982). Thus, if we can 
describe a combinatorial optimization problem as a linear program, the primal- 
dual method gives us a scheme for reducing one combinatorial problem to an 
easier combinatorial problem. 

We shall now describe the primal-dual method more precisely. Suppose we
wish to solve the LP problem


$$\min\{c^T x \mid x \ge 0,\ Ax = b\}, \tag{3.7}$$


where A is an $m \times n$ matrix, with columns $a_1, \ldots, a_n$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$. The
dual problem is


$$\max\{y^T b \mid y^T A \le c^T\}. \tag{3.8}$$


The primal-dual method consists of repeating the following primal-dual
iteration. Suppose we have a feasible solution $y_0$ for problem (3.8). Let A' be the
submatrix of A consisting of those columns $a_j$ of A for which $y_0^T a_j = c_j$. To find a
feasible primal solution for which the complementary slackness condition holds,
solve the restricted linear program


$$\min\{\lambda \mid x', \lambda \ge 0;\ A'x' + b\lambda = b\} = \max\{y^T b \mid y^T A' \le 0,\ y^T b \le 1\}. \tag{3.9}$$

If the optimum value is 0, let $x'_0, \lambda$ be an optimum solution for the minimum. So
$x'_0 \ge 0$, $A'x'_0 = b$, and $\lambda = 0$. Hence by adding zero-components, we obtain a
vector $x_0 \ge 0$ such that $Ax_0 = b$ and $(x_0)_j = 0$ if $y_0^T a_j < c_j$. By complementary
slackness, it follows that $x_0$ and $y_0$ are optimum solutions for problems (3.7) and
(3.8). If the optimum value in problem (3.9) is positive, it is 1. Let u be an
optimum solution for the maximum. Let $\theta$ be the largest real number satisfying


$$(y_0 + \theta u)^T A \le c^T. \tag{3.10}$$


(Note that $\theta > 0$.) Reset $y_0 := y_0 + \theta u$, and start the iteration anew.

This describes the primal-dual method. It reduces problem (3.7) to (3.9),
which is often an easier problem, consisting only of testing feasibility of: $x' \ge 0$,
$A'x' = b$.

The primal-dual method can equally be considered as a gradient method.
Suppose we wish to solve problem (3.8), and we have a feasible solution $y_0$. This
$y_0$ is not optimal if and only if we can find a vector u such that $u^T b > 0$ and u is a
feasible direction in $y_0$ [i.e., $(y_0 + \theta u)^T A \le c^T$ for some $\theta > 0$]. If we let A' consist
of those columns of A in which $y_0^T A \le c^T$ has equality, then u is a feasible
direction if and only if $u^T A' \le 0$. So u can be found by solving the right-hand side
of problem (3.9).


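Application 3.11 (Maximum flow). Let D = (V, A) be a directed graph, let r,
$s \in V$, and let a “capacity” function $c: A \to \mathbb{Q}_+$ be given. The maximum flow
problem is to find the maximum amount of flow from r to s, subject to c:


$$\begin{aligned} \text{maximize} \quad & \sum_{a \in \delta^+(r)} x(a) - \sum_{a \in \delta^-(r)} x(a) \\ \text{subject to} \quad & \sum_{a \in \delta^+(v)} x(a) - \sum_{a \in \delta^-(v)} x(a) = 0, && v \in V,\ v \ne r, s, \\ & 0 \le x(a) \le c(a), && a \in A. \end{aligned} \tag{3.12}$$


If we have a feasible solution $x_0$, we have to find a feasible direction in $x_0$, i.e., a
function $u: A \to \mathbb{R}$ satisfying


$$\begin{aligned} & \sum_{a \in \delta^+(r)} u(a) - \sum_{a \in \delta^-(r)} u(a) > 0, \\ & \sum_{a \in \delta^+(v)} u(a) - \sum_{a \in \delta^-(v)} u(a) = 0, && v \in V,\ v \ne r, s, \\ & u(a) \ge 0, && a \in A,\ x_0(a) = 0, \\ & u(a) \le 0, && a \in A,\ x_0(a) = c(a). \end{aligned} \tag{3.13}$$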
One easily checks that this problem is equivalent to the problem of finding an
undirected path from r to s in D = (V, A) so that for any arc a in the path,


if $x_0(a) = 0$, then arc a is traversed forward,
if $x_0(a) = c(a)$, then arc a is traversed backward,                    (3.14)
if $0 < x_0(a) < c(a)$, then arc a is traversed forward or backward.


If we have found such a path, we find u as in (3.13) (by taking u(a) = +1 or −1 if
a occurs in the path forward or backward, respectively, and u(a) = 0 if a does not
occur in the path). Taking the highest $\theta$ for which $x_0 + \theta u$ is feasible in problem
(3.12) gives us the next feasible solution. The path is called a flow-augmenting
path, since the new solution has a higher objective value than the old. Iterating
this process we finally get an optimum flow. This is exactly Ford and Fulkerson's
algorithm (1957) for finding a maximum flow, which is therefore an example of a
primal-dual method. [Dinits (1970) and Edmonds and Karp (1972) showed that a
version of this algorithm is a polynomial-time method.]
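
The augmenting-path scheme can be sketched as follows. The representation of the
digraph as a list of (tail, head) pairs and all names are our own assumptions; the
path is found by breadth-first search, which is the shortest-path variant, and we
assume $r \ne s$ and integral capacities.

```python
from collections import deque


def max_flow(arcs, capacity, r, s):
    """Augmenting-path maximum flow; returns (value, list of arc flows)."""
    flow = [0] * len(arcs)
    while True:
        # Search for an undirected r-s path obeying rules (3.14).
        pred = {r: None}  # vertex -> (arc index, +1 forward / -1 backward)
        queue = deque([r])
        while queue and s not in pred:
            v = queue.popleft()
            for idx, (tail, head) in enumerate(arcs):
                if tail == v and head not in pred and flow[idx] < capacity[idx]:
                    pred[head] = (idx, +1)
                    queue.append(head)
                elif head == v and tail not in pred and flow[idx] > 0:
                    pred[tail] = (idx, -1)
                    queue.append(tail)
        if s not in pred:
            break  # no flow-augmenting path: the current flow is maximum
        # Trace the path back from s and collect (arc, direction) pairs.
        path, v = [], s
        while pred[v] is not None:
            idx, direction = pred[v]
            path.append((idx, direction))
            v = arcs[idx][0] if direction == +1 else arcs[idx][1]
        # Largest step theta keeping 0 <= x <= c along the path.
        theta = min(capacity[i] - flow[i] if d == +1 else flow[i] for i, d in path)
        for i, d in path:
            flow[i] += d * theta
    value = sum(flow[i] for i, (t, h) in enumerate(arcs) if t == r) - sum(
        flow[i] for i, (t, h) in enumerate(arcs) if h == r)
    return value, flow


# Toy network on vertices r, a, b, s.
arcs = [("r", "a"), ("r", "b"), ("a", "b"), ("a", "s"), ("b", "s")]
capacity = [3, 2, 1, 2, 3]
flow_value, arc_flow = max_flow(arcs, capacity, "r", "s")
```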


3.3. The ellipsoid method 


The ellipsoid method, developed by Shor (1970a,b, 1977) and Yudin and 
Nemirovskii (1976/1977, 1977) for nonlinear programming, was shown by 
Khachiyan (1979) to solve linear programming in polynomial time. Very roughly 
speaking, it works as follows. 

Suppose we wish to solve the LP problem 


max{c'x|Ax <b}, (3.15) 


where $A \in \mathbb{Q}^{m \times n}$, $b \in \mathbb{Q}^m$, and $c \in \mathbb{Q}^n$. Let us assume that the polyhedron
$P := \{x \mid Ax \le b\}$ is bounded. Then it is not difficult to calculate a number R such
that $P \subseteq \{x \in \mathbb{R}^n \mid \|x\| \le R\}$. We construct a sequence of ellipsoids $E_0, E_1,
E_2, \ldots$, each containing the optimum solutions of problem (3.15). First,
$E_0 := \{x \in \mathbb{R}^n \mid \|x\| \le R\}$. Suppose ellipsoid $E_k$ has been found. Let z be its center.

If $Az \le b$ does not hold, let $a_i^T x \le b_i$ be an inequality in $Ax \le b$ violated by z.
Next let $E_{k+1}$ be the ellipsoid of smallest volume satisfying
$E_{k+1} \supseteq E_k \cap \{x \mid a_i^T x \le a_i^T z\}$. If $Az \le b$ does hold, let $E_{k+1}$ be the ellipsoid of
smallest volume satisfying $E_{k+1} \supseteq E_k \cap \{x \mid c^T x \ge c^T z\}$.

One can prove that these ellipsoids of smallest volume are unique, and that the
parameters determining $E_{k+1}$ can be expressed straightforwardly in those
determining $E_k$ and in $a_i$, respectively c. Moreover, $\operatorname{vol}(E_{k+1}) < e^{-1/(2n+2)} \cdot \operatorname{vol}(E_k)$.
Hence the volumes of the successive ellipsoids decrease exponentially fast. Since
the optimum solutions of problem (3.15) belong to each $E_k$, we may hope that the
centers of the ellipsoids converge to an optimum solution of problem (3.15).
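
A floating-point sketch of this scheme is given below. It is an illustration under
stated assumptions, not the polynomial-time algorithm: it ignores the rounding
issues of rational arithmetic, it assumes $n \ge 2$ and a full-dimensional P, and it
uses the standard central-cut update formulas for an ellipsoid
$\{x \mid (x - z)^T D^{-1} (x - z) \le 1\}$ cut by a halfspace $g^T x \le g^T z$. All
names are ours.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def ellipsoid_maximize(a, b, c, radius, iterations):
    """Approximately maximize c^T x over {x | a x <= b}, assumed inside the ball of given radius.

    Returns the best feasible ellipsoid center seen (or None if none was feasible)."""
    n = len(c)
    z = [0.0] * n
    d = [[radius ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
    best = None
    for _ in range(iterations):
        violated = next((i for i in range(len(a)) if dot(a[i], z) > b[i]), None)
        if violated is None:
            if best is None or dot(c, z) > dot(c, best):
                best = list(z)
            g = [-ci for ci in c]      # keep the half {x | c^T x >= c^T z}
        else:
            g = list(a[violated])      # keep the half {x | a_i^T x <= a_i^T z}
        dg = [dot(row, g) for row in d]
        norm = math.sqrt(dot(g, dg))
        # Center and shape matrix of the smallest ellipsoid containing the kept half.
        z = [zi - dgi / ((n + 1) * norm) for zi, dgi in zip(z, dg)]
        d = [[n * n / (n * n - 1.0)
              * (d[i][j] - 2.0 * dg[i] * dg[j] / ((n + 1) * norm ** 2))
              for j in range(n)] for i in range(n)]
    return best


# max{x1 + 2 x2 | x >= 0, x1 <= 2, x2 <= 2, x1 + x2 <= 3}; P lies in the ball of radius 4.
A = [[-1, 0], [0, -1], [1, 0], [0, 1], [1, 1]]
b = [0, 0, 2, 2, 3]
c = [1, 2]
approx = ellipsoid_maximize(A, b, c, radius=4.0, iterations=200)
```

Only feasible centers are recorded, so the returned point is feasible but in
general only close to an optimum solution.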

To make this description more precise, an important problem to be solved is
that ellipsoids with very small volume can still have a large diameter [so that the
centers of the ellipsoids can remain far from any optimum solution of problem
(3.15)]. Another, technical, problem is that the unique smallest ellipsoid is usually
determined by irrational parameters, so that if we work in rational arithmetic we
must allow approximations of the successive ellipsoids. These problems can be
overcome, and a polynomially bounded running time can be proved.

It was observed by Grötschel et al. (1981), Karp and Papadimitriou (1982) and
Padberg and Rao (1980) that in applying the ellipsoid method, it is not necessary
that the system $Ax \le b$ be explicitly given. It suffices to have a “subroutine” to
decide whether or not a given vector z belongs to the feasible region of problem 
(3.15), and to find a separating hyperplane in case z is not feasible. This is 
especially useful for linear programs coming from combinatorial optimization 
problems, where the number of inequalities can be exponentially large (in the size 
of the underlying data-structure), but can yet be tested in polynomial time. 

This leads to the following result (Grötschel et al. 1981). Suppose we are given,
for each graph G = (V, E), a collection $\mathcal{F}_G$ of subsets of E. For example,


(i) $\mathcal{F}_G$ is the collection of matchings in G;
(ii) $\mathcal{F}_G$ is the collection of spanning trees in G;                    (3.16)
(iii) $\mathcal{F}_G$ is the collection of Hamiltonian circuits in G.

With any class ($\mathcal{F}_G$ | G graph), we can associate the following problem.


Optimization Problem 3.17. Given a graph G = (V, E) and $c \in \mathbb{Q}^E$, find $F \in \mathcal{F}_G$
maximizing $\sum_{e \in F} c_e$.

So if ($\mathcal{F}_G$ | G graph) is as in (i), (ii), and (iii) above, Problem 3.17 amounts to
the problems of finding a maximum weighted matching, a maximum weighted 
spanning tree, and a maximum weighted Hamiltonian circuit (the traveling 
salesman problem), respectively. 

The optimization problem is called solvable in polynomial time, or polynomially 
solvable, if it is solvable by an algorithm whose running time is bounded by a 
polynomial in the input size of Problem 3.17, which is $|V| + |E| + \operatorname{size}(c)$. Here
$\operatorname{size}(c) := \sum_{e \in E} \operatorname{size}(c_e)$, where the size of a rational number p/q is
$\log_2(|p| + 1) + \log_2(|q|)$. So size(c) is about the space needed to specify c in binary notation.
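
As a small illustration, the size of a rational weight vector can be computed as
follows. The sketch assumes the size of p/q (in lowest terms, q positive) is read
as $\log_2(|p| + 1) + \log_2(|q|)$; the function names are ours.

```python
import math
from fractions import Fraction


def size_rational(r):
    """Size of a rational p/q in lowest terms: log2(|p| + 1) + log2(q)."""
    r = Fraction(r)
    return math.log2(abs(r.numerator) + 1) + math.log2(r.denominator)


def size_vector(c):
    """Sum of the sizes of the components of c."""
    return sum(size_rational(value) for value in c)


example_size = size_vector([1, Fraction(3, 4)])
```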

Define also the following problem for any fixed class ($\mathcal{F}_G$ | G graph).


Separation Problem 3.18. Given a graph G = (V, E) and $x \in \mathbb{Q}^E$, determine
whether or not x belongs to $\operatorname{conv}\{\chi^F \mid F \in \mathcal{F}_G\}$, and if not, find a separating
hyperplane.


Theorem 3.19. For any fixed class ($\mathcal{F}_G$ | G graph), the Optimization Problem 3.17
is polynomially solvable if and only if the Separation Problem 3.18 is polynomially
solvable.


The theorem implies that with respect to the question of polynomial-time
solvability, the polyhedral combinatorics approach described in section 1 (i.e.,
studying the convex hull) is, implicitly or explicitly, unavoidable: a combinatorial
optimization problem is polynomially solvable if and only if the corresponding
convex hulls can be described decently, in the sense of the polynomial-time
solvability of the separation problem. This can be stated also in the negative: if a
combinatorial optimization problem is not polynomially solvable (perhaps the
traveling salesman problem), then the corresponding polytopes have no such
decent description.

The ellipsoid method does not give a practical method, so Theorem 3.19 is 
more of theoretical value. In some cases, with Theorem 3.19 the polynomial 
solvability of a combinatorial optimization problem was proved, and that then 
formed a motivation for finding a practical polynomial-time algorithm for the 
problem. 

One drawback of the ellipsoid method is that the number of ellipsoids to be 
evaluated depends on the size of the objective vector c. This does not conflict with 
the definition of polynomial solvability, but is not very attractive in practice. It 
would be preferable for the size of $c$ only to influence the sizes of the numbers
occurring when we perform the algorithm, but not the number of arithmetic
operations to be performed. An algorithm for Optimization Problem 3.17 is called
strongly polynomial if it consists of a number of arithmetic operations, bounded
by a polynomial in $|V| + |E|$, on numbers of size bounded by a polynomial in
$|V| + |E| + \operatorname{size}(c)$. Such an algorithm is obviously polynomial-time.

Interestingly, Frank and Tardos (1985) showed, with the help of the “basis 
reduction method” (Lenstra et al. 1982): 


Theorem 3.20. For any fixed class $(\mathcal{F}_G \mid G \text{ graph})$, if there exists a polynomial-time
algorithm for Optimization Problem 3.17, then there exists a strongly polynomial 
algorithm for it. 


At the moment of writing, it is not yet clear whether this result leads to 
practical algorithms. 

Finally we note that it is not necessary to restrict $\mathcal{F}_G$ to collections of subsets of
the edge set $E$. For instance, similar results hold if we consider collections $\mathcal{F}_G$ of
subsets of the vertex set $V$. Moreover, we can consider classes $(\mathcal{F}_G \mid G \in \mathcal{G})$,
where $\mathcal{G}$ is a subcollection of the set of all graphs. Similarly, we can consider
classes $(\mathcal{F}_D \mid D \text{ directed graph})$, $(\mathcal{F}_H \mid H \text{ hypergraph})$, $(\mathcal{F}_M \mid M \text{ matroid})$, and so
on.

More on the ellipsoid method can be found in Grötschel et al. (1988).

We finally mention the method of Karmarkar (1984) for linear programming; 
this appears to be competitive with the simplex method, but its impact on 
polyhedral combinatorics is not yet clear at the moment of writing. 


4. Total unimodularity 


A matrix is called totally unimodular if each subdeterminant belongs to $\{0, +1, -1\}$.
In particular, each entry of a totally unimodular matrix belongs to $\{0, +1, -1\}$.
The importance of total unimodularity for polyhedral combinatorics comes
from the following theorem (Hoffman and Kruskal 1956).


Theorem 4.1. Let $A$ be a totally unimodular $m \times n$ matrix and let $b \in \mathbb{Z}^m$. Then
the polyhedron $P := \{x \mid Ax \le b\}$ is integral.


Proof. Let $F = \{x \mid A'x = b'\}$ be a minimal face of $P$, where $A'x \le b'$ is a
subsystem of $Ax \le b$. Without loss of generality, $A' = [A_1\ A_2]$, with $A_1$ nonsingular.
Then $A_1^{-1}$ is an integral matrix (as $\det A_1 = \pm 1$), and hence the vector

$x = \begin{pmatrix} A_1^{-1} b' \\ 0 \end{pmatrix}$

is an integral vector in $F$. □


In fact, Hoffman and Kruskal showed that an integral $m \times n$ matrix $A$ is totally
unimodular if and only if for each $b \in \mathbb{Z}^m$, each vertex of the polyhedron
$\{x \in \mathbb{R}^n \mid x \ge 0,\ Ax \le b\}$ is integral.
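For very small matrices the definition can be checked directly. The sketch below is a brute-force test that enumerates all square submatrices; it is exponential and meant only as an illustration, not as the efficient recognition algorithm referred to in the literature. The helper names are ours.

```python
from itertools import combinations


def det(m):
    """Integer determinant by Laplace expansion along the first row (tiny matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def is_totally_unimodular(a):
    """Check that every square submatrix of a has determinant in {0, +1, -1}."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


def incidence_matrix(n, edges):
    """V x E incidence matrix of an undirected graph on vertices 0..n-1."""
    return [[1 if v in e else 0 for e in edges] for v in range(n)]
```

Applied to the incidence matrix of a 4-circuit (bipartite) the test succeeds, while for a triangle it fails, since the full 3 x 3 incidence matrix has determinant 2.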

We mention a strengthening of Theorem 4.1 due to Baum and Trotter (1977). 
A polyhedron $P$ in $\mathbb{R}^n$ is said to have the integer decomposition property if for
each $k \in \mathbb{N}$ and for each integral vector $z$ in $kP$ ($= \{kx \mid x \in P\}$), there exist
integral vectors $x_1, \dots, x_k$ in $P$ so that $z = x_1 + \cdots + x_k$. It is not difficult to see
that each polyhedron with the integer decomposition property is integral. 


Theorem 4.3. Let $A$ be a totally unimodular $m \times n$ matrix and let $b \in \mathbb{Z}^m$. Then
the polyhedron $P := \{x \mid Ax \le b\}$ has the integer decomposition property.


Proof. Let $k \in \mathbb{N}$ and $z \in kP \cap \mathbb{Z}^n$. By induction on $k$ we show that $z = x_1 + \cdots + x_k$
for integral vectors $x_1, \dots, x_k$ in $P$. By Theorem 4.1, there exists an integral
vector, say $x_k$, in the polyhedron $\{x \mid Ax \le b,\ -Ax \le (k-1)b - Az\}$ (since (i) the
constraint matrix $\begin{pmatrix} A \\ -A \end{pmatrix}$ is totally unimodular, (ii) the right-hand-side vector
$\begin{pmatrix} b \\ (k-1)b - Az \end{pmatrix}$ is integral, and (iii) the polyhedron is nonempty, as it contains $k^{-1}z$).
Then $z - x_k \in (k-1)P$, whence by induction $z - x_k = x_1 + \cdots + x_{k-1}$ for integral
vectors $x_1, \dots, x_{k-1}$ in $P$. □


The following theorem collects together several other characterizations of total 
unimodularity. 


Theorem 4.4. Let $A$ be a matrix with entries 0, +1, and -1. Then the following
characterizations are equivalent:

(i) $A$ is totally unimodular, i.e., each square submatrix of $A$ has determinant in
$\{0, +1, -1\}$;

(ii) each collection of columns of $A$ can be split into two parts so that the sum of
the columns in one part, minus the sum of the columns in the other part, is a vector
with entries 0, +1, and -1 only;

(iii) each nonsingular submatrix of $A$ has a row with an odd number of nonzero
components;

(iv) the sum of the entries in any square submatrix of $A$ with even row and
column sums, is divisible by four;

(v) no square submatrix of $A$ has determinant +2 or -2.


Characterization (ii) is due to Ghouila-Houri (1962), (iii) and (iv) to Camion 
(1965), and (v) to R. E. Gomory (cf. Camion 1965). 
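Characterization (ii) lends itself to a direct, if exponential, check on small examples. The following sketch tries every subset of columns and every way of splitting it into two parts; the function name is ours and the code is an illustration only.

```python
from itertools import combinations, product


def satisfies_ghouila_houri(a):
    """Check criterion (ii): every column subset admits a split with signed sum in {0, +1, -1}."""
    rows, cols = len(a), len(a[0])
    for k in range(1, cols + 1):
        for subset in combinations(range(cols), k):
            found = False
            # sign +1 puts a column in the first part, sign -1 in the second part
            for signs in product((1, -1), repeat=k):
                if all(abs(sum(s * a[i][j] for s, j in zip(signs, subset))) <= 1
                       for i in range(rows)):
                    found = True
                    break
            if not found:
                return False
    return True
```

On the incidence matrix of a triangle the three columns together admit no valid split, in agreement with the fact that this matrix is not totally unimodular.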

There are several further characterizations of total unimodularity. By far the 
deepest is due to Seymour (1980) (see chapter 10). For an efficient algorithm to 
test total unimodularity, see Truemper (1982). See also Truemper (1990). 


4.1. Application: bipartite graphs 


It is not difficult to see that the $V \times E$ incidence matrix $A$ of a bipartite graph
$G = (V, E)$ is totally unimodular: any square submatrix $B$ of $A$ either has a
column with at most one 1 (in which case $\det B \in \{0, \pm 1\}$ by induction), or has
two 1's in each column (in which case $\det B = 0$ by the bipartiteness of $G$). In
fact, the incidence matrix of a graph $G$ is totally unimodular if and only if $G$ is
bipartite.

The total unimodularity of the incidence matrix of a bipartite graph has several 
consequences, some of which we will describe now. 


Definition 4.5. The matching polytope of a graph $G = (V, E)$ is the polytope
$\operatorname{conv}\{\chi^M \mid M \text{ matching}\}$ in $\mathbb{R}^E$. Theorem 4.1 directly implies that the matching
polytope of a bipartite graph $G$ is equal to the set of all vectors $x$ in $\mathbb{R}^E$ satisfying

(i) $x_e \ge 0$, $e \in E$,
(ii) $\sum_{e \ni v} x_e \le 1$, $v \in V$    (4.6)

[since the polyhedron determined by (4.6) is integral].


Clearly, the matching polytope of $G = (V, E)$ has dimension $|E|$. Each inequality
in (4.6) is facet-determining, except if $G$ has a vertex of degree at most 1. It is
not difficult to see that the incidence vectors $\chi^M$, $\chi^{M'}$ of two matchings $M$, $M'$ are
adjacent on the matching polytope iff $M \triangle M'$ is a path or circuit, where $\triangle$
denotes symmetric difference. Hence, the matching polytope of $G$ has diameter at
most $\nu(G)$. (This paragraph holds also for nonbipartite graphs.)

The above characterization of the matching polytope for bipartite graphs 
implies that for any bipartite graph G=(V,E) and any “weight” function 
$c: E \to \mathbb{R}_+$:

maximum weight of a matching $= \max\{c^T x \mid x \ge 0,\ Ax \le \mathbf{1}\}$,    (4.7)

where $A$ is the incidence matrix of $G$, $\mathbf{1}$ denotes an all-one column vector, and
where the weight of a set is the sum of the weights of its elements. In particular,

$\nu(G) = \max\{\mathbf{1}^T x \mid x \ge 0,\ Ax \le \mathbf{1}\}$.    (4.8)


Definition 4.9. The node-cover polytope of a graph $G = (V, E)$ is the polytope
$\operatorname{conv}\{\chi^N \mid N \text{ node cover}\}$ in $\mathbb{R}^V$. Again, Theorem 4.1 implies that, if $G$ is
bipartite, the node-cover polytope of $G$ is equal to the set of all vectors $y$ in $\mathbb{R}^V$
satisfying

(i) $0 \le y_v \le 1$, $v \in V$,
(ii) $y_v + y_w \ge 1$, $\{v, w\} \in E$.    (4.10)

It follows that for any weight function $w: V \to \mathbb{R}_+$:

minimum weight of a node cover $= \min\{w^T y \mid y \ge 0,\ y^T A \ge \mathbf{1}^T\}$,    (4.11)

where $A$ again is the $V \times E$ incidence matrix of $G$. In particular,

$\tau(G) = \min\{\mathbf{1}^T y \mid y \ge 0,\ y^T A \ge \mathbf{1}^T\}$.    (4.12)


Now, by linear programming duality, we know that the optima in (4.8) and (4.12) are
equal, i.e., we have König's Matching Theorem: $\nu(G) = \tau(G)$ for bipartite $G$.
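The equality can be observed on small graphs by computing both numbers by brute force. The sketch below is an exhaustive enumeration, not the linear programming argument used above; the function names are ours.

```python
from itertools import combinations


def matching_number(edges):
    """Maximum number of pairwise disjoint edges (exhaustive search)."""
    for k in range(len(edges), 0, -1):
        for m in combinations(edges, k):
            ends = [v for e in m for v in e]
            if len(ends) == len(set(ends)):  # no vertex is used twice
                return k
    return 0


def node_cover_number(vertices, edges):
    """Minimum number of vertices meeting every edge (exhaustive search)."""
    for k in range(len(vertices) + 1):
        for n in combinations(vertices, k):
            cover = set(n)
            if all(u in cover or v in cover for u, v in edges):
                return k
    return len(vertices)
```

For a triangle, which is not bipartite, the two numbers are 1 and 2, so the bipartiteness assumption cannot be dropped.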

By Theorem 4.3, the matching polytope $P$ of $G$ has the integer decomposition
property. This has the following consequence. Let $k := \Delta(G)$ (the maximum
degree of $G$). Then $(1, \dots, 1)^T \in \mathbb{R}^E$ belongs to $kP$, and hence is the sum of $k$
integer vectors in $P$. Each of these vectors being the incidence vector of a
matching, it follows that $E$ can be partitioned into $k$ matchings. So we have
König's Edge-Coloring Theorem: the edge-coloring number $\gamma(G)$ of a bipartite
graph $G$ is equal to its maximum degree.

We briefly mention some more examples of the consequences of Theorems 4.1 
and 4.3 to bipartite graphs. 


Definition 4.13. The perfect matching polytope of a graph $G = (V, E)$ is the
polytope $\operatorname{conv}\{\chi^M \mid M \text{ perfect matching}\}$ in $\mathbb{R}^E$. It is a face of the matching
polytope of $G$. For bipartite graphs, by (4.6), the perfect matching polytope is
determined by

(i) $x_e \ge 0$, $e \in E$,
(ii) $\sum_{e \ni v} x_e = 1$, $v \in V$.    (4.14)


This is equivalent to a theorem of Birkhoff (1946): each doubly stochastic matrix 
is a convex combination of permutation matrices. 
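Birkhoff's theorem has a constructive flavor: one can repeatedly peel off a permutation matrix supported on the positive entries. The following sketch does this with exact rational arithmetic; the greedy peeling scheme and all names are our own illustration and are not taken from the sources cited here.

```python
from fractions import Fraction


def perfect_matching_in_support(x):
    """Find a permutation using only positive entries of x, by augmenting paths."""
    n = len(x)
    match_col = [-1] * n  # match_col[j] = row currently matched to column j

    def augment(i, seen):
        for j in range(n):
            if x[i][j] > 0 and j not in seen:
                seen.add(j)
                if match_col[j] == -1 or augment(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not augment(i, set()):
            return None
    perm = [0] * n
    for j, i in enumerate(match_col):
        perm[i] = j
    return perm  # row i is assigned column perm[i]


def birkhoff_decomposition(x):
    """Return a list of (coefficient, permutation) pairs summing to x."""
    x = [[Fraction(v) for v in row] for row in x]
    n = len(x)
    terms = []
    while any(v > 0 for row in x for v in row):
        perm = perfect_matching_in_support(x)
        if perm is None:
            raise ValueError("input is not doubly stochastic")
        lam = min(x[i][perm[i]] for i in range(n))
        for i in range(n):
            x[i][perm[i]] -= lam  # at least one entry becomes zero
        terms.append((lam, perm))
    return terms
```

Each round makes at least one more entry zero, so the loop ends after at most $n^2$ rounds.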


One easily checks that the incidence vectors $\chi^M$, $\chi^{M'}$ of two perfect matchings
$M$, $M'$ are adjacent on the perfect matching polytope if and only if $M \triangle M'$ is a
circuit (cf. Balinski and Russakoff 1974). So the perfect matching polytope has
diameter at most $\tfrac12 |V|$. The dimension of the perfect matching polytope of a
bipartite graph is equal to $|E'| - |V| + 1$, where $E' := \bigcup M \setminus \bigcap M$, where the
union and intersection both range over all perfect matchings $M$ (see Lovász and
Plummer 1986).


Definition 4.15. The assignment polytope of order $n$ is the perfect matching
polytope of $K_{n,n}$. Equivalently, it is the polytope in $\mathbb{R}^{n \times n}$ of all matrices
$(x_{ij})_{i,j=1}^{n}$ satisfying

(i) $x_{ij} \ge 0$, $i, j = 1, \dots, n$,
(ii) $\sum_{i=1}^{n} x_{ij} = 1$, $j = 1, \dots, n$,    (4.16)
(iii) $\sum_{j=1}^{n} x_{ij} = 1$, $i = 1, \dots, n$.

(Such matrices are called doubly stochastic.)


Balinski and Russakoff (1974) studied assignment polytopes, proving inter alia 
that they have diameter 2 (if $n \ge 4$). See also Balinski (1985), Bertsekas (1981),
Goldfarb (1985), Hung (1983), Padberg and Rao (1974), and Roohy-Laleh 
(1981). 


Definition 4.17. The stable-set polytope of a graph $G = (V, E)$ is the polytope
$\operatorname{conv}\{\chi^C \mid C \text{ stable set}\}$ in $\mathbb{R}^V$. By Theorem 4.1, for bipartite $G$, it is determined
by

(i) $0 \le y_v \le 1$, $v \in V$,
(ii) $y_v + y_w \le 1$, $\{v, w\} \in E$.    (4.18)

So if $A$ is the $V \times E$ incidence matrix of the bipartite graph $G$, and $w: V \to \mathbb{R}_+$
is a "weight" function, then

maximum weight of a stable set $= \max\{w^T y \mid y \ge 0,\ y^T A \le \mathbf{1}^T\}$.    (4.19)

In particular:

$\alpha(G) = \max\{\mathbf{1}^T y \mid y \ge 0,\ y^T A \le \mathbf{1}^T\}$.    (4.20)


Definition 4.21. The edge-cover polytope of a graph $G = (V, E)$ is the polytope
$\operatorname{conv}\{\chi^F \mid F \text{ edge cover}\}$ in $\mathbb{R}^E$. By Theorem 4.1, for bipartite $G$, it is determined
by

(i) $0 \le x_e \le 1$, $e \in E$,
(ii) $\sum_{e \ni v} x_e \ge 1$, $v \in V$,    (4.22)

assuming $G$ has no isolated vertices. Hurkens (1991) characterized adjacency on
the edge-cover polytope, and showed that its diameter is $|E| - \rho(G)$.

From (4.22) it follows that for any "weight" function $w: E \to \mathbb{R}_+$,

minimum weight of an edge cover $= \min\{w^T x \mid x \ge 0,\ Ax \ge \mathbf{1}\}$.    (4.23)

In particular:

$\rho(G) = \min\{\mathbf{1}^T x \mid x \ge 0,\ Ax \ge \mathbf{1}\}$.    (4.24)

By linear programming duality, the optima in (4.20) and (4.24) are equal, and hence we have
König's Covering Theorem: $\alpha(G) = \rho(G)$ for bipartite $G$.

By Theorem 4.3, the edge-cover polytope of a bipartite graph has the integer 
decomposition property, implying a result of Gupta (1967): the maximum number 
of pairwise disjoint edge covers in a bipartite graph is equal to its minimum 
degree.

Let $A$ be the incidence matrix of the bipartite graph $G = (V, E)$, let $w \in \mathbb{Z}^E$,
$b \in \mathbb{Z}^V$, and consider the linear programs in the following duality equations:

(i) $\max\{w^T x \mid x \ge 0,\ Ax \le b\} = \min\{y^T b \mid y \ge 0,\ y^T A \ge w^T\}$,
(ii) $\min\{w^T x \mid x \ge 0,\ Ax \ge b\} = \max\{y^T b \mid y \ge 0,\ y^T A \le w^T\}$.    (4.25)

By Theorem 4.1, these programs have integer optimum solutions. The special
case $b = \mathbf{1}$ is equivalent to the following min-max relations of Egerváry (1931):

(i) the maximum weight of a matching is equal to the minimum value of
$\sum_{v \in V} y_v$, where $y: V \to \mathbb{Z}_+$ such that $y_u + y_v \ge w_e$ for all $e = \{u, v\} \in E$;
(ii) the minimum weight of an edge cover is equal to the maximum value of
$\sum_{v \in V} y_v$, where $y: V \to \mathbb{Z}_+$ such that $y_u + y_v \le w_e$ for all $e = \{u, v\} \in E$.    (4.26)


Definition 4.27. The transportation polytope for $a \in \mathbb{R}^m$, $b \in \mathbb{R}^n$, is the set of all
vectors $(x_{ij} \mid i = 1, \dots, m,\ j = 1, \dots, n)$ in $\mathbb{R}^{m \times n}$ satisfying

(i) $x_{ij} \ge 0$, $i = 1, \dots, m$, $j = 1, \dots, n$,
(ii) $\sum_{j=1}^{n} x_{ij} = a_i$, $i = 1, \dots, m$,    (4.28)
(iii) $\sum_{i=1}^{m} x_{ij} = b_j$, $j = 1, \dots, n$.

It is related to the Hitchcock-Koopmans transportation problem (Hitchcock 1941,
Koopmans 1948). Klee and Witzgall (1968) studied transportation polytopes,
showing that $x$ satisfying (4.28) is a vertex iff $\{\{p_i, q_j\} \mid x_{ij} > 0\}$ contains no
circuits (where $p_1, \dots, p_m, q_1, \dots, q_n$ are vertices). Moreover, the dimension is
$(m-1)(n-1)$ if $a$ and $b$ are positive (if the polytope is nonempty, i.e., if
$\sum_i a_i = \sum_j b_j$). Bolker (1972) and Balinski (1974) showed the Hirsch Conjecture
for some classes of transportation polytopes. Bolker (1972) and Ahrens (1981)
studied the number of vertices of transportation polytopes.

Definition 4.29. Related is the dual transportation polyhedron, which is, for fixed
$c \in \mathbb{R}^{m \times n}$, defined as the set of all vectors $(u; v)$ in $\mathbb{R}^m \times \mathbb{R}^n$ satisfying

$u_1 = 0$, $u_i + v_j \ge c_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n$.    (4.30)

It is not difficult to see that the dimension is $m + n - 1$, and that $(u; v)$ satisfying
(4.30) is a vertex iff $\{\{p_i, q_j\} \mid u_i + v_j = c_{ij}\}$ is a connected graph on vertex set
$\{p_1, \dots, p_m, q_1, \dots, q_n\}$. Balinski (1984) showed that the diameter of (4.30) is
at most $(m-1)(n-1)$, thus proving the Hirsch Conjecture for this class of
polyhedra. Balinski and Russakoff (1984) made a further study of dual transportation
polyhedra, characterizing vertices and higher-dimensional faces by
means of partitions. See also Balinski (1983), Ikura and Nemhauser (1983), and
Zhu (1963).


4.2. Application: directed graphs 


Total unimodularity also implies several results for flows and circulations in 
directed graphs. Let M be the V x A incidence matrix of a digraph D = (V, A). 
Then M is totally unimodular. Again this can be shown by induction: let B be a 
square submatrix of M. If B has a column with at most one nonzero, then 
$\det B \in \{0, \pm 1\}$ by induction. If each column of $B$ contains a $+1$ and a $-1$, then
$\det B = 0$.

There are the following consequences. 


Definition 4.31. Let $D = (V, A)$ be a digraph, let $r, s \in V$, and let $c \in \mathbb{R}^A_+$ be a
"capacity" function. Then the r-s-flow polytope is the set of all vectors $x$ in $\mathbb{R}^A$
satisfying

(i) $0 \le x_a \le c_a$, $a \in A$,
(ii) $\sum_{a \in \delta^-(v)} x_a = \sum_{a \in \delta^+(v)} x_a$, $v \in V$, $v \ne r, s$.    (4.32)

Any vector $x$ satisfying (4.32) is called an r-s-flow (under $c$). By the total
unimodularity of the incidence matrix of $D$, if $c$ is integral, then the r-s-flow
polytope has integral vertices. Hence, if $c$ is integral, the maximum value
($:= \sum_{a \in \delta^+(r)} x_a - \sum_{a \in \delta^-(r)} x_a$) of an r-s-flow under $c$ is attained by an integral
vector (Dantzig 1951b).


4.33 (Max-Flow Min-Cut Theorem). By LP duality, the maximum value of an
r-s-flow under $c$ is equal to the minimum value of $\sum_{a \in A} c_a y_a$, where $y \in \mathbb{R}^A_+$ is
such that there exists a vector $z$ in $\mathbb{R}^V$ satisfying

(i) $y_a - z_v + z_w \ge 0$, $a = (v, w) \in A$,
(ii) $z_r = 1$, $z_s = 0$.    (4.34)

Again, by the total unimodularity of the incidence matrix of $D$, we may take the
minimizing $y$, $z$ to be integral. Let $W := \{v \in V \mid z_v \ge 1\}$. Then for $a = (v, w) \in \delta^+(W)$
we have $y_a \ge z_v - z_w \ge 1$, and hence

$\sum_{a \in A} c_a y_a \ge \sum_{a \in \delta^+(W)} c_a y_a \ge \sum_{a \in \delta^+(W)} c_a$.    (4.35)

So the maximum flow value is not less than the capacity of the cut $\delta^+(W)$. Since it
also cannot be larger, we have Ford and Fulkerson's Max-Flow Min-Cut
Theorem.
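The theorem can be observed computationally with any augmenting-path method. The sketch below uses shortest augmenting paths found by breadth-first search (an Edmonds-Karp style procedure, which is our choice and not the LP-duality argument given here); it returns the flow value together with a set $W$ whose outgoing arcs form a minimum cut. All names are ours, and integer capacities are assumed.

```python
from collections import deque


def max_flow_min_cut(n, arcs, r, s):
    """arcs: list of (v, w, capacity) on vertices 0..n-1, with r != s.

    Returns (value, W, cut) where cut lists the arcs leaving W.
    """
    residual = [[0] * n for _ in range(n)]
    for v, w, c in arcs:
        residual[v][w] += c
    value = 0
    while True:
        # breadth-first search for an augmenting path in the residual graph
        parent = [-1] * n
        parent[r] = r
        queue = deque([r])
        while queue and parent[s] == -1:
            v = queue.popleft()
            for w in range(n):
                if residual[v][w] > 0 and parent[w] == -1:
                    parent[w] = v
                    queue.append(w)
        if parent[s] == -1:
            break  # no augmenting path: the flow is maximum
        eps = min(residual[parent[w]][w] for w in _path(parent, r, s))
        for w in _path(parent, r, s):
            residual[parent[w]][w] -= eps
            residual[w][parent[w]] += eps
        value += eps
    reachable = {v for v in range(n) if parent[v] != -1}
    cut = [(v, w, c) for v, w, c in arcs if v in reachable and w not in reachable]
    return value, reachable, cut


def _path(parent, r, s):
    """Vertices of the r-s path other than r, listed from s backwards."""
    w = s
    while w != r:
        yield w
        w = parent[w]
```

With integer capacities every augmentation is by an integer amount, so the flow found is integral, in line with the integrality statement of Definition 4.31.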


Definition 4.36. Given digraph $D = (V, A)$ and $r, s \in V$, the shortest-path polytope
is the convex hull of all incidence vectors $\chi^P$ of subsets $P$ of $A$, being a disjoint
union of an r-s-path and some directed circuits. By the total unimodularity of the
incidence matrix of $D$, this polytope is equal to the set of all vectors $x \in \mathbb{R}^A$
satisfying

(i) $0 \le x_a \le 1$, $a \in A$,
(ii) $\sum_{a \in \delta^+(v)} x_a = \sum_{a \in \delta^-(v)} x_a$, $v \in V$, $v \ne r, s$,    (4.37)
(iii) $\sum_{a \in \delta^+(r)} x_a - \sum_{a \in \delta^-(r)} x_a = 1$.


So it is the intersection of an r—s-flow polytope with the hyperplane determined 
by (iii). Saigal (1969) showed that the Hirsch Conjecture holds for the class of 
shortest-path polytopes. 


Definition 4.38. For digraph $D = (V, A)$ and $l, u \in \mathbb{R}^A$, the circulation polytope is
the set of all circulations between $l$ and $u$, i.e., vectors $x \in \mathbb{R}^A$ satisfying

(i) $l_a \le x_a \le u_a$, $a \in A$,
(ii) $Mx = 0$,    (4.39)

where $M$ is the incidence matrix of $D$. By the total unimodularity of $M$, if $l$ and $u$
are integral, then the circulation polytope is integral. So if $l$ and $u$ are integral,
and there exists a circulation, there exists an integral circulation. Similarly, a 
minimum-cost circulation can be taken to be integral. 


By Farkas's Lemma, the circulation polytope is nonempty iff there are no
vectors $z, w \in \mathbb{R}^A$, $y \in \mathbb{R}^V$ satisfying

(i) $z, w \ge 0$,
(ii) $z - w + M^T y = 0$,    (4.40)
(iii) $u^T z - l^T w < 0$.

Suppose now $l \le u$, and (4.40) has a solution. Then there is also a solution
satisfying $0 \le y \le \mathbf{1}$, and hence, by the total unimodularity of $M$, there is a
solution $z, w, y$ with $y$ a $\{0, 1\}$-vector. We may assume that $z_a w_a = 0$ for each arc
$a$. Then, for $W := \{v \in V \mid y_v = 1\}$,

$\sum_{a \in \delta^-(W)} u_a - \sum_{a \in \delta^+(W)} l_a = u^T z - l^T w < 0$.    (4.41)

Thus we have Hoffman's Circulation Theorem (Hoffman 1960): there exists a
circulation $x$ satisfying $l \le x \le u$ iff $l \le u$ and there is no subset $W$ of $V$ with
$\sum_{a \in \delta^-(W)} u_a < \sum_{a \in \delta^+(W)} l_a$.


4.42. More generally, for $l, u \in \mathbb{R}^A$ and $b', b'' \in \mathbb{R}^V$, the polyhedron

$\{x \in \mathbb{R}^A \mid l \le x \le u,\ b' \le Mx \le b''\}$    (4.43)

is integral, if $l$, $u$, $b'$ and $b''$ are integral. Moreover, the total unimodularity of $M$
yields a characterization of the nonemptiness of the polyhedron (4.43), extending 
Hoffman’s Circulation Theorem. 


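It is not difficult to see that (4.43) is an affine transformation of the polytope of
vectors $(x'; x''; y'; y'')$ in $\mathbb{R}^A \times \mathbb{R}^A \times \mathbb{R}^V \times \mathbb{R}^V$ satisfying

$x'_a \ge 0$, $x''_a \ge 0$, $a \in A$,
$y'_v \ge 0$, $y''_v \ge 0$, $v \in V$,
$\sum_{a \in \delta^+(v)} x'_a + \sum_{a \in \delta^-(v)} x''_a + y'_v = b''_v + \sum_{a \in \delta^-(v)} u_a - \sum_{a \in \delta^+(v)} l_a$, $v \in V$,    (4.44)
$x'_a + x''_a = u_a - l_a$, $a \in A$,
$y'_v + y''_v = b''_v - b'_v$, $v \in V$

(the transformation is given by $x_a := x'_a + l_a$). Thus (4.43) is transformed into a
face of the transportation polytope (4.27). In this way, several results for (4.43)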
can be derived from results for transportation polytopes. 

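Let $D = (V, A)$ be a directed graph, and let $T \subseteq A$ be a spanning tree in $D$.
Consider the $T \times (A \setminus T)$ matrix $N$ defined, for $a \in T$ and $a' = (v, w) \in A \setminus T$, by:

$N_{a,a'} := \begin{cases} 0 & \text{if } a \text{ does not occur in the } v\text{-}w \text{ path in } T, \\ -1 & \text{if } a \text{ occurs forward in the } v\text{-}w \text{ path in } T, \\ +1 & \text{if } a \text{ occurs backward in the } v\text{-}w \text{ path in } T. \end{cases}$    (4.45)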


Then $N$ is totally unimodular, as can be seen with the help of Ghouila-Houri's
characterization (4.4) (ii). A vector $x = \binom{x'}{x''}$ in $\mathbb{R}^{A \setminus T} \times \mathbb{R}^T$ satisfies $Mx = 0$ (where
$M$ is the incidence matrix of $D$) if and only if $x'' = Nx'$. Thus (4.39) can be
reformulated as

$l_a \le x'_a \le u_a$, $a \in A \setminus T$,
$l_a \le (Nx')_a \le u_a$, $a \in T$.    (4.46)

By the total unimodularity of $N$, the polytope determined by (4.46) has integer
vertices, if all $l_a$ and $u_a$ are integer.

A special case is formed by the $\{0,1\}$-matrices with the consecutive ones
property: in each column, the 1's form an interval (fixing some ordering of the
rows, as usual). This special case arises when $T$ is a directed path, and each arc in
$A \setminus T$ forms a directed circuit with some subpath in $T$.

For related results, see also Hoffman (1960, 1979). 


5. Total dual integrality 


Total dual integrality appears to be a powerful technique in deriving min—max 
relations and the integrality of polyhedra. It is based on the following result, 
shown, implicitly or explicitly, by Gomory (1963), Lehman (1965), Fulkerson 
(1971), Chvátal (1973a), Hoffman (1974) and Lovász (1976) for pointed polyhedra,
and by Edmonds and Giles (1977) for general polyhedra.


Theorem 5.1. A rational polyhedron $P$ is integral if and only if each rational
supporting hyperplane of P contains integral vectors. 


Proof. Since the intersection of a supporting hyperplane with P is a face of P, 
necessity of the condition is trivial. To prove sufficiency, suppose that each 
rational supporting hyperplane of P contains integral vectors. Let P = {x | Ax < 
b}, with A and b integral. Let F = {x| A’x = b’} be a minimal face of P, where 
A'x <b’ is a subsystem of Ax <b. If F does not contain any integral vector, there 
exists a vector $y$ such that $c^T := y^T A'$ is an integral vector, while $\delta := y^T b'$ is not
an integer (this follows, e.g., from Hermite's Normal Form Theorem). We may
suppose that all entries in $y$ are nonnegative (we may replace each entry $y_i$ of $y$ by
$y_i - \lfloor y_i \rfloor$). Now $H := \{x \mid c^T x = \delta\}$ is a supporting hyperplane of $P$, not containing
any integral vector. □


Note that the special case where $P$ is pointed can be shown without appealing
to Hermite's Theorem: if $x^*$ is a nonintegral vertex of $P$, w.l.o.g. $x_1^* \notin \mathbb{Z}$. There
exist supporting hyperplanes $H = \{x \mid c^T x = c^T x^*\}$ and $\tilde H = \{x \mid \tilde c^T x = \tilde c^T x^*\}$ touching
$P$ in $x^*$ such that $c$ and $\tilde c$ are integral and such that $c^T - \tilde c^T = (1, 0, \dots, 0)$. If
both $H$ and $\tilde H$ contain integral vectors, we know $c^T x^* \in \mathbb{Z}$ and $\tilde c^T x^* \in \mathbb{Z}$.
However, $(c - \tilde c)^T x^* = x_1^* \notin \mathbb{Z}$.

Theorem 5.1 can be applied as follows. Consider the LP problem


max{c'x| Ax <b}, (5.2) 


for rational matrix A and rational vectors b,c. 


Corollary 5.3. The following are equivalent:

(i) the maximum value in (5.2) is an integer for each integral vector $c$ for which
the maximum is finite;

(ii) the maximum (5.2) is attained by an integral optimum solution for each
rational vector $c$ for which the maximum is finite;

(iii) the polyhedron $\{x \mid Ax \le b\}$ is integral.

Now consider the LP-duality equation

$\max\{c^T x \mid Ax \le b\} = \min\{y^T b \mid y \ge 0,\ y^T A = c^T\}$.    (5.4)


Clearly, we may derive that the maximum value is an integer if we know that the
minimum has an integral optimum solution and $b$ is integral. This motivated
Edmonds and Giles (1977) to define a system $Ax \le b$ of linear inequalities to be
totally dual integral (TDI) if for each integral vector $c$, the minimum in (5.4) is
attained by an integral optimum solution. Then we have the following consequence.


Corollary 5.5. Let $Ax \le b$ be a system of linear inequalities, with $A$ rational and $b$
integral. If Ax <b is TDI (i.e., the minimum in (5.4) is attained by an integral 
optimum solution y, for each integral vector c for which the minimum is finite), 
then {x | Ax <b} is integral (i.e., the maximum in (5.4) is attained by an integral 
optimum solution x, for each c for which the maximum is finite). 


Note that the notion of total dual integrality is not symmetric in objective
function $c$ and right-hand-side vector $b$. Indeed, the implication in Corollary 5.5
cannot be reversed: the system $x_1 \ge 0$, $x_1 + 2x_2 \ge 0$ determines an integral
polyhedron in $\mathbb{R}^2$, while it is not TDI. However, Giles and Pulleyblank (1979)
showed that if $P$ is an integral polyhedron, then $P = \{x \mid Ax \le b\}$ for some
TDI-system $Ax \le b$ with $b$ integral. In Schrijver (1981) it is shown that if $P$ is
moreover full-dimensional, then there is a unique minimal TDI-system determining
$P$ with $A$ and $b$ integral (minimal under deleting inequalities).
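To see concretely why this two-inequality system is not TDI, one can write it in the form $Ax \le b$ of (5.4) and exhibit one integral objective whose dual optimum is fractional. The objective $c = (-1, -1)$ used below is our own choice for illustration.

```latex
% The system x_1 >= 0, x_1 + 2 x_2 >= 0 written as Ax <= b:
\[
A = \begin{pmatrix} -1 & 0 \\ -1 & -2 \end{pmatrix}, \qquad
b = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
% Dual constraints y >= 0, y^T A = c^T for the integral objective c = (-1, -1):
\[
-y_1 - y_2 = -1, \qquad -2 y_2 = -1
\quad\Longrightarrow\quad
y = \left(\tfrac12, \tfrac12\right).
\]
% The primal maximum is finite: -x_1 - x_2 = -(x_1 + 2 x_2)/2 - x_1/2 <= 0,
% with equality at x = 0.
```

So the dual has a single feasible point, which is not integral, although the primal maximum is finite. The polyhedron itself is a pointed cone whose only minimal face is the origin, hence it is integral.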

Related to total dual integrality is the notion of Hilbert basis: this is a
collection $\{a_1, \dots, a_m\}$ of vectors with the property that if an integer vector $x$ is a
nonnegative linear combination of the vectors $a_1, \dots, a_m$, then it is an integer
nonnegative linear combination of them.

The relation to total dual integrality is as follows. Let $Ax \le b$ be a system of
linear inequalities, and set $P := \{x \mid Ax \le b\}$. If $a^T x \le \beta$ is an inequality from
$Ax \le b$ and $F$ is a face of $P$, we say $a$ is tight in $F$ if $a^T x = \beta$ for all $x$ in $F$. Now
$Ax \le b$ is TDI if and only if for each face $F$ of $P$, the rows of $A$ that are tight in $F$
form a Hilbert basis.

It was shown by Cook et al. (1986a) that if $\{a_1, \dots, a_m\}$ is a Hilbert basis
consisting of integer vectors in $\mathbb{R}^n$, then any integer vector $x$ that is a nonnegative
linear combination of $a_1, \dots, a_m$ is in fact an integer nonnegative linear
combination of at most $2n - 1$ of these vectors.

As a consequence one has that if $Ax \le b$ is TDI (in $n$ variables, say), and $A$ is
integral, then for any $c \in \mathbb{Z}^n$, $\min\{y^T b \mid y \ge 0,\ y^T A = c^T\}$ is attained by an integer
vector $y$ with at most $2n - 1$ nonzero components (if the minimum is finite).

For more on total dual integrality, see Cook (1983, 1986), Edmonds and Giles 
(1984), and Cook et al. (1984). 


We now consider some combinatorial applications of total dual integrality.


Application 5.6 (Arborescences). Let $D = (V, A)$ be a directed graph, and let $r$ be
a fixed vertex of $D$. An r-arborescence is a set $A'$ of $|V| - 1$ arcs forming a
spanning tree such that each vertex $v \ne r$ is entered by exactly one arc in $A'$. So
for each vertex $v$ there is a unique directed path in $A'$ from $r$ to $v$. An r-cut is an
arc set of the form $\delta^-(U)$, for some nonempty subset $U$ of $V \setminus \{r\}$. As usual
$\delta^-(U)$ denotes the set of arcs entering $U$.


It is not difficult to see that r-arborescences are the inclusion-wise minimal sets
of arcs intersecting all r-cuts. Conversely, the inclusion-wise minimal r-cuts are the
inclusion-wise minimal sets of arcs intersecting all r-arborescences.

Fulkerson (1974) showed: 


Fulkerson's Optimum Arborescence Theorem 5.7. For any "length" function
$l: A \to \mathbb{Z}_+$, the minimum length of an r-arborescence is equal to the maximum
number $t$ of r-cuts $C_1, \dots, C_t$ (repetition allowed) so that no arc $a$ is in more than
$l(a)$ of these cuts.


This result can be formulated in polyhedral terms as follows. Let $C$ be the
matrix whose rows are the incidence vectors of all r-cuts. So the columns of $C$ are
indexed by $A$, and the rows by the collection $\mathcal{H} := \{U \mid \emptyset \ne U \subseteq V \setminus \{r\}\}$. Then
Theorem 5.7 is equivalent to both optima in the LP-duality equation

$\min\{l^T x \mid x \ge 0,\ Cx \ge \mathbf{1}\} = \max\{y^T \mathbf{1} \mid y \ge 0,\ y^T C \le l^T\}$    (5.8)

having integral optimum solutions, for each $l \in \mathbb{Z}^A_+$. So in order to show the
theorem, by Corollary 5.5 it suffices to show that the maximum in (5.8) has an integral
optimum solution, for each $l: A \to \mathbb{Z}$, i.e., that the system $x \ge 0$, $Cx \ge \mathbf{1}$ is TDI.
This can be proved as follows (Edmonds and Giles 1977).


Proof of Theorem 5.7. Note that the matrix C is generally not totally unimodular. 
However, in order to prove that the maximum (5.8) has an integer optimum 
solution, it suffices to show that there exists a “basis” that is totally unimodular 
and that attains the maximum. That is, it is enough to find a totally unimodular 
submatrix C' of C (consisting of rows of C) such that 


$\max\{y^T \mathbf{1} \mid y \ge 0,\ y^T C \le l^T\} = \max\{z^T \mathbf{1} \mid z \ge 0,\ z^T C' \le l^T\}$.    (5.9)


Since the second maximum is attained by an integer optimum solution z (by the 
total unimodularity of C'), extending z by 0’s in the appropriate positions gives 
an integer optimum solution y for the first maximum. 

How can we find such a $C'$? The key observation is the following. Call a
subcollection $\mathcal{F}$ of $\mathcal{H}$ laminar if for all $T, U \in \mathcal{F}$ one has $T \subseteq U$ or $U \subseteq T$ or
$T \cap U = \emptyset$. Then, if $C'$ is the matrix consisting of the rows of $C$ with index in
some laminar family $\mathcal{F}$, $C'$ is totally unimodular.

This last fact can be derived with Ghouila-Houri's characterization (4.4) (ii).
Choose a set of rows of $C'$, i.e., choose a subcollection $\mathcal{G}$ of $\mathcal{F}$. Define, for each $U$
in $\mathcal{G}$, the "height" $h(U)$ of $U$ as the number of sets $T$ in $\mathcal{G}$ with $T \supseteq U$. Now split
$\mathcal{G}$ into $\mathcal{G}_{\text{odd}}$ and $\mathcal{G}_{\text{even}}$, according as $h(U)$ is odd or even. One easily derives from
the laminarity of $\mathcal{G}$ that for any arc $a$ of $D$, the number of sets in $\mathcal{G}_{\text{odd}}$ entered by
$a$, and the number of sets in $\mathcal{G}_{\text{even}}$ entered by $a$, differ by at most 1. Therefore, we
can split the rows corresponding to $\mathcal{G}$ into two classes fulfilling Ghouila-Houri's
criterion. So $C'$ is totally unimodular.
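The height argument is easy to try out on a concrete family. The sketch below tests laminarity and counts, for a given arc, the sets of odd and of even height that it enters; the function names and the small example are ours.

```python
def is_laminar(family):
    """True if any two sets of the family are nested or disjoint."""
    sets = [frozenset(u) for u in family]
    return all(t <= u or u <= t or not (t & u)
               for i, t in enumerate(sets) for u in sets[i + 1:])


def heights(family):
    """h(U) = number of sets T in the family with T containing U (U itself included)."""
    sets = [frozenset(u) for u in family]
    return {u: sum(1 for t in sets if t >= u) for u in sets}


def entered_counts(family, arc):
    """Numbers of odd-height and even-height sets entered by arc (v, w)."""
    v, w = arc
    h = heights(family)
    entered = [u for u in h if w in u and v not in u]
    odd = sum(1 for u in entered if h[u] % 2 == 1)
    even = sum(1 for u in entered if h[u] % 2 == 0)
    return odd, even
```

In a laminar family the sets containing the head of an arc form a chain with consecutive heights, and the sets entered by the arc are the smallest members of that chain, which is why the two counts differ by at most 1.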

So it suffices to find a laminar subcollection $\mathcal{F}$ of $\mathcal{H}$ so that the corresponding
matrix $C'$ satisfies (5.9). This can be done as follows. We may assume that all
components of $l$ are nonnegative. (If some component is negative, the maximum
in (5.8) is infinite.) Choose a vector $y$ that attains the maximum in (5.8), and for
which

$\sum_{U \in \mathcal{H}} y_U \cdot |U| \cdot |V \setminus U|$    (5.10)

is as small as possible. Such a vector $y$ exists by compactness arguments.
Define

$\mathcal{F} := \{U \mid y_U > 0\}$.    (5.11)


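Then $\mathcal{F}$ is laminar. To see this, suppose there are $T, U \in \mathcal{F}$ with $T \not\subseteq U \not\subseteq T$ and
$T \cap U \ne \emptyset$. Let $\varepsilon := \min\{y_T, y_U\} > 0$. Next reset:

$y_T := y_T - \varepsilon$, $y_{T \cap U} := y_{T \cap U} + \varepsilon$,
$y_U := y_U - \varepsilon$, $y_{T \cup U} := y_{T \cup U} + \varepsilon$,    (5.12)

while $y$ does not change in the other coordinates. By this resetting, $y^T C$ does
not increase in any coordinate (since $\varepsilon \cdot \chi^{\delta^-(T)} + \varepsilon \cdot \chi^{\delta^-(U)} \ge \varepsilon \cdot \chi^{\delta^-(T \cap U)} + \varepsilon \cdot \chi^{\delta^-(T \cup U)}$),
while $y^T \mathbf{1}$ does not change. However, the sum (5.10) did decrease,
contradicting the minimality of (5.10). This shows that $\mathcal{F}$ is laminar.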

We finally show that (5.9) holds. The inequality $\ge$ is trivial, since $C'$ is a
submatrix of $C$. The inequality $\le$ follows from the fact that the vector $y$ above
attains the second maximum in (5.9), while $y$ has 0's in the positions corresponding
to rows of $C$ not in $C'$. □


A direct consequence is that the r-arborescence polytope of $D = (V, A)$ (being
the convex hull of the incidence vectors of r-arborescences) is determined by

$0 \le x_a \le 1$, $a \in A$,
$\sum_{a \in \delta^-(U)} x_a \ge 1$, $\emptyset \ne U \subseteq V \setminus \{r\}$.    (5.13)
This is a result of Edmonds (1967). It follows, with the ellipsoid method, that a
minimum-length r-arborescence can be found in polynomial time if and only if we
can test (5.13) in polynomial time. This last is indeed possible: given $x \in \mathbb{Q}^A$, we
first test if $0 \le x_a \le 1$ for each arc $a$; if $x_a < 0$ or $x_a > 1$ for some $a$, we have a
separating hyperplane. Otherwise, consider $x$ as a capacity function on the arcs of
D, and find an r-cut C of minimum capacity (with an adaptation of Ford and 
Fulkerson’s algorithm): if C has capacity at least 1, then (5.13) is satisfied; 
otherwise, C yields a hyperplane separating x from the polyhedron determined by 
(5.13). 
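The two-stage test just described can be mimicked on tiny digraphs by simply enumerating all sets $U$. The sketch below does exactly that; it is exponential and only illustrates what a separation routine returns, whereas the polynomial-time version replaces the enumeration by a minimum-capacity r-cut computation. The function name and the return format are ours.

```python
from itertools import combinations


def separate_arborescence(vertices, arcs, r, x):
    """Return None if x satisfies (5.13), else a description of a violated inequality.

    x maps each arc (v, w) to a number; ("bound", a) reports a violated bound on arc a,
    ("cut", U) reports a set U whose entering arcs have total x-value below 1.
    """
    for a in arcs:
        if x[a] < 0 or x[a] > 1:
            return ("bound", a)
    others = [v for v in vertices if v != r]
    for k in range(1, len(others) + 1):
        for u in combinations(others, k):
            uset = set(u)
            entering = sum(x[(v, w)] for (v, w) in arcs if v not in uset and w in uset)
            if entering < 1:
                return ("cut", uset)
    return None
```

A violated set $U$ yields the separating hyperplane $\sum_{a \in \delta^-(U)} x_a = 1$.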

For a characterization of the facets of the r-arborescence polytope, see Held 
and Karp (1970) and Giles (1975, 1978). 

One similarly shows that for any directed graph D =(V, A), the following 
system, in $x \in \mathbb{R}^A$, is TDI:

$x_a \ge 0$, $a \in A$,
$\sum_{a \in \delta^-(U)} x_a \ge 1$, $\emptyset \ne U \subset V$, $\delta^+(U) = \emptyset$,    (5.14)


which is a result of Lucchesi and Younger (1978). It is equivalent to: 


Lucchesi-Younger Theorem 5.15. The minimum size of a directed-cut covering in 
a digraph D = (V, A) is equal to the maximum number of pairwise disjoint directed 
cuts. 


Here a directed cut is a set of arcs of the form $\delta^-(U)$ with $\emptyset \ne U \ne V$,
$\delta^+(U) = \emptyset$. A directed-cut covering is a set of arcs intersecting each directed cut,
or equivalently, a set of arcs whose contraction makes the digraph strongly 
connected. 

Note that the Lucchesi-Younger Theorem is of a self-refining nature: it implies
that for any "length" function $l: A \to \mathbb{Z}_+$, the minimum length of a directed-cut
covering is equal to the maximum number $t$ of directed cuts $C_1, \dots, C_t$
(repetition allowed), so that no arc $a$ is in more than $l(a)$ of these cuts. [To derive
this from Theorem 5.15, replace each arc $a$ by a directed path of length $l(a)$.] In
this weighted form, the Lucchesi-Younger Theorem is easily seen to be equivalent
to the total dual integrality of (5.14).


Application 5.16 (Polymatroid intersection). Let $S$ be a finite set. A function
$f: \mathcal{P}(S) \to \mathbb{R}$ is called submodular if

$f(T) + f(U) \ge f(T \cap U) + f(T \cup U)$ for all $T, U \subseteq S$.    (5.17)


There are several examples of submodular functions. For example, the rank 
function of any matroid is submodular (see chapters 9 and 11). 
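On a small ground set, inequality (5.17) can be verified exhaustively. The sketch below does so for the rank function of a uniform matroid, $U \mapsto \min\{|U|, k\}$, which we pick only as a convenient example; the function names are ours.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    s = list(ground)
    return [frozenset(c) for k in range(len(s) + 1) for c in combinations(s, k)]


def is_submodular(f, ground):
    """Check f(T) + f(U) >= f(T & U) + f(T | U) for all pairs of subsets."""
    all_sets = subsets(ground)
    return all(f(t) + f(u) >= f(t & u) + f(t | u) for t in all_sets for u in all_sets)


def uniform_rank(k):
    """Rank function of the uniform matroid of rank k: U -> min(|U|, k)."""
    return lambda u: min(len(u), k)
```

By contrast, $U \mapsto |U|^2$ fails the test already for two distinct singletons.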

Let $f_1$, $f_2$ be two submodular functions on $S$, and consider the following system
in the variable $x \in \mathbb{R}^S$:

(i) $x_s \ge 0$, $s \in S$,
(ii) $\sum_{s \in U} x_s \le f_1(U)$, $U \subseteq S$,    (5.18)
(iii) $\sum_{s \in U} x_s \le f_2(U)$, $U \subseteq S$.

Edmonds (1970, 1979) proved:

Theorem 5.19. System (5.18) is TDI.


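Proof. The proof is similar to that of Theorem 5.7. Let $c: S \to \mathbb{Z}$, and consider
the LP problem dual to maximizing $c^T x$ over (5.18):

$\min\Big\{ \sum_{U \subseteq S} y_U f_1(U) + \sum_{U \subseteq S} z_U f_2(U) \;\Big|\; y, z \in \mathbb{R}^{\mathcal{P}(S)}_+;\ \sum_{U \subseteq S} (y_U + z_U) \chi^U \ge c \Big\}$.    (5.20)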


We show that this minimum has an integral optimum solution, by a version of the 
“uncrossing’’ technique. Let y, z attain this minimum, so that 


> (vu tzu) {UI [Su (5.21) 
UcS 
is as small as possible. Let 


F:={UCS|y, >0}. (5.22) 


We show that ¥ forms a chain with respect to inclusion. Suppose not. Let T, 
U EF with TZU T. Let «:=min{y;, yy} > 0. Next reset as in (5.12). Again, 
the modified y forms, with the original z, an optimum solution of (5.20) [since 
ety ay ey and §(T) + f,(U)=h(TOU)+f(TUU)]. However, 
(5.21) did decrease, contradicting its minimality. This shows that #& forms a 
chain. Similarly, 


G:=({UCS|z,, >0} (5.23) 


forms a chain. 
Now (5.20) is equal to 


min{ > yuh (U) + By zuf(U)| VERT, zER®,; 
UEc# UES 


2 vox + = zyx" =e} (5.24) 
UEF UvEg 
since y, z attain (5.20), using (5.22) and (5.23). 

The constraint matrix in (5.24) is totally unimodular, as can be derived easily 
with Ghouila-Houri’s criterion (4.4) (ii). Hence (5.24) has an integral optimum 
solution y, z. By extending y, z with 0-components, we obtain an integral 
optimum solution of (5.20). O 


This result has several corollaries, as we shall see. If $f_1$ and $f_2$ are
integer-valued submodular functions, then the total dual integrality of (5.18)
implies that (5.18) determines an integral polyhedron. In particular, let $f_1$
and $f_2$ be the rank functions of two matroids $(S, \mathcal{I}_1)$ and
$(S, \mathcal{I}_2)$. Then the following result of Edmonds (1970) follows.


Corollary 5.25. The polytope
$\mathrm{conv}\{\chi^I \mid I \in \mathcal{I}_1 \cap \mathcal{I}_2\}$ is
determined by (5.18).


Proof. Note that an integral vector satisfies (5.18) iff it is equal to $\chi^I$
for some $I$ in $\mathcal{I}_1 \cap \mathcal{I}_2$. □


A special case is that if we have one matroid $(S, \mathcal{I})$, with rank
function, say, $f$, then its independence polytope
($= \mathrm{conv}\{\chi^I \mid I \in \mathcal{I}\}$) is determined by
$x_s \ge 0$, $s \in S$; $\sum_{s \in U} x_s \le f(U)$, $U \subseteq S$
(Edmonds 1971). So Corollary 5.25 concerns the intersection of two independence
polytopes. The facets of independence polytopes, and of the intersection of two
of them, are described by Giles (1975). Hausmann and Korte (1978) characterized
adjacency on the independence polytope. See also Edmonds (1979) and Cunningham
(1984).

Another direct consequence for matroids is: 


Edmonds' Matroid Intersection Theorem 5.26. The maximum size of a common
independent set of two matroids $(S, \mathcal{I}_1)$ and $(S, \mathcal{I}_2)$ is
equal to $\min_{U \subseteq S} (f_1(U) + f_2(S \setminus U))$, where $f_1$ and
$f_2$ are the rank functions of these matroids.


Proof. By Corollary 5.25, the maximum size of a common independent set is
equal to $\max\{\mathbf{1}^T x \mid x \text{ satisfies (5.18)}\}$, and hence, by
the total dual integrality of (5.18), to

$$\min\Bigl\{ \sum_{U \subseteq S} \bigl(y_U f_1(U) + z_U f_2(U)\bigr) \Bigm| y, z \in \mathbb{Z}_+^{\mathcal{P}(S)};\ \sum_{U \subseteq S} (y_U + z_U)\chi^U \ge \mathbf{1} \Bigr\}.$$

It is not difficult (using the nonnegativity, the monotonicity and the
submodularity of $f_1$ and $f_2$) to derive that this last minimum is equal to
the minimum in Theorem 5.26. □
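
The min-max relation of Theorem 5.26 can be checked by brute force on a small
instance. The sketch below assumes two partition matroids on the edge set of a
bipartite graph (an edge set is independent in the first if its left endpoints
are distinct, and in the second if its right endpoints are distinct), so that
common independent sets are exactly the matchings. The example graph and all
names are illustrative.

```python
from itertools import combinations

def subsets(items):
    """Yield all subsets of items as frozensets."""
    items = list(items)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)

def rank(U, indep):
    """Matroid rank of U: size of a largest independent subset of U."""
    return max(len(I) for I in subsets(U) if indep(I))

def max_common_independent(S, indep1, indep2):
    """Left-hand side of Theorem 5.26, by exhaustive search."""
    return max(len(I) for I in subsets(S) if indep1(I) and indep2(I))

def min_rank_sum(S, indep1, indep2):
    """Right-hand side of Theorem 5.26: min over U of f1(U) + f2(S minus U)."""
    S = frozenset(S)
    return min(rank(U, indep1) + rank(S - U, indep2) for U in subsets(S))

# Hypothetical bipartite graph with left side {a, b, c}, right side {x, y}.
EDGES = [("a", "x"), ("a", "y"), ("b", "x"), ("c", "x")]

def left_distinct(I):
    return len({e[0] for e in I}) == len(I)

def right_distinct(I):
    return len({e[1] for e in I}) == len(I)

LHS = max_common_independent(EDGES, left_distinct, right_distinct)
RHS = min_rank_sum(EDGES, left_distinct, right_distinct)
print(LHS, RHS)
```

Here a largest matching has two edges, and the minimum is attained for instance
by taking $U$ to be the edges at vertex a, so both sides equal 2.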


For more consequences of Theorem 5.19, we refer to chapter 11.

The proofs of Theorems 5.7 and 5.19 given above are examples of a general
proof technique for total dual integrality studied by Edmonds and Giles (1977).
First show that there exists an optimum dual solution whose nonzero components
correspond to a "nice" collection of sets (e.g., laminar, a chain, "cross-free").
Next prove that such nice collections yield a restricted linear program with
totally unimodular constraint matrix. Finally, appeal to Hoffman and Kruskal's
Theorem to deduce the existence of an integral optimum dual solution for the
restricted, and hence for the original, problem.

We now illustrate how total dual integrality helps in showing one of the
pioneering successes of polyhedral combinatorics, the characterization of the
matching polytope by Edmonds (1965). For the basic theory on matchings we
refer to chapter 3.

Definition 5.27. The matching polytope of an undirected graph $G = (V, E)$ is
the polytope $\mathrm{conv}\{\chi^M \mid M \text{ matching}\}$ in
$\mathbb{R}^E$. Edmonds showed that this polytope is equal to the set of all
vectors $x$ in $\mathbb{R}^E$ satisfying

$$
\begin{aligned}
&\text{(i)} & x_e &\ge 0, && e \in E,\\
&\text{(ii)} & \sum_{e \ni v} x_e &\le 1, && v \in V,\\
&\text{(iii)} & \sum_{e \subseteq U} x_e &\le \lfloor \tfrac12 |U| \rfloor, && U \subseteq V.
\end{aligned}
\tag{5.28}
$$

Since the integral vectors satisfying (5.28) are exactly the incidence vectors
$\chi^M$ of matchings $M$, it suffices to show that (5.28) determines an integral
polyhedron. In fact, Cunningham and Marsh (1978) showed:


Theorem 5.29. System (5.28) is TDI. 


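This implies that for each $w: E \to \mathbb{Z}$ both optima in the LP-duality
equation

$$
\begin{aligned}
&\max\{w^T x \mid x \text{ satisfies (5.28)}\}\\
&\quad= \min\Bigl\{ \sum_{v \in V} y_v + \sum_{U \subseteq V} z_U \lfloor \tfrac12 |U| \rfloor \Bigm| y \in \mathbb{R}_+^V,\ z \in \mathbb{R}_+^{\mathcal{P}(V)};\
\forall e \in E:\ \sum_{v \in e} y_v + \sum_{U \supseteq e} z_U \ge w_e \Bigr\}
\end{aligned}
\tag{5.30}
$$

are attained by integral optimum solutions. It means: for each undirected graph
$G = (V, E)$ and for each "weight" function $w: E \to \mathbb{Z}$,

$$
\begin{aligned}
&\max\{w(M) \mid M \text{ matching}\}\\
&\quad= \min\Bigl\{ \sum_{v \in V} y_v + \sum_{U \subseteq V} z_U \lfloor \tfrac12 |U| \rfloor \Bigm| y \in \mathbb{Z}_+^V,\ z \in \mathbb{Z}_+^{\mathcal{P}(V)};\
\forall e \in E:\ \sum_{v \in e} y_v + \sum_{U \supseteq e} z_U \ge w_e \Bigr\}.
\end{aligned}
\tag{5.31}
$$

Here $w(E') := \sum_{e \in E'} w_e$ for any subset $E'$ of $E$. [Note that (5.31)
contains the Tutte-Berge formula as a special case, by taking $w = \mathbf{1}$.]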


Proof of Theorem 5.29. We may assume that $w$ is nonnegative, since replacing
any negative component of $w$ by 0 does not change any optimum in (5.31).

For any $w$, let $\nu_w$ denote the left-hand term in (5.31). It suffices to
show that $\nu_w$ is not less than the right-hand term in (5.31) (since $\le$ is
trivial). Suppose (5.31) does not hold, and suppose we have chosen $G = (V, E)$
and $w: E \to \mathbb{Z}_+$ so that $|V| + |E| + w(E)$ is as small as possible.
Then $G$ is connected (otherwise, one of the components of $G$ will form a
smaller counterexample) and $w_e \ge 1$ for each edge $e$ (otherwise we could
delete $e$). Now there are two cases.

Case 1. There exists a vertex $v$ covered by every maximum-weighted matching.
In this case, let $w' \in \mathbb{Z}_+^E$ arise from $w$ by decreasing the
weights of edges incident to $v$ by 1. Then $\nu_{w'} = \nu_w - 1$. Since
$w'(E) < w(E)$, (5.31) holds for $w'$. Increasing component $y_v$ of the optimal
$y$ for $w'$ by 1 shows (5.31) for $w$.

Case 2. No vertex is covered by every maximum-weighted matching. Now let
$w'$ arise from $w$ by decreasing all weights by 1. We show that
$\nu_w \ge \nu_{w'} + \lfloor \tfrac12 |V| \rfloor$. This will imply (5.31) for
$w$: since $w'(E) < w(E)$, (5.31) holds for $w'$. Increasing component $z_V$ of
the optimal $z$ for $w'$ by 1 shows (5.31) for $w$.

Assume $\nu_w < \nu_{w'} + \lfloor \tfrac12 |V| \rfloor$, and let $M$ be a
matching with $\nu_{w'} = w'(M)$, such that $w(M)$ is as large as possible. Then
$M$ leaves at least two vertices in $V$ uncovered, since otherwise
$w(M) = w'(M) + \lfloor \tfrac12 |V| \rfloor$, implying
$\nu_w \ge w(M) = w'(M) + \lfloor \tfrac12 |V| \rfloor = \nu_{w'} + \lfloor \tfrac12 |V| \rfloor$.

Let $u$ and $v$ be not covered by $M$, and suppose we have chosen $M$, $u$ and
$v$ so that the distance $d(u, v)$ in $G$ is as small as possible. Then
$d(u, v) > 1$, since otherwise augmenting $M$ by $\{u, v\}$ would increase
$w(M)$. Let $t$ be an internal vertex of a shortest path between $u$ and $v$.
Let $M'$ be a matching with $w(M') = \nu_w$ not covering $t$.

Now $M \triangle M'$ is a disjoint union of paths and circuits. Let $P$ be the
set of edges of the component of $M \triangle M'$ containing $t$. Then $P$ forms
a path starting in $t$ and not covering both $u$ and $v$ (as $t$, $u$ and $v$
each have degree at most 1 in $M \triangle M'$). Say $P$ does not cover $u$. Now
the symmetric difference $M \triangle P$ is a matching with
$|M \triangle P| \le |M|$, and therefore

$$
\begin{aligned}
w'(M \triangle P) - w'(M) &= w(M \triangle P) - |M \triangle P| - w(M) + |M|\\
&\ge w(M \triangle P) - w(M) = w(M') - w(M' \triangle P) \ge 0.
\end{aligned}
\tag{5.32}
$$

Hence $\nu_{w'} = w'(M \triangle P)$ and $w(M \triangle P) = w(M)$. However,
$M \triangle P$ does not cover $t$ and $u$, and $d(u, t) < d(u, v)$,
contradicting our choice of $M$, $u$, and $v$. □


So (5.28) is TDI. A consequence is the following fundamental result of 
Edmonds (1965). 


Edmonds’ Matching Polyhedron Theorem 5.33. The matching polytope of a graph 
is equal to the polyhedron determined by (5.28). 
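
To see why the inequalities (5.28) (iii) cannot be dropped, consider a triangle
with value 1/2 on every edge. The sketch below (names and example are
illustrative assumptions) checks (5.28) by enumeration, with and without the
family (iii).

```python
from fractions import Fraction
from itertools import combinations

def satisfies_5_28(x, vertices, edges, with_odd_sets=True):
    """Check system (5.28); x maps each edge (a frozenset) to a number."""
    if any(x[e] < 0 for e in edges):                       # (i)
        return False
    for v in vertices:                                     # (ii)
        if sum(x[e] for e in edges if v in e) > 1:
            return False
    if with_odd_sets:                                      # (iii)
        for k in range(2, len(vertices) + 1):
            for U in map(set, combinations(vertices, k)):
                inside = sum(x[e] for e in edges if e <= U)
                if inside > len(U) // 2:
                    return False
    return True

TRIANGLE_V = [0, 1, 2]
TRIANGLE_E = [frozenset(p) for p in [(0, 1), (1, 2), (0, 2)]]
HALF = {e: Fraction(1, 2) for e in TRIANGLE_E}

WITHOUT_III = satisfies_5_28(HALF, TRIANGLE_V, TRIANGLE_E, with_odd_sets=False)
WITH_III = satisfies_5_28(HALF, TRIANGLE_V, TRIANGLE_E)
print(WITHOUT_III, WITH_III)
```

The vector satisfies (i) and (ii), but for $U = V$ the left-hand side of (iii)
is 3/2 while the right-hand side is 1, so it is not in the matching polytope.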


In fact, Edmonds found Theorem 5.33 as a by-product of a polynomial-time
algorithm for finding a maximum-weighted matching. In turn, with the ellipsoid
method, Padberg and Rao (1982) showed that Theorem 5.33 yields a polynomial-time
algorithm finding a maximum-weighted matching; see (5.37) below.


5.34. A consequence of Theorem 5.33 is a characterization of the perfect
matching polytope of a graph $G = (V, E)$, which is the polytope
$\mathrm{conv}\{\chi^M \mid M \text{ perfect matching}\}$ in $\mathbb{R}^E$.
This polytope clearly is a face of the matching polytope of $G$ (or is empty),
viz. the intersection of the matching polytope with the (supporting) hyperplane
$\{x \in \mathbb{R}^E \mid \sum_{e \in E} x_e = \tfrac12 |V|\}$. It follows that
the perfect matching polytope is determined by the following inequalities:

$$
\begin{aligned}
&\text{(i)} & x_e &\ge 0, && e \in E,\\
&\text{(ii)} & \sum_{e \ni v} x_e &= 1, && v \in V,\\
&\text{(iii)} & \sum_{e \in \delta(U)} x_e &\ge 1, && U \subseteq V,\ |U| \text{ odd}.
\end{aligned}
\tag{5.35}
$$

(Note that (ii) and (iii) imply (5.28) (iii).)


From the description (5.35) of the perfect matching polytope one can derive
with the ellipsoid method a polynomial-time algorithm for finding a
maximum-weighted perfect matching (and through this a maximum-weighted
matching). It amounts to showing that it can be tested in polynomial time
whether a vector $x$ satisfies (5.35). Padberg and Rao (1982) showed that this
can be done as follows.

For a given $x \in \mathbb{Q}^E$ we must test if $x$ satisfies (5.35). The
inequalities in (i) and (ii) can be checked one by one. If one of them is not
satisfied, it gives us a separating hyperplane. So we may assume that (i) and
(ii) are satisfied. If $|V|$ is odd, then clearly (iii) is not satisfied for
$U := V$. So we may assume that $|V|$ is even. We cannot check the constraints
in (iii) one by one in polynomial time, simply because there are exponentially
many of them. Yet, there is a polynomial-time method of checking them. First,
note that from Ford and Fulkerson's max-flow min-cut algorithm we can easily
derive a polynomial-time algorithm having the following as input and output:


Input: Subset $W$ of $V$.
Output: Subset $T$ of $V$ such that $W \cap T \ne \emptyset \ne W \setminus T$
and such that $x(\delta(T))$ is as small as possible. (5.36)


Here $x(E') := \sum_{e \in E'} x_e$ for any subset $E'$ of $E$. We next describe
recursively an algorithm with the following input and output specification:


Input: Subset $W$ of $V$ with $|W|$ even.
Output: Subset $U$ of $V$ such that $|W \cap U|$ is odd and such that
$x(\delta(U))$ is as small as possible. (5.37)


First, we find with algorithm (5.36) a subset $T$ of $V$ with
$W \cap T \ne \emptyset \ne W \setminus T$ and with $x(\delta(T))$ minimal. If
$|W \cap T|$ is odd, we are done. If $|W \cap T|$ is even, call, recursively,
the algorithm (5.37) for the inputs $W \cap T$ and $W \cap \overline{T}$,
respectively, where $\overline{T} := V \setminus T$. Let it yield a subset $U'$
of $V$ such that $|W \cap T \cap U'|$ is odd and $x(\delta(U'))$ is minimal, and
a subset $U''$ of $V$ such that $|W \cap \overline{T} \cap U''|$ is odd and
$x(\delta(U''))$ is minimal. Without loss of generality,
$W \cap \overline{T} \not\subseteq U'$ (otherwise replace $U'$ by
$V \setminus U'$), and $W \cap T \not\subseteq U''$ (otherwise replace $U''$ by
$V \setminus U''$).

We claim that if $x(\delta(T \cap U')) \le x(\delta(\overline{T} \cap U''))$,
then $U := T \cap U'$ is output of (5.37) for input $W$, and otherwise
$U := \overline{T} \cap U''$. To see that this output is justified, suppose to
the contrary that there exists a subset $Y$ of $V$ such that $|W \cap Y|$ is
odd, and $x(\delta(Y)) < x(\delta(T \cap U'))$ and
$x(\delta(Y)) < x(\delta(\overline{T} \cap U''))$. Then either
$|W \cap Y \cap T|$ is odd or $|W \cap Y \cap \overline{T}|$ is odd.

Case 1. $|W \cap Y \cap T|$ is odd. Then $x(\delta(Y)) \ge x(\delta(U'))$, since
$U'$ is output of (5.37) for input $W \cap T$. Moreover,
$x(\delta(T \cup U')) \ge x(\delta(T))$, since $T$ is output of (5.36) for input
$W$, and since $W \cap (T \cup U') \ne \emptyset \ne W \setminus (T \cup U')$.
Therefore, we have the following contradiction:

$$
\begin{aligned}
x(\delta(Y)) \ge x(\delta(U')) &\ge x(\delta(T \cap U')) + x(\delta(T \cup U')) - x(\delta(T))\\
&\ge x(\delta(T \cap U')) > x(\delta(Y))
\end{aligned}
\tag{5.38}
$$

[the second inequality follows since
$x(\delta(A)) + x(\delta(B)) \ge x(\delta(A \cap B)) + x(\delta(A \cup B))$ for
all $A, B \subseteq V$].

Case 2. $|W \cap Y \cap \overline{T}|$ is odd: similar.

Given the polynomial speed of the algorithm for (5.36), it is not difficult to see 
that the algorithm described for (5.37) is also polynomial-time. As a conse- 
quence, we can test (5.35) (iii) in polynomial time. 
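
What the separation test for (5.35) (iii) returns can be illustrated with a
brute-force stand-in. The sketch below is not the recursive polynomial-time
method described above: it simply enumerates all odd vertex sets, which is
exponential and only suitable for very small graphs. The function name and the
example (two triangles joined by one edge) are assumptions made for
illustration.

```python
from fractions import Fraction
from itertools import combinations

def min_odd_cut(x, vertices, edges):
    """Return (value, U) minimizing x(delta(U)) over odd-size vertex sets U.

    Brute-force enumeration; a violated inequality (5.35) (iii) exists
    exactly when the returned value is below 1.
    """
    best_value, best_set = None, None
    for k in range(1, len(vertices) + 1, 2):          # odd sizes only
        for U in map(set, combinations(vertices, k)):
            value = sum(x[e] for e in edges if len(e & U) == 1)
            if best_value is None or value < best_value:
                best_value, best_set = value, U
    return best_value, best_set

VERTICES = [0, 1, 2, 3, 4, 5]
TRIANGLES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
BRIDGE = (2, 3)
EDGES = [frozenset(p) for p in TRIANGLES + [BRIDGE]]

# Value 1/2 on the triangle edges and 0 on the bridge satisfies (i) and (ii).
X = {frozenset(p): Fraction(1, 2) for p in TRIANGLES}
X[frozenset(BRIDGE)] = Fraction(0)

VALUE, ODD_SET = min_odd_cut(X, VERTICES, EDGES)
print(VALUE, sorted(ODD_SET))
```

For this vector the cut around either triangle has value 0, so the corresponding
inequality in (iii) is violated and yields a separating hyperplane.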

Further notes on TDI: for a deep characterization of certain TDI systems, see 
Seymour (1977). For an application of TDI to non-optimizational combinatorics 
(viz. Nash—Williams Orientation Theorem), see Frank (1980), and Frank and 
Tardos (1984). 


6. Blocking polyhedra 


Another useful technique in polyhedral combinatorics is a variant of the classical 
polarity in Euclidean space, viz. the blocking relation between polyhedra. It was 
introduced by Fulkerson (1970a, 1971), who noticed its importance to com- 
binatorics and optimization. Often, with the theory of blocking polyhedra, one 
polyhedral characterization (or min-max relation) can be derived from another, 
and conversely. 

The basic idea is the following result. Let
$c_1, \ldots, c_m, d_1, \ldots, d_t \in \mathbb{R}_+^n$ satisfy

$$\mathrm{conv}\{c_1, \ldots, c_m\} + \mathbb{R}_+^n = \{x \in \mathbb{R}_+^n \mid d_j^T x \ge 1 \text{ for } j = 1, \ldots, t\}. \tag{6.1}$$

Then the same holds after interchanging the $c_i$ and $d_j$:

$$\mathrm{conv}\{d_1, \ldots, d_t\} + \mathbb{R}_+^n = \{x \in \mathbb{R}_+^n \mid c_i^T x \ge 1 \text{ for } i = 1, \ldots, m\}. \tag{6.2}$$

In a sense, in (6.2) the ideas of "vertex" and "facet" are interchanged as
compared with (6.1). The proof is a simple application of Farkas's Lemma.


Theorem 6.3. For any $c_1, \ldots, c_m, d_1, \ldots, d_t \in \mathbb{R}_+^n$,
(6.1) holds if and only if (6.2) holds.


Proof. Suppose (6.1) holds. Then $\subseteq$ in (6.2) is direct, since
$c_i^T d_j \ge 1$ for all $i$, $j$, as the $c_i$ belong to the right-hand side
in (6.1), and since $c_i \ge 0$.

To show $\supseteq$ in (6.2), suppose
$x \notin \mathrm{conv}\{d_1, \ldots, d_t\} + \mathbb{R}_+^n$. Then there exists
a separating hyperplane, i.e., there is a vector $y$ such that

$$y^T x < \min\{y^T z \mid z \in \mathrm{conv}\{d_1, \ldots, d_t\} + \mathbb{R}_+^n\}. \tag{6.4}$$

We may assume $t \ge 1$ [since if $t = 0$, then (6.1) gives that
$0 \in \{c_1, \ldots, c_m\}$, and therefore $x$ does not belong to the
right-hand side of (6.2)]. By scaling $y$, we can assume that the minimum in
(6.4) is 1. Therefore, $y$ belongs to the right-hand side of (6.1), and
therefore to the left-hand side. So
$y \ge \lambda_1 c_1 + \cdots + \lambda_m c_m$ for certain
$\lambda_1, \ldots, \lambda_m \ge 0$ with $\lambda_1 + \cdots + \lambda_m = 1$.
Since $y^T x < 1$, it follows that $c_i^T x < 1$ for at least one $i$. Hence $x$
does not belong to the right-hand side of (6.2).

This shows (6.1) $\Rightarrow$ (6.2). The reverse implication follows by
symmetry. □


This theorem has the following consequences. For any
$X \subseteq \mathbb{R}^n$, we define the blocker $B(X)$ of $X$ by:

$$B(X) := \{x \in \mathbb{R}_+^n \mid y^T x \ge 1 \text{ for each } y \text{ in } X\}. \tag{6.5}$$

Clearly, for $c_1, \ldots, c_m \in \mathbb{R}_+^n$, if $P$ is the polyhedron

$$P := \mathrm{conv}\{c_1, \ldots, c_m\} + \mathbb{R}_+^n, \tag{6.6}$$

then

$$B(P) = \{x \in \mathbb{R}_+^n \mid c_i^T x \ge 1 \text{ for } i = 1, \ldots, m\}. \tag{6.7}$$

So $B(P)$ is also a polyhedron, called the blocking polyhedron of $P$. If
$R = B(P)$, the pair $P$, $R$ is called a blocking pair of polyhedra. By the
following direct corollary of Theorem 6.3, this is a symmetric relation.


Corollary 6.8. For any polyhedron $P$ of type (6.6), $B(B(P)) = P$.


So both (6.1) and (6.2) are equivalent to:

the pair $\mathrm{conv}\{c_1, \ldots, c_m\} + \mathbb{R}_+^n$ and
$\mathrm{conv}\{d_1, \ldots, d_t\} + \mathbb{R}_+^n$ forms a blocking pair of
polyhedra. (6.9)


The following corollary shows the equivalence of certain min-max relations. 


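Corollary 6.10. Let $c_1, \ldots, c_m, d_1, \ldots, d_t \in \mathbb{R}_+^n$.
Then the following are equivalent:

(i) for each $l \in \mathbb{R}_+^n$:

$$\min\{l^T c_1, \ldots, l^T c_m\} = \max\Bigl\{\lambda_1 + \cdots + \lambda_t \Bigm| \lambda_1, \ldots, \lambda_t \in \mathbb{R}_+;\ \sum_j \lambda_j d_j \le l\Bigr\}; \tag{6.11}$$

(ii) for each $w \in \mathbb{R}_+^n$:

$$\min\{w^T d_1, \ldots, w^T d_t\} = \max\Bigl\{\mu_1 + \cdots + \mu_m \Bigm| \mu_1, \ldots, \mu_m \in \mathbb{R}_+;\ \sum_i \mu_i c_i \le w\Bigr\}. \tag{6.12}$$


Proof. By LP duality, the maximum in (6.11) is equal to
$\min\{l^T x \mid x \in \mathbb{R}_+^n;\ d_j^T x \ge 1 \text{ for } j = 1, \ldots, t\}$.
Hence, (6.11) is equivalent to (6.1). Similarly, (6.12) is equivalent to (6.2).
Therefore, Theorem 6.3 implies Corollary 6.10. □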


Note that by continuity, in (6.11) we may restrict $l$ to rational, and hence to
integral vectors, without changing the condition. Similarly for (6.12). This is
sometimes useful when showing one of them by induction.

A symmetric characterization of the blocking relation is the "length-width
inequality" given by Lehman (1965):


Lehman's Length-Width Inequality 6.13. Let
$c_1, \ldots, c_m, d_1, \ldots, d_t \in \mathbb{R}_+^n$. Then (6.1)
[equivalently (6.2), (6.11), or (6.12)] holds if and only if

$$
\begin{aligned}
&\text{(i)} \quad d_j^T c_i \ge 1 \text{ for all } i = 1, \ldots, m \text{ and } j = 1, \ldots, t;\\
&\text{(ii)} \quad \min\{l^T c_1, \ldots, l^T c_m\} \cdot \min\{w^T d_1, \ldots, w^T d_t\} \le l^T w \text{ for all } l, w \in \mathbb{Z}_+^n.
\end{aligned}
\tag{6.14}
$$


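Proof. Suppose (6.14) holds. We derive (6.11). Let $l \in \mathbb{R}_+^n$. By
LP duality, the maximum in (6.11) is equal to
$\min\{l^T x \mid x \in \mathbb{R}_+^n;\ d_j^T x \ge 1 \text{ for } j = 1, \ldots, t\}$.
Let this minimum be attained by vector $w$. Then by (6.14)

$$l^T w \ge \bigl(\min_i l^T c_i\bigr)\bigl(\min_j w^T d_j\bigr) \ge \min_i l^T c_i \ge l^T w.$$

So the minimum in (6.11) is equal to $l^T w$.

Next, suppose (6.1) holds. Then (6.11) and (6.12) hold. Now (6.14) (i) follows
by taking $l = d_j$ in (6.11). To show (6.14) (ii), let
$\lambda_1, \ldots, \lambda_t$, $\mu_1, \ldots, \mu_m$ attain the maxima in
(6.11) and (6.12). Then

$$\Bigl(\sum_j \lambda_j\Bigr)\Bigl(\sum_i \mu_i\Bigr) = \sum_j \sum_i \lambda_j \mu_i \le \sum_j \sum_i \lambda_j \mu_i\, d_j^T c_i = \Bigl(\sum_j \lambda_j d_j\Bigr)^T \Bigl(\sum_i \mu_i c_i\Bigr) \le l^T w.$$

This implies (6.14) (ii). □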
It follows from the ellipsoid method that if
$c_1, \ldots, c_m, d_1, \ldots, d_t \in \mathbb{R}_+^n$ satisfy (6.1)
[equivalently, (6.2), (6.11), or (6.12)], then

for each $l \in \mathbb{R}_+^n$: $\min\{l^T c_1, \ldots, l^T c_m\}$ can be found
in polynomial time
if and only if
for each $w \in \mathbb{R}_+^n$: $\min\{w^T d_1, \ldots, w^T d_t\}$ can be found
in polynomial time. (6.17)

This is particularly interesting if $t$ or $m$ is exponentially large (cf. the
applications below).

For more on blocking (and anti-blocking) polyhedra, see Araoz (1973), Araoz
et al. (1983), Bland (1978), Griffin (1977), Griffin et al. (1982), Huang and
Trotter (1980), and Johnson (1978).


Application 6.18 (Shortest paths and network flows). The theory of blocking
polyhedra yields an illustrative short proof of the Max-Flow Min-Cut Theorem.
Let $D = (V, A)$ be a directed graph, and let $r, s \in V$. Let
$c_1, \ldots, c_m \in \mathbb{R}_+^A$ be the incidence vectors of the
$r$-$s$-paths in $D$. Similarly, let $d_1, \ldots, d_t \in \mathbb{R}_+^A$ be the
incidence vectors of the $r$-$s$-cuts.

Considering a given function $l: A \to \mathbb{Z}_+$ as a "length" function, one
easily verifies: the minimum length of an $r$-$s$-path is equal to the maximum
number of $r$-$s$-cuts (repetition allowed) so that no arc $a$ is in more than
$l(a)$ of these cuts. [Indeed, the inequality min $\ge$ max is easy. To see the
reverse inequality, let $p$ be the minimum length of an $r$-$s$-path. For
$i = 1, \ldots, p$, let

$$V_i := \{v \in V \mid \text{the shortest } r\text{-}v\text{-path has length at least } i\}.$$

Then $\delta^-(V_1), \ldots, \delta^-(V_p)$ are $r$-$s$-cuts as required.] This
implies (6.11). Hence (6.12) holds, which is equivalent to the Max-Flow Min-Cut
Theorem: the maximum amount of $r$-$s$-flow subject to a capacity function $w$ is
equal to the minimum capacity of an $r$-$s$-cut. (Note that
$\sum_i \mu_i c_i$ is an $r$-$s$-flow.) In fact, there exists an integral optimum
flow if the capacities are integer, but this fact does not seem to follow from
the theory of blocking polyhedra.

The above implies that the polyhedra
$\mathrm{conv}\{c_1, \ldots, c_m\} + \mathbb{R}_+^A$ and
$\mathrm{conv}\{d_1, \ldots, d_t\} + \mathbb{R}_+^A$ form a blocking pair of
polyhedra. By (6.17), the polynomial-time solvability of the minimum-capacitated
cut problem is equivalent to that of the shortest-path problem; note that this
latter problem is much easier.
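
The construction of the cuts from distance layers in Application 6.18 can be
sketched as follows. The code assumes nonnegative integer arc lengths and that
$s$ is reachable from $r$; the function names and the example digraph are
illustrative, and a simple Dijkstra-style search stands in for any
shortest-path routine.

```python
import heapq

def shortest_distances(n, arcs, length, r):
    """Shortest-path distances from r (nonnegative lengths assumed)."""
    dist = [float("inf")] * n
    dist[r] = 0
    heap = [(0, r)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for i, (a, b) in enumerate(arcs):
            if a == u and d + length[i] < dist[b]:
                dist[b] = d + length[i]
                heapq.heappush(heap, (dist[b], b))
    return dist

def layer_cuts(n, arcs, length, r, s):
    """Return (p, cuts): p = dist(r, s), cuts[i-1] = arcs entering V_i."""
    dist = shortest_distances(n, arcs, length, r)
    p = dist[s]
    cuts = []
    for i in range(1, p + 1):
        V_i = {v for v in range(n) if dist[v] >= i}
        cuts.append({j for j, (a, b) in enumerate(arcs)
                     if a not in V_i and b in V_i})
    return p, cuts

# Hypothetical example with r = 0 and s = 3.
ARCS = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)]
LENGTH = [1, 3, 4, 1, 1]
P, LAYERS = layer_cuts(4, ARCS, LENGTH, 0, 3)
USAGE = [sum(j in cut for cut in LAYERS) for j in range(len(ARCS))]
print(P, LAYERS, USAGE)
```

In this example the shortest path has length 3, three cuts are produced, and no
arc lies in more cuts than its length allows.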


Application 6.19 (r-arborescence). Let $D = (V, A)$ be a digraph and let
$r \in V$. Let $c_1, \ldots, c_m$ be the incidence vectors of r-arborescences,
and let $d_1, \ldots, d_t$ be the incidence vectors of r-cuts (cf. Application
5.6).

From (5.13) we know that (6.1) holds. Therefore, by Theorem 6.3, also (6.2)
holds. It means that for any "capacity" function $w \in \mathbb{R}_+^A$, the
minimum capacity of an r-cut is equal to the maximum value of
$\mu_1 + \cdots + \mu_k$, where $\mu_1, \ldots, \mu_k \ge 0$ are such that there
exist r-arborescences $T_1, \ldots, T_k$ with the property that for each arc
$a$, the sum of the $\mu_i$ for which $a \in T_i$ is at most $w_a$.

Hence the convex hull of the incidence vectors of sets containing an r-cut as a
subset is determined by the system (in $x \in \mathbb{R}^A$)

$$
\begin{aligned}
&\text{(i)} & 0 \le x_a &\le 1, && a \in A,\\
&\text{(ii)} & \sum_{a \in T} x_a &\ge 1, && T \text{ r-arborescence}.
\end{aligned}
\tag{6.20}
$$

Edmonds (1973) in fact showed that (6.20) is TDI (again, this does not seem to
follow from the theory of blocking polyhedra). It is equivalent to: the minimum
size of an r-cut is equal to the maximum number of pairwise disjoint
r-arborescences.

The theory of blocking polyhedra can also be applied to directed cuts and 
directed-cut covers (cf. Theorem 5.15). Again it follows that the convex hull of 
incidence vectors of sets containing a directed cut as a subset, is determined by 
(6.20), with “r-arborescence’’ replaced by ‘“‘directed-cut cover’. However, in this 
case the system is not TDI (cf. Schrijver 1980b, 1982, 1983a). 

Similar arguments apply to $T$-joins and $T$-cuts.


7. Anti-blocking polyhedra 


The theory of anti-blocking polyhedra, due to Fulkerson (1971, 1972), is to a
large extent parallel to that of blocking polyhedra, and arises mostly by
reversing inequality signs and by interchanging "min" and "max". We here
restrict ourselves to listing results analogous to those given in section 6,
the proofs being similar.

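Let $c_1, \ldots, c_m, d_1, \ldots, d_t \in \mathbb{R}_+^n$ be such that
$\dim(\langle c_1, \ldots, c_m \rangle) = \dim(\langle d_1, \ldots, d_t \rangle) = n$.
Then the following are equivalent:

$$(\mathrm{conv}\{c_1, \ldots, c_m\} + \mathbb{R}_-^n) \cap \mathbb{R}_+^n = \{x \in \mathbb{R}_+^n \mid d_j^T x \le 1 \text{ for } j = 1, \ldots, t\}; \tag{7.1}$$

$$(\mathrm{conv}\{d_1, \ldots, d_t\} + \mathbb{R}_-^n) \cap \mathbb{R}_+^n = \{x \in \mathbb{R}_+^n \mid c_i^T x \le 1 \text{ for } i = 1, \ldots, m\}. \tag{7.2}$$

Define for any subset $X$ of $\mathbb{R}^n$ the anti-blocker $A(X)$ of $X$ by:

$$A(X) := \{x \in \mathbb{R}_+^n \mid y^T x \le 1 \text{ for each } y \text{ in } X\}. \tag{7.3}$$

Clearly, if

$$P := (\mathrm{conv}\{c_1, \ldots, c_m\} + \mathbb{R}_-^n) \cap \mathbb{R}_+^n, \tag{7.4}$$

then

$$A(P) = \{x \in \mathbb{R}_+^n \mid c_i^T x \le 1 \text{ for } i = 1, \ldots, m\}. \tag{7.5}$$

$A(P)$ is called the anti-blocking polyhedron of $P$. If $R = A(P)$, the pair
$P$, $R$ is called an anti-blocking pair of polyhedra. Again, this is a
symmetric relation:

For any polyhedron $P$ of type (7.4), $A(A(P)) = P$. (7.6)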


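Each of the following is equivalent to the others and to (7.1) and (7.2):

(a) The pair $(\mathrm{conv}\{c_1, \ldots, c_m\} + \mathbb{R}_-^n) \cap \mathbb{R}_+^n$
and $(\mathrm{conv}\{d_1, \ldots, d_t\} + \mathbb{R}_-^n) \cap \mathbb{R}_+^n$
forms an anti-blocking pair of polyhedra; (7.7)

(b) For each $l \in \mathbb{R}_+^n$:

$$\max\{l^T c_1, \ldots, l^T c_m\} = \min\Bigl\{\lambda_1 + \cdots + \lambda_t \Bigm| \lambda_1, \ldots, \lambda_t \in \mathbb{R}_+;\ \sum_j \lambda_j d_j \ge l\Bigr\}; \tag{7.8}$$

(c) For each $w \in \mathbb{R}_+^n$:

$$\max\{w^T d_1, \ldots, w^T d_t\} = \min\Bigl\{\mu_1 + \cdots + \mu_m \Bigm| \mu_1, \ldots, \mu_m \in \mathbb{R}_+;\ \sum_i \mu_i c_i \ge w\Bigr\}; \tag{7.9}$$

(d)
$$
\begin{aligned}
&\text{(i)} \quad d_j^T c_i \le 1 \text{ for all } i = 1, \ldots, m \text{ and } j = 1, \ldots, t;\\
&\text{(ii)} \quad \max\{l^T c_1, \ldots, l^T c_m\} \cdot \max\{w^T d_1, \ldots, w^T d_t\} \ge l^T w \text{ for all } l, w \in \mathbb{Z}_+^n.
\end{aligned}
\tag{7.10}
$$

This last characterization is again due to Lehman (1965).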


Application 7.11 (Perfect graphs). The theory of anti-blocking polyhedra yields
a proof of Lovász's Perfect Graph Theorem (cf. chapter 4). This line of proof
was developed by Fulkerson (1970b, 1972, 1973), Lovász (1972), and Chvátal
(1975).

Define for any graph $G = (V, E)$, the stable-set polytope
$\mathrm{STAB}(G)$ of $G$ as the convex hull of the incidence vectors of stable
sets in $G$. Clearly, any vector $x$ in the stable-set polytope satisfies

$$
\begin{aligned}
&\text{(i)} & x_v &\ge 0, && v \in V,\\
&\text{(ii)} & \sum_{v \in K} x_v &\le 1, && K \subseteq V,\ K \text{ clique},
\end{aligned}
\tag{7.12}
$$

since the incidence vector of any stable set satisfies (7.12). Note that the
polytope determined by (7.12) is exactly $A(\mathrm{STAB}(\overline{G}))$, where
$\overline{G}$ denotes the complement of $G$ (the cliques of $G$ are the stable
sets of $\overline{G}$). The circuit on five vertices shows that generally
$A(\mathrm{STAB}(\overline{G}))$ can be larger than $\mathrm{STAB}(G)$. Chvátal
(1975) showed that $\mathrm{STAB}(G)$ is exactly determined by (7.12) if and only
if $G$ is perfect. Anti-blocking then yields the Perfect Graph Theorem.
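
The claim about the circuit on five vertices can be checked directly: the vector
with all entries 1/2 satisfies (7.12), yet its total weight 5/2 exceeds the
maximum size 2 of a stable set, so it cannot be a convex combination of incidence
vectors of stable sets. The following sketch (with illustrative names) verifies
this by enumeration.

```python
from fractions import Fraction
from itertools import combinations

def is_clique(K, edges):
    """True if every two distinct vertices of K are adjacent."""
    return all(frozenset(p) in edges for p in combinations(K, 2))

def is_stable(S, edges):
    """True if no two vertices of S are adjacent."""
    return not any(frozenset(p) in edges for p in combinations(S, 2))

def satisfies_7_12(x, vertices, edges):
    """Check nonnegativity and all clique inequalities of (7.12)."""
    if any(x[v] < 0 for v in vertices):
        return False
    for k in range(1, len(vertices) + 1):
        for K in combinations(vertices, k):
            if is_clique(K, edges) and sum(x[v] for v in K) > 1:
                return False
    return True

def stability_number(vertices, edges):
    """Maximum size of a stable set, by brute force."""
    return max(k for k in range(len(vertices) + 1)
               for S in combinations(vertices, k) if is_stable(S, edges))

C5_VERTICES = list(range(5))
C5_EDGES = {frozenset((i, (i + 1) % 5)) for i in range(5)}
HALF = {v: Fraction(1, 2) for v in C5_VERTICES}

FEASIBLE = satisfies_7_12(HALF, C5_VERTICES, C5_EDGES)
ALPHA = stability_number(C5_VERTICES, C5_EDGES)
TOTAL = sum(HALF.values())
print(FEASIBLE, ALPHA, TOTAL)
```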

First observe the following. Let $Ax \le \mathbf{1}$ denote the inequality
system (7.12) (ii). So the rows of $A$ are the incidence vectors of cliques. By
definition, $G$ is perfect if and only if the (dual) linear programs

$$\max\{w^T x \mid x \ge 0,\ Ax \le \mathbf{1}\} = \min\{y^T \mathbf{1} \mid y \ge 0,\ y^T A \ge w^T\} \tag{7.13}$$

have integral optimum solutions, for each $\{0, 1\}$-vector $w$.


Chvátal's Theorem 7.14. $G$ is perfect if and only if its stable-set polytope is
determined by (7.12).


Proof. (I) First suppose $G$ is perfect. For $w: V \to \mathbb{Z}_+$, let
$\alpha_w$ denote the maximum weight of a stable set. To prove that the
stable-set polytope is determined by (7.12), it suffices to show that

$$\alpha_w = \max\{w^T x \mid x \ge 0,\ Ax \le \mathbf{1}\} \tag{7.15}$$

for each $w: V \to \mathbb{Z}_+$. This will be done by induction on
$\sum_{v \in V} w_v$.

If $w$ is a $\{0, 1\}$-vector, then (7.15) follows from the remark on (7.13). So
we may assume that $w_u \ge 2$ for some vertex $u$. Let $e_u = 1$ and $e_v = 0$
if $v \ne u$. Replacing $w$ by $w - e$ in (7.13) and (7.15) gives, by induction,
a vector $y \ge 0$ so that $y^T A \ge (w - e)^T$ and
$y^T \mathbf{1} = \alpha_{w-e}$. Since $(w - e)_u \ge 1$, there is a clique $K$
with $y_K > 0$ and $u \in K$. We may assume that $\chi^K \le w - e$. Denote
$a := \chi^K$.

Then $\alpha_{w-a} < \alpha_w$. For suppose $\alpha_{w-a} = \alpha_w$. Let $S$ be
any stable set with $\sum_{v \in S} (w - a)_v = \alpha_{w-a}$. Since
$\alpha_{w-a} = \alpha_w$, $K \cap S = \emptyset$. On the other hand, since
$w - a \le w - e \le w$, we know that $\sum_{v \in S} (w - e)_v = \alpha_{w-e}$,
and hence, by complementary slackness, $|K \cap S| = 1$, which is a
contradiction.

Therefore,

$$
\begin{aligned}
\alpha_w = 1 + \alpha_{w-a} &= 1 + \max\{(w - a)^T x \mid x \ge 0,\ Ax \le \mathbf{1}\}\\
&\ge \max\{w^T x \mid x \ge 0,\ Ax \le \mathbf{1}\},
\end{aligned}
\tag{7.16}
$$

implying (7.15).

(II) Conversely, suppose that the stable-set polytope is determined by (7.12),
i.e., that the maximum in (7.13) is attained by the incidence vector of a stable
set, for each $w \in \mathbb{Z}_+^V$. To show that $G$ is perfect it suffices to
show that the minimum in (7.13) also has an integer optimum solution for each
$\{0, 1\}$-valued $w$. This will be done by induction on $\sum_{v \in V} w_v$.

Let $w$ be $\{0, 1\}$-valued, and let $y$ be a, not necessarily integral, optimum
solution for the minimum in (7.13). Let $K$ be a clique with $y_K > 0$, and let
$a := \chi^K$ (we may assume $a \le w$). Then the common value of

$$\max\{(w - a)^T x \mid x \ge 0,\ Ax \le \mathbf{1}\} = \min\{y^T \mathbf{1} \mid y \ge 0,\ y^T A \ge (w - a)^T\} \tag{7.17}$$

is less than the common value of (7.13), since by complementary slackness, each
optimum solution $x$ in (7.13) has $a^T x = 1$. However, the values in (7.13) and
(7.17) are integers (since by assumption, the maxima have integral optimum
solutions). Hence they differ by exactly 1. Moreover, by induction the minimum
in (7.17) has an integral optimum solution $y$. Increasing component $y_K$ of $y$
by 1 gives an integral optimum solution of (7.13). □


Equivalent to Theorem 7.14 is:

$$G \text{ is perfect} \iff \mathrm{STAB}(G) = A(\mathrm{STAB}(\overline{G})). \tag{7.18}$$

Note that the stable-set polytope of $G$ is determined by (7.12) if and only if
the stable-set polytope and the clique polytope of $G$ form an anti-blocking
pair of polyhedra. Here the clique polytope is the convex hull of the incidence
vectors of cliques.
The theory of anti-blocking polyhedra then gives directly the Perfect Graph
Theorem of Lovász (1972):


Lovász's Perfect Graph Theorem 7.19. The complement of a perfect graph is
perfect.


Proof. If $G$ is perfect, then
$\mathrm{STAB}(G) = A(\mathrm{STAB}(\overline{G}))$. Hence
$\mathrm{STAB}(\overline{G}) = A(A(\mathrm{STAB}(\overline{G}))) = A(\mathrm{STAB}(G))$.
Therefore, $\overline{G}$ is perfect. □


By (7.14), with the ellipsoid method, a maximum-weighted stable set in a
perfect graph $G$ can be found in polynomial time if and only if a
maximum-weighted clique in a perfect graph $G$ can be found in polynomial time.
Since the complement of a perfect graph is a perfect graph again, this would not
give any reduction of one problem to another.

However, an alternative approach does give a polynomial-time algorithm to
find a maximum-weighted stable set in a perfect graph (Grötschel et al. 1981,
1986, 1988). Let $G = (V, E)$ be a graph, with $V = \{1, \ldots, n\}$, say.
Consider the collection $M(G)$ of all matrices $Y = (y_{ij})_{i,j=0}^{n}$ in
$\mathbb{R}^{(n+1) \times (n+1)}$ satisfying

$$
\begin{aligned}
&\text{(i)} \quad Y \text{ is symmetric and positive semi-definite},\\
&\text{(ii)} \quad y_{00} = 1,\ y_{0i} = y_{ii} \quad (i = 1, \ldots, n),\\
&\text{(iii)} \quad y_{ij} = 0 \text{ if } i \ne j,\ \{i, j\} \in E.
\end{aligned}
\tag{7.20}
$$

These conditions imply that $M(G)$ is a convex set (not necessarily a polytope).
Let $\mathrm{TH}(G)$ be the set of all vectors $x \in \mathbb{R}^n$ for which
there exists a matrix $Y$ in $M(G)$ so that $x_i = y_{ii}$ for
$i = 1, \ldots, n$. So $\mathrm{TH}(G)$ is the projection of $M(G)$ on the
diagonal coordinates [excluding the (0, 0) coordinate].
Now $\mathrm{TH}(G)$ turns out to be an approximation of $\mathrm{STAB}(G)$, at
least as good as $A(\mathrm{STAB}(\overline{G}))$, in the following sense:


Theorem 7.21. $\mathrm{STAB}(G) \subseteq \mathrm{TH}(G) \subseteq A(\mathrm{STAB}(\overline{G}))$.


Proof. The first inclusion follows from the fact that for each stable set
$S \subseteq V$, the incidence vector $\chi^S$ belongs to $\mathrm{TH}(G)$, as
it is the projection of the matrix $Y$ in $M(G)$ defined by:

$$y_{ij} := \begin{cases} 1 & \text{if } i, j \in S \cup \{0\},\\ 0 & \text{otherwise}. \end{cases} \tag{7.22}$$

To see the second inclusion, first note that trivially each vector in
$\mathrm{TH}(G)$ is nonnegative (since the diagonal of a positive semi-definite
matrix is nonnegative). It next suffices to show: if $x \in \mathrm{TH}(G)$ and
$u$ is the incidence vector of a stable set in $\overline{G}$ (that is, of a
clique in $G$), then $u^T x \le 1$. To prove this, let $x$ be the projection of
$Y \in M(G)$. Since $Y$ is positive semi-definite we know:

$$(1, -u^T)\, Y \begin{pmatrix} 1 \\ -u \end{pmatrix} \ge 0. \tag{7.23}$$

As $y_{ij} = 0$ if $\{i, j\} \in E$, and as $u$ is the incidence vector of a
clique $K$ in $G$, (7.23) implies

$$1 - 2 \sum_{i \in K} y_{0i} + \sum_{i \in K} y_{ii} \ge 0. \tag{7.24}$$

Since $x_i = y_{ii} = y_{0i}$, this implies $u^T x \le 1$. □
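
The matrix of (7.22) is the rank-one matrix $v v^T$, where $v$ is the incidence
vector of $S \cup \{0\}$; this is why it is positive semi-definite. The short
sketch below builds it and checks symmetry and conditions (7.20) (ii), (iii) for
a stable set and for a non-stable set in a path on three vertices. The function
names and the example are assumptions for illustration; positive
semi-definiteness itself is not tested numerically.

```python
def rank_one_matrix(n, S):
    """Matrix (7.22): y_ij = 1 if i and j lie in S plus index 0, else 0."""
    v = [1] + [1 if i in S else 0 for i in range(1, n + 1)]
    return [[v[i] * v[j] for j in range(n + 1)] for i in range(n + 1)]

def satisfies_7_20_linear(Y, n, edges):
    """Check symmetry and conditions (ii), (iii) of (7.20).

    Positive semi-definiteness is not checked here; for rank_one_matrix it
    holds because the matrix has the form v v^T.
    """
    symmetric = all(Y[i][j] == Y[j][i]
                    for i in range(n + 1) for j in range(n + 1))
    cond_ii = Y[0][0] == 1 and all(Y[0][i] == Y[i][i]
                                   for i in range(1, n + 1))
    cond_iii = all(Y[i][j] == 0 for i, j in edges)
    return symmetric and cond_ii and cond_iii

# Path 1 - 2 - 3: {1, 3} is stable, {1, 2} is not.
N = 3
PATH_EDGES = [(1, 2), (2, 3)]
Y_STABLE = rank_one_matrix(N, {1, 3})
Y_BAD = rank_one_matrix(N, {1, 2})
DIAGONAL = [Y_STABLE[i][i] for i in range(1, N + 1)]
print(satisfies_7_20_linear(Y_STABLE, N, PATH_EDGES), DIAGONAL)
print(satisfies_7_20_linear(Y_BAD, N, PATH_EDGES))
```

The diagonal of the first matrix is the incidence vector of {1, 3}; the second
matrix violates (iii) at the entry for the edge {1, 2}.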


Theorem 7.21 implies that if
$\mathrm{STAB}(G) = A(\mathrm{STAB}(\overline{G}))$, i.e., if $G$ is perfect,
then $\mathrm{STAB}(G) = \mathrm{TH}(G)$. Now any linear objective function
$w^T x$ can be maximized over $\mathrm{TH}(G)$ in polynomial time. This follows
from the fact that any linear objective function can be maximized over $M(G)$ in
polynomial time, since we can solve the separation problem over $M(G)$ in
polynomial time. [The latter follows from the fact that we can test, for any
given $Y$ in $\mathbb{R}^{(n+1) \times (n+1)}$, the constraints in (7.20) in
polynomial time, in such a way that we find a separating hyperplane (in the
space $\mathbb{R}^{(n+1) \times (n+1)}$) if $Y$ does not belong to $M(G)$.]

So as a consequence we have: 


Theorem 7.25. There exists a polynomial-time algorithm finding a maximum- 
weight stable set in any given perfect graph. 


By symmetry, the same holds for finding a maximum-weight clique in a perfect 
graph. 


Application 7.26 (Matchings and edge-colorings). Let for any graph
$G = (V, E)$, $P_{\mathrm{mat}}(G)$ denote the matching polytope of $G$. By
scalar multiplication, we can normalize system (5.28) determining
$P_{\mathrm{mat}}(G)$ to $x \ge 0$, $Cx \le \mathbf{1}$, for a certain matrix $C$
(deleting the inequalities in (5.28) corresponding to $U \subseteq V$ with
$|U| \le 1$). The matching polytope is of type (7.4), and hence its
anti-blocking polyhedron $A(P_{\mathrm{mat}}(G))$ is equal to
$\{z \in \mathbb{R}_+^E \mid Dz \le \mathbf{1}\}$, where the rows of $D$ are the
incidence vectors of all matchings in $G$. So by (7.8), taking
$l = \mathbf{1}$:

$$\max\Bigl\{\Delta(G),\ \max_{U \subseteq V,\ |U| \ge 2} \frac{|\langle U \rangle|}{\lfloor \tfrac12 |U| \rfloor}\Bigr\} = \min\{y^T \mathbf{1} \mid y \ge 0,\ y^T D \ge \mathbf{1}^T\}. \tag{7.27}$$

Here $\langle U \rangle$ denotes the collection of all edges contained in $U$.

The minimum in (7.27) can be interpreted as the fractional edge-coloring
number $\chi^*(G)$ of $G$. If the minimum is attained by an integral optimum
solution $y$, it is equal to the edge-coloring number $\chi(G)$ of $G$, since

$$\chi(G) = \min\{y^T \mathbf{1} \mid y \ge 0,\ y^T D \ge \mathbf{1}^T,\ y \text{ integral}\}. \tag{7.28}$$

By Vizing's Theorem, $\chi(G) = \Delta(G)$ or $\chi(G) = \Delta(G) + 1$ if $G$ is
a simple graph. If $G$ is the Petersen graph, then
$\Delta(G) = \chi^*(G) = 3$ while $\chi(G) = 4$. Seymour (1979) conjectured that
for each, possibly nonsimple, graph one has
$\chi(G) \le \max\{\Delta(G) + 1, \lceil \chi^*(G) \rceil\}$.
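
The statement about the Petersen graph can be confirmed by a small backtracking
search: its edges cannot be properly colored with 3 colors, but 4 colors suffice.
The sketch below is a plain exhaustive search with illustrative names, not an
efficient edge-coloring algorithm.

```python
def edge_colorable(edges, k):
    """Decide by backtracking whether the edges admit a proper k-coloring."""
    color = {}

    def solve(i):
        if i == len(edges):
            return True
        a, b = edges[i]
        # Colors already used on earlier edges sharing an endpoint.
        forbidden = {color[e] for e in edges[:i] if a in e or b in e}
        for c in range(k):
            if c not in forbidden:
                color[edges[i]] = c
                if solve(i + 1):
                    return True
        color.pop(edges[i], None)
        return False

    return solve(0)

# Petersen graph: outer 5-circuit, five spokes, inner pentagram.
PETERSEN = ([(i, (i + 1) % 5) for i in range(5)]
            + [(i, i + 5) for i in range(5)]
            + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])

MAX_DEGREE = max(sum(v in e for e in PETERSEN) for v in range(10))
THREE_OK = edge_colorable(PETERSEN, 3)
FOUR_OK = edge_colorable(PETERSEN, 4)
print(MAX_DEGREE, THREE_OK, FOUR_OK)
```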


8. Cutting planes 


For any set $P \subseteq \mathbb{R}^n$, let the integer hull of $P$, denoted by
$P_I$, be

$$P_I := \mathrm{conv}\{x \mid x \in P,\ x \text{ integral}\}. \tag{8.1}$$

Trivially, if $P$ is bounded, then $P_I$ is a polytope. Meyer (1974) showed that
if $P$ is a rational polyhedron, then $P_I$ is a rational polyhedron again.

Most of the combinatorial results given above consist of a characterization of
the integer hull $P_I$ by linear inequalities for certain polyhedra $P$. For
example, the matching polytope is the integer hull of the polyhedron determined
by the inequalities (5.28) (i), (ii). For most combinatorial optimization
problems it is not difficult to describe a set of linear inequalities,
determining a polyhedron $P$, in which the integral vectors are exactly the
incidence vectors corresponding to the combinatorial optimization problem.
Hence, $P_I$ is the convex hull of these incidence vectors. However, it is
generally difficult to describe $P_I$ by linear inequalities (cf. section 9).

The cutting-plane method was introduced by Gomory (1960) to solve integer
linear programs. Chvátal (1973a) (and Schrijver 1980a, for the unbounded case)
derived from it the following iterative process characterizing $P_I$.

Define for any polyhedron PCR": 

Pis= a) H,, (8.2) 


7 orattonal affine 
halfspace with /7DP 


where a rational affine halfspace is a set H:= {x|c'x <5}, with c€Q" (c #0) 
and 6 € @. Clearly, we may assume that the components of ¢ are relatively prime 
integers, which implies 


= {x|[c'x <[6]}. (8.3) 


This usually makes the set P’ casy to characterize. 
For instance, for any rational m <n matrix and b € Q” we have 


{x| Ax <b}' = {x|(u"A)x <|u"b| for all u€Q% with u'A integral} , 


{x|x 20, Ax <b)’ = {(x|x=0; [u'Alx <|u"b} for all ue O7} 
(here |-| denotes component-wise lower integer parts). 

The halfspaces $H_I$ (more strictly, their bounding hyperplanes) are called cutting
planes.
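The rounding step behind (8.3) can be sketched in a few lines. The following is a minimal illustration, assuming a system $Ax \le b$ given as Python lists and a multiplier vector $u \ge 0$ with $u^T A$ integral; the function name `chvatal_gomory_cut` is hypothetical and not taken from the text.

```python
from fractions import Fraction
from math import floor


def chvatal_gomory_cut(A, b, u):
    """Return (c, d) for the cut c^T x <= d derived from Ax <= b.

    c = u^T A must be integral and u must be nonnegative; d is the
    lower integer part of u^T b, as in the rounding step (8.3).
    """
    m, n = len(A), len(A[0])
    u = [Fraction(ui) for ui in u]
    if any(ui < 0 for ui in u):
        raise ValueError("multipliers must be nonnegative")
    # Exact rational arithmetic avoids floating-point rounding errors.
    c = [sum(u[i] * Fraction(A[i][j]) for i in range(m)) for j in range(n)]
    if any(cj.denominator != 1 for cj in c):
        raise ValueError("u^T A must be integral")
    rhs = sum(u[i] * Fraction(b[i]) for i in range(m))
    return [int(cj) for cj in c], floor(rhs)


# Edge constraints x_v + x_w <= 1 of a triangle, each with multiplier 1/2.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
b = [1, 1, 1]
cut = chvatal_gomory_cut(A, b, [Fraction(1, 2)] * 3)
print(cut)  # ([1, 1, 1], 1)
```

In this example the combination $x_1 + x_2 + x_3 \le 3/2$ is valid for the original polyhedron, and rounding the right-hand side down yields the cutting plane $x_1 + x_2 + x_3 \le 1$.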

It can be shown that if $P$ is a rational polyhedron, then $P'$ is also a rational
polyhedron. Trivially, $P \subseteq H$ implies $P_I \subseteq H_I$, and hence $P_I \subseteq P'$. Now generally
$P'' \ne P'$, and repeating this operation we obtain a sequence of polyhedra
$P, P', P'', P''', \ldots$, satisfying

$P \supseteq P' \supseteq P'' \supseteq P''' \supseteq \cdots \supseteq P_I$. (8.5)


Denote the $(t+1)$th set in this sequence by $P^{(t)}$. Then:


Theorem 8.6. For each rational polyhedron $P$ there exists a number $t$ with
$P^{(t)} = P_I$.


A direct consequence applies to bounded, but not necessarily rational,
polyhedra.


Corollary 8.7. For each polytope $P$ there exists a number $t$ with $P^{(t)} = P_I$.
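As a minimal worked illustration (a one-dimensional example chosen here, not taken from the text), consider

\[
P = \{x \in \mathbb{R} \mid 0 \le x \le \tfrac12\}, \qquad P_I = \{0\}.
\]

The halfspace $H = \{x \mid x \le \tfrac12\}$ contains $P$, and rounding its right-hand side down gives $H_I = \{x \mid x \le 0\}$. The halfspace $\{x \mid -x \le 0\}$ also contains $P$ and equals its own integer hull. Hence

\[
P_I \subseteq P' \subseteq \{x \mid x \le 0\} \cap \{x \mid -x \le 0\} = \{0\} = P_I ,
\]

so in this example a single round of cutting planes suffices, i.e. $t = 1$.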


Blair and Jeroslow (1982) (cf. Cook et al. 1986b) proved the following
generalization of Theorem 8.6.


Theorem 8.8. For each rational matrix $A$ there exists a number $t$ such that for each
column vector $b$ one has: $\{x \mid Ax \le b\}^{(t)} = \{x \mid Ax \le b\}_I$.


Hence we can define the Chvátal rank of a rational matrix $A$ as the smallest
such number $t$. The strong Chvátal rank of $A$ then is the Chvátal rank of the
matrix

$\begin{pmatrix} I \\ -I \\ A \\ -A \end{pmatrix}$. (8.9)

It follows from Hoffman and Kruskal's Theorem (cf. Theorem 4.1) that an
integral matrix $A$ has strong Chvátal rank 0 if and only if it is totally
unimodular. Similar characterizations for higher Chvátal ranks are not known. In
Examples 8.10 and 9.3 we shall see some classes of matrices with strong Chvátal
rank 1.
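Total unimodularity means that every square submatrix has determinant 0, 1 or -1. For very small matrices this can be checked directly; the sketch below is a brute-force illustration only (exponential in the matrix size, and the helper names `det_int` and `is_totally_unimodular` are hypothetical), not an efficient recognition algorithm.

```python
from itertools import combinations


def det_int(M):
    """Determinant of a small square integer matrix by cofactor expansion."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        if a:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * a * det_int(minor)
    return total


def is_totally_unimodular(A):
    """Check every square submatrix of A for determinant in {-1, 0, 1}."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[i][j] for j in cols] for i in rows]
                if det_int(sub) not in (-1, 0, 1):
                    return False
    return True


# Edge-vertex incidence matrix of a path (bipartite): totally unimodular.
print(is_totally_unimodular([[1, 1, 0], [0, 1, 1]]))
# Edge-vertex incidence matrix of a triangle: determinant 2, so not.
print(is_totally_unimodular([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))
```

The triangle matrix fails the test because of its odd circuit, which is consistent with the odd-set and odd-circuit inequalities that appear in the examples of rank 1.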

For more on cutting planes, see Jeroslow (1978, 1979), and Blair and Jeroslow 
(1977, 1979, 1982). 


Example 8.10 (The matching polytope). For any graph $G = (V, E)$, let $P$ be the
polytope determined by (5.28) (i), (ii). So $P_I$ is the matching polytope of $G$. It is
not difficult to show that $P'$ is the polytope determined by (5.28) (i)-(iii). Hence
Edmonds' Matching Polyhedron Theorem 5.33 is equivalent to asserting $P' = P_I$.
So the matching polytope arises from (5.28) (i), (ii) by one "round" of cutting
planes.

It can be derived from Edmonds' Matching Polyhedron Theorem that each
integral matrix $A = (a_{ij})$ satisfying

$\sum_{i=1}^{m} |a_{ij}| \le 2$, $j = 1, \ldots, n$, (8.11)

where $A$ has order $m \times n$, has strong Chvátal rank at most 1.


9. Hard problems and the complexity of the integer hull 


The integer hull $P_I$ can be quite intractable compared with the polyhedron $P$. This
has been shown by Karp and Papadimitriou (1982), under the generally accepted
assumption NP $\ne$ co-NP.

First note that the ellipsoid method (cf. section 3) can also be used in the
negative: if NP $\ne$ P, then for any NP-complete problem there is no polynomial-time
algorithm for the separation problem for the corresponding polytopes. More
precisely, if for each graph $G = (V, E)$, $\mathcal{F}_G$ is a subset of $\mathcal{P}(E)$, and if
Optimization Problem 3.17 is NP-complete, then (if NP $\ne$ P) the Separation Problem
3.18 is not polynomially solvable.

In fact, Karp and Papadimitriou showed that for any class $(\mathcal{F}_G \mid G \text{ graph})$, if
Optimization Problem 3.17 is NP-complete, and if NP $\ne$ co-NP, then the class of
polytopes $\operatorname{conv}\{\chi^F \mid F \in \mathcal{F}_G\}$ has difficult facets, i.e.,


there exists no polynomial $\Phi$ such that for each graph $G = (V, E)$
and each $c \in \mathbb{Z}^E$ and $\delta \in \mathbb{Q}$ with the property that $c^T x \le \delta$ defines a
facet of $\operatorname{conv}\{\chi^F \mid F \in \mathcal{F}_G\}$, the fact that $c^T x \le \delta$ is valid for each $\chi^F$
with $F \in \mathcal{F}_G$ has a proof of length at most
$\Phi(|V| + |E| + \operatorname{size}(c) + \operatorname{size}(\delta))$. (9.1)


The meaning of (9.1) might become clear by considering description (5.28) of the 
matching polytope: although (5.28) consists of exponentially many inequalities, 
each facet-defining inequality is of form (5.28), and for them it is easy to show 
that they are valid for the matching polytope. 

Another negative result was given by Boyd and Pulleyblank (1984): let, for a
given class $(\mathcal{F}_G \mid G \text{ graph})$, for each graph $G = (V, E)$ the polytope $P_G$ in $\mathbb{R}^E$
satisfy $(P_G)_I = \operatorname{conv}\{\chi^F \mid F \in \mathcal{F}_G\}$ and have the property that


given $G = (V, E)$ and $c \in \mathbb{Q}^E$, find $\max\{c^T x \mid x \in P_G\}$ (9.2)


is polynomially solvable. Then if Optimization Problem 3.17 is NP-complete and
if NP $\ne$ co-NP, then there is no fixed $t$ such that for each graph $G$,
$(P_G)^{(t)} = \operatorname{conv}\{\chi^F \mid F \in \mathcal{F}_G\}$.

Similar results hold for subcollections $\mathcal{F}_G$ of $\mathcal{P}(V)$ and for directed graphs. See
also Papadimitriou (1984) and Papadimitriou and Yannakakis (1982) for the
complexity of facets.


Example 9.3 (The stable-set polytope). Let $G = (V, E)$ be a graph, and let
STAB($G$) be the stable-set polytope of $G$. Let $P(G)$ be the polytope in $\mathbb{R}^V$
determined by

(i) $x_v \ge 0$, $v \in V$,
(ii) $\sum_{v \in K} x_v \le 1$, $K \subseteq V$, $K$ clique. (9.4)

So $P(G) = A(\mathrm{STAB}(\overline{G}))$ (cf. Section 7).
Clearly, $\mathrm{STAB}(G) \subseteq P(G)$. In fact, since the integral solutions to (9.4) are
exactly the incidence vectors of stable sets, we have

$\mathrm{STAB}(G) = P(G)_I$. (9.5)
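The claim that the integral solutions of (9.4) are exactly the incidence vectors of stable sets can be checked by enumeration on a small graph. The sketch below is a brute-force illustration under the assumption that vertices are numbered 0..n-1; the function names are hypothetical. Only 0/1 vectors need to be enumerated, since singleton cliques give $x_v \le 1$ and (i) gives $x_v \ge 0$.

```python
from itertools import combinations, product


def cliques(n, edges):
    """All nonempty cliques of the graph on vertices 0..n-1 (brute force)."""
    E = {frozenset(e) for e in edges}
    found = []
    for k in range(1, n + 1):
        for K in combinations(range(n), k):
            if all(frozenset(p) in E for p in combinations(K, 2)):
                found.append(K)
    return found


def integral_points_of_clique_system(n, edges):
    """0/1 vectors satisfying every clique inequality in (9.4)."""
    Ks = cliques(n, edges)
    return {x for x in product((0, 1), repeat=n)
            if all(sum(x[v] for v in K) <= 1 for K in Ks)}


def stable_set_vectors(n, edges):
    """Incidence vectors of stable sets, straight from the definition."""
    return {x for x in product((0, 1), repeat=n)
            if all(not (x[v] and x[w]) for v, w in edges)}


# The 5-circuit: 1 empty set, 5 singletons and 5 nonadjacent pairs.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
same = integral_points_of_clique_system(5, c5) == stable_set_vectors(5, c5)
print(same, len(stable_set_vectors(5, c5)))
```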


Chvátal (1973a, 1984) showed that there is no fixed $t$ such that $P(G)^{(t)} = P(G)_I$
for all graphs $G$ (if NP $\ne$ co-NP, this follows from Boyd and Pulleyblank's result
mentioned above), even if we restrict $G$ to graphs with $\alpha(G) = 2$.

By Chvátal's Theorem 7.14, the class of graphs with $P(G)_I = P(G)$ is exactly
the class of perfect graphs. In Example 8.10 we mentioned Edmonds' result that if
$G$ is the line graph of some graph $H$, then $P(G)' = P(G)_I$, which is the matching
polytope of $H$.

The smallest $t$ for which $P(G)^{(t)} = P(G)_I$ is an indication of the computational
complexity of the stability number $\alpha(G)$. Chvátal (1973a) raised the question of
whether there exists, for each fixed $t$, a polynomial-time algorithm determining
$\alpha(G)$ for graphs $G$ with $P(G)^{(t)} = P(G)_I$. This is true for $t = 0$, i.e., for perfect
graphs (Grötschel et al. 1981).

Minty (1980) and Sbihi (1978, 1980) extended Edmonds' result of the polynomial
solvability of the maximum-weighted matching problem, by describing
polynomial-time algorithms for finding a maximum-weighted stable set in
$K_{1,3}$-free graphs (i.e., graphs with no $K_{1,3}$ as induced subgraph). Hence, by (3.9), the
separation problem for stable-set polytopes of $K_{1,3}$-free graphs is polynomially
solvable. Yet no explicit description of a linear inequality system defining
STAB($G$) for $K_{1,3}$-free graphs has been found. This would extend Edmonds'
description of the matching polytope. It follows from Chvátal's result mentioned
above that there is no fixed $t$ such that $P(G)^{(t)} = P(G)_I$ for all $K_{1,3}$-free graphs
(see Giles and Trotter 1981).

Perhaps the most natural "relaxation" of the stable-set polytope of $G = (V, E)$
is the polytope $Q(G)$ determined by

(i) $x_v \ge 0$, $v \in V$,
(ii) $x_v + x_w \le 1$, $\{v, w\} \in E$. (9.6)

Again, $Q(G)_I = \mathrm{STAB}(G)$. Since $Q(G) \supseteq P(G)$, there is no $t$ with
$Q(G)^{(t)} = Q(G)_I$ for all $G$. It is not difficult to see that $Q(G)'$ is the polytope
determined by (9.6) together with

$\sum_{v \in C} x_v \le \frac{|C| - 1}{2}$, $C$ is the vertex set of an odd circuit. (9.7)
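The validity of the odd-circuit inequalities (9.7) for $Q(G)'$ can be sketched by the standard rounding argument; the derivation below is an illustration of that argument, not quoted from the text. Let $C$ be the vertex set of an odd circuit with $|C| = k$. Each vertex of $C$ lies on exactly two edges of the circuit, so adding the $k$ edge inequalities (9.6)(ii) along the circuit gives

\[
2 \sum_{v \in C} x_v \le k
\quad\Longrightarrow\quad
\sum_{v \in C} x_v \le \frac{k}{2} .
\]

This halfspace contains $Q(G)$ and has an integral left-hand side with relatively prime coefficients, so by (8.3) its integer hull is

\[
\sum_{v \in C} x_v \le \left\lfloor \frac{k}{2} \right\rfloor = \frac{|C| - 1}{2},
\]

where the last equality uses that $k$ is odd.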


It was shown by Gerards and Schrijver (1986) that if $G$ has no subgraph $H$ which
arises from $K_4$ by replacing edges by paths such that each triangle in $K_4$ becomes
an odd circuit in $H$, then $Q(G)' = \mathrm{STAB}(G)$. Graphs $G$ with $Q(G)' = \mathrm{STAB}(G)$
are called by Chvátal (1975) t-perfect.

Gerards and Schrijver showed more generally the following. Let $A = (a_{ij})$ be an
integral $m \times n$ matrix satisfying

$\sum_{j=1}^{n} |a_{ij}| \le 2$, $i = 1, \ldots, m$. (9.8)

Then $A$ has strong Chvátal rank at most 1 if and only if $A$ cannot be transformed
to the matrix

$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}$ (9.9)

by a series of the following operations: deleting or permuting rows or columns, or
multiplying them by $-1$; replacing $\begin{pmatrix} 1 & c^T \\ b & D \end{pmatrix}$ by $D - bc^T$, where $D$ is a matrix and $b$
and $c$ are column vectors.

Chvátal (1973a) showed that for $G = K_n$, the smallest $t$ with $Q(G)^{(t)} = \mathrm{STAB}(G)$
is about $\log n$.

Chvátal (1975) observed that the incidence vectors of two stable sets $C$, $C'$ are
adjacent on the stable-set polytope if and only if $C \,\triangle\, C'$ induces a connected
graph. For more on the stable-set polytope, see Fulkerson (1971), Chvátal
(1973a, 1975, 1984, 1985), Padberg (1973, 1974, 1977, 1979), Nemhauser and
Trotter (1974, 1975), Trotter (1975), Wolsey (1976b), Balas and Zemel (1977),
Ikura and Nemhauser (1985), Grötschel et al. (1986), and Lovász and Schrijver
(1991).
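Chvátal's adjacency criterion is easy to test computationally. The following is a minimal sketch, assuming stable sets are given as Python sets of vertices and the graph as an edge list; the function name `adjacent_on_stab` is hypothetical, and the inputs are assumed (not checked) to be stable sets.

```python
def adjacent_on_stab(C1, C2, edges):
    """True if the symmetric difference of C1 and C2 induces a connected graph."""
    D = set(C1) ^ set(C2)
    if not D:
        return False  # identical sets give the same vertex, not an edge
    # Adjacency lists of the subgraph induced by D.
    adj = {v: set() for v in D}
    for v, w in edges:
        if v in D and w in D:
            adj[v].add(w)
            adj[w].add(v)
    # Depth-first search from an arbitrary vertex of D.
    start = next(iter(D))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == D


path = [(0, 1), (1, 2), (2, 3)]
print(adjacent_on_stab({0, 2}, {1, 3}, path))  # symmetric difference is the whole path
print(adjacent_on_stab({0}, {2}, path))        # vertices 0 and 2 are not joined by an edge
```

In the second call the two incidence vectors are not adjacent, which agrees with the observation that their midpoint is also the midpoint of the incidence vectors of the empty set and of {0, 2}.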


Example 9.10 (The traveling-salesman polytope). For any graph $G = (V, E)$, the
traveling-salesman polytope is equal to $\operatorname{conv}\{\chi^H \mid H \subseteq E,\ H \text{ Hamiltonian circuit}\}$.
As the traveling salesman problem is NP-complete, by Karp and Papadimitriou's
result, the traveling-salesman polytope will have "difficult" facets [cf. (9.1)] if
NP $\ne$ co-NP.

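Define the polyhedron $P \subseteq \mathbb{R}^E$ by:

(i) $0 \le x_e \le 1$, $e \in E$,
(ii) $\sum_{e \ni v} x_e = 2$, $v \in V$,
(iii) $\sum_{e \in \delta(U)} x_e \ge 2$, $U \subseteq V$, $3 \le |U| \le |V| - 3$. (9.11)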


Since the integral solutions of (9.11) are exactly the incidence vectors of
Hamiltonian circuits, $P_I$ is equal to the traveling-salesman polytope. Note that the
problem of minimizing a linear function $c^T x$ over $P$ is polynomially solvable, with
the ellipsoid method, since system (9.11) can be checked in polynomial time [(iii)
can be checked by reduction to a minimum-cut problem]. So if NP $\ne$ co-NP, by
Boyd and Pulleyblank's result, there is no fixed $t$ such that $P^{(t)} = P_I$ for each
graph $G$.
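The reduction of (iii) to a minimum-cut problem can be sketched as follows. If $x$ satisfies (i) and (ii), then every set $U$ of one or two vertices already has $\sum_{e \in \delta(U)} x_e \ge 2$ (for $U = \{v, w\}$ joined by an edge $e$ the sum is $4 - 2x_e$), and the same holds for complements of such sets, so (iii) holds exactly when the minimum cut of the graph with edge weights $x_e$ is at least 2. The code below uses the Stoer-Wagner minimum-cut algorithm as one possible choice; the text does not say which minimum-cut method is meant, and the function names are hypothetical.

```python
def global_min_cut(n, weights):
    """Stoer-Wagner: weight of a minimum cut on vertices 0..n-1 (n >= 2).

    weights maps vertex pairs (v, w) to nonnegative edge weights.
    """
    w = [[0] * n for _ in range(n)]
    for (a, b), x in weights.items():
        w[a][b] += x
        w[b][a] += x
    active = list(range(n))
    best = float("inf")
    while len(active) > 1:
        # Minimum-cut phase: repeatedly add the most tightly connected vertex.
        conn = {v: 0 for v in active}
        remaining = set(active)
        order = []
        while remaining:
            v = max(remaining, key=lambda u: conn[u])
            remaining.remove(v)
            order.append(v)
            for u in remaining:
                conn[u] += w[v][u]
        s, t = order[-2], order[-1]
        # Cut of the phase: the last vertex against all the others.
        best = min(best, conn[t])
        # Merge t into s.
        for u in active:
            if u != s and u != t:
                w[s][u] += w[t][u]
                w[u][s] = w[s][u]
        active.remove(t)
    return best


def satisfies_subtour_constraints(n, x):
    """Check (iii) for a vector x that is assumed to satisfy (i) and (ii)."""
    return global_min_cut(n, x) >= 2


cycle = {(i, (i + 1) % 6): 1 for i in range(6)}
two_triangles = {(0, 1): 1, (1, 2): 1, (0, 2): 1, (3, 4): 1, (4, 5): 1, (3, 5): 1}
print(satisfies_subtour_constraints(6, cycle))          # Hamiltonian circuit
print(satisfies_subtour_constraints(6, two_triangles))  # two disjoint triangles
```

The two disjoint triangles satisfy the degree equations (ii) but violate (iii) for $U$ equal to one triangle, which is why the cut constraints are needed to exclude subtours.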

The system (9.11), however, has been useful in solving large-scale instances
of the traveling salesman problem: for any $c \in \mathbb{Q}^E$, the minimum of $c^T x$ over
(9.11) is a lower bound for the traveling salesman problem, which can be
computed with the simplex method using a row-generating technique. This lower
bound can be used in a "branch-and-bound" procedure for the traveling salesman
problem.

This approach was initiated by Dantzig et al. (1954, 1959), and developed and
sharpened by Miliotis (1978), Grötschel and Padberg (1979a,b), Grötschel
(1980), Crowder and Padberg (1980), and Padberg and Hong (1980) (see
Grötschel and Padberg 1985, and Padberg and Grötschel 1985 for a
survey).

Grötschel and Padberg (1979a) showed that the dimension of the traveling-salesman
polytope for $G = K_n$ is equal to $\frac{1}{2} n(n - 3)$. They also proved that for
complete graphs all inequalities in (9.11) are facet-defining.

For more about facets of the traveling-salesman polytope, see Held and Karp
(1970, 1971), Chvátal (1973b), Grötschel and Padberg (1975, 1977, 1979a,b),
Maurras (1975), Grötschel (1977, 1980), Grötschel and Pulleyblank (1986),
Grötschel and Wakabayashi (1981a,b), and Cornuéjols and Pulleyblank (1982).

Papadimitriou and Yannakakis (1982) showed that it is co-NP-complete to
decide if a given vector belongs to the traveling-salesman polytope. Moreover,
Papadimitriou (1978) showed that it is co-NP-complete to check if two Hamiltonian
circuits $H$, $H'$ yield adjacent incidence vectors (see also Rao 1976).

On the other hand, Padberg and Rao (1974) showed that the diameter of the
"asymmetric" traveling-salesman polytope (i.e., convex hull of incidence vectors
of Hamiltonian cycles in a directed graph) is equal to 2, for the complete directed
graph with at least six vertices. Grötschel and Padberg (1985) conjecture that also
the "undirected" traveling-salesman polytope has diameter 2.


Other hard problems 9.12. The following references deal with polyhedra associated
with further difficult problems. Set-packing problem: Fulkerson (1971),
Padberg (1973, 1977, 1979), Balas and Zemel (1977), and Ikura and Nemhauser
(1985). Set-covering problem: Padberg (1979), Balas (1980), and Balas and Ho
(1980). Set-partitioning problem: Balas and Padberg (1972), Balas (1977),
Padberg (1979), and Johnson (1980). Linear ordering and acyclic subgraph
problem: Grötschel et al. (1984, 1985a,b), and Jünger (1985). Knapsack problem
and 0,1-programming: Balas (1975), Hammer et al. (1975), Wolsey (1975, 1976a,
1977), Johnson (1980), Zemel (1978), and Crowder et al. (1983). Bipartite
subgraph and maximum-cut problem: Grötschel and Pulleyblank (1981),
Barahona (1983a,b), and Barahona et al. (1985).

For more background information on hard problems, see Grötschel (1977,
1982).


References

Ahrens, J.H.
[1981] A conjecture of E.D. Bolker, J. Combin. Theory B 31, 1-8.
Araoz, J.
[1973] Polyhedral neopolarities, Ph.D. Thesis (University of Waterloo, Waterloo).
Araoz, J., J. Edmonds and V.J. Griffin
[1983] Polarities given by systems of bilinear inequalities, Math. Oper. Res. 8, 34-41.
Balas, E.
[1975] Facets of the knapsack polytope, Math. Programming 8, 146-164.
[1977] Some valid inequalities for the set partitioning problem, Ann. Discrete Math. 1, 13-47.
[1980] Cutting planes from conditional bounds: a new approach to set covering, Math. Programming
Stud. 12, 19-36.
Balas, E., and A. Ho
[1980] Set covering algorithms using cutting planes, heuristics and subgradient optimization: a
computational survey, Math. Programming Stud. 12, 37-69.
Balas, E., and M.W. Padberg
[1972] On the set-covering problem, Oper. Res. 20, 1152-1161.
Balas, E., and E. Zemel
[1977] Critical cutsets of graphs and canonical facets of set-packing polytopes, Math. Oper. Res. 2, 15-19.
Balinski, M.
[1983] Signatures des points extrêmes du polyèdre dual du problème de transport, C.R. Acad. Sci. Paris
Sér. A-B 296, 457-459.
[1984] The Hirsch conjecture for dual transportation polyhedra, Math. Oper. Res. 9, 629-633.
Balinski, M.L.
[1974] On two special classes of transportation polytopes, Math. Programming Stud. 1, 43-58.
[1985] Signature methods for the assignment problem, Oper. Res. 33, 527-536.
Balinski, M.L., and A. Russakoff
[1974] On the assignment polytope, SIAM Rev. 16, 516-525.
[1984] Faces of dual transportation polyhedra, Math. Programming Stud. 22, 1-8.
Barahona, F.
[1983a] The max-cut problem on graphs not contractible to $K_5$, Oper. Res. Lett. 2, 107-111.
[1983b] On some weakly bipartite graphs, Oper. Res. Lett. 2, 239-242.
Barahona, F., M. Grötschel and A.R. Mahjoub
[1985] Facets of the bipartite subgraph polytope, Math. Oper. Res. 10, 340-358.
Baum, S., and L.E. Trotter Jr
[1977] Integer rounding and polyhedral decomposition of totally unimodular systems, in: Optimization and
Operations Research, eds. R. Henn, B. Korte and W. Oettli (Springer, Berlin) pp. 15-23.
Bertsekas, D.P.
[1981] A new algorithm for the assignment problem, Math. Programming 21, 152-171.
Birkhoff, G.
[1946] Tres observaciones sobre el algebra lineal, Rev. Fac. Ci. Exactas, Puras y Aplicadas Univ. Nac.
Tucuman, Ser. A 5, 147-151.
Blair, C.E., and R.G. Jeroslow
[1977] The value function of a mixed integer program: I, Discrete Math. 19, 121-138.
[1979] The value function of a mixed integer program: II, Discrete Math. 25, 7-19.
[1982] The value function of an integer program, Math. Programming 23, 237-273.
Bland, R.G.
[1978] Elementary vectors and two polyhedral relaxations, Math. Programming Stud. 8, 159-166.
Bolker, E.D.
[1972] Transportation polytopes, J. Combin. Theory B 13, 251-262.




Boyd, S.C., and W.R. Pulleyblank
[1984] Facet generating techniques, to appear.
Camion, P.
[1965] Characterizations of totally unimodular matrices, Proc. Amer. Math. Soc. 16, 1068-1073.
Chvátal, V.
[1973a] Edmonds polytopes and a hierarchy of combinatorial problems, Discrete Math. 4, 305-337.
[1973b] Edmonds polytopes and weakly Hamiltonian graphs, Math. Programming 5, 29-40.
[1975] On certain polytopes associated with graphs, J. Combin. Theory B 18, 138-154.
[1984] Cutting-plane Proofs and the Stability Number of a Graph, Report 84326-OR (Institut für Operations
Research, Universität Bonn, Bonn).
[1985] Cutting planes in combinatorics, European J. Combin. 6, 217-226.
Cook, W.
[1983] Operations that preserve total dual integrality, Oper. Res. Lett. 2, 31-35.
[1986] On box totally dual integral polyhedra, Math. Programming 34, 48-61.
Cook, W., L. Lovász and A. Schrijver
[1984] A polynomial-time test for total dual integrality in fixed dimension, Math. Programming Stud. 22,
64-69.
Cook, W., J. Fonlupt and A. Schrijver
[1986a] An integer analogue of Carathéodory's theorem, J. Combin. Theory B 40, 63-70.
Cook, W., A.M.H. Gerards, A. Schrijver and É. Tardos
[1986b] Sensitivity theorems in integer linear programming, Math. Programming 34, 251-264.
Cornuéjols, G., and W.R. Pulleyblank
[1982] The travelling salesman polytope and {0,2}-matching, Ann. Discrete Math. 16, 27-55.
Crowder, H., and M.W. Padberg
[1980] Solving large-scale symmetric travelling salesman problems to optimality, Management Sci. 26,
495-509.
Crowder, H., E.L. Johnson and M.W. Padberg
[1983] Solving large-scale zero-one linear programming problems, Oper. Res. 31, 803-834.
Cunningham, W.H.
[1979] Theoretical properties of the network simplex method, Math. Oper. Res. 4, 196-208.
[1984] Testing membership in matroid polyhedra, J. Combin. Theory B 36, 161-188.
Cunningham, W.H., and A.B. Marsh III
[1978] A primal algorithm for optimal matching, in: Polyhedral Combinatorics (dedicated to the memory
of D.R. Fulkerson), eds. M.L. Balinski and A.J. Hoffman, Math. Programming Stud. 8, 50-72.
Dantzig, G.B.
[1951a] Maximization of a linear function of variables subject to linear inequalities, in: Activity Analysis
of Production and Allocation, ed. Tj.C. Koopmans (Wiley, New York) pp. 339-347.
[1951b] Application of the simplex method to a transportation problem, in: Activity Analysis of Production
and Allocation, ed. Tj.C. Koopmans (Wiley, New York) pp. 359-373.
[1963] Linear Programming and Extensions (Princeton University Press, Princeton, NJ).
Dantzig, G.B., D.R. Fulkerson and S.M. Johnson
[1954] Solution of a large-scale traveling-salesman problem, Oper. Res. 2, 393-410.
Dantzig, G.B., L.R. Ford and D.R. Fulkerson
[1956] A primal-dual algorithm for linear programs, in: Linear Inequalities and Related Systems, eds.
H.W. Kuhn and A.W. Tucker (Princeton University Press, Princeton, NJ) pp. 171-181.
Dantzig, G.B., D.R. Fulkerson and S.M. Johnson
[1959] On a linear programming, combinatorial approach to the travelling-salesman problem, Oper. Res. 7,
58-66.
Dinits, E.A.
[1970] Algorithm for solution of a problem of maximum flow in a network with power estimation, Soviet
Math. Dokl. 11 (1970) 1277-1280.




Edmonds, J.
[1965] Maximum matching and a polyhedron with 0,1-vertices, J. Res. Nat. Bur. Standards B 69, 125-130.
[1967] Optimum branchings, J. Res. Nat. Bur. Standards B 71, 233-240.
[1970] Submodular functions, matroids and certain polyhedra, in: Combinatorial Structures and Their
Applications, ed. R. Guy (Gordon and Breach, New York) pp. 69-87.
[1971] Matroids and the greedy algorithm, Math. Programming 1, 127-136.
[1973] Edge-disjoint branchings, in: Combinatorial Algorithms, ed. R. Rustin (Academic Press, New York)
pp. 91-96.
[1979] Matroid intersection, Ann. Discrete Math. 4, 39-49.
Edmonds, J., and R. Giles
[1977] A min-max relation for submodular functions on graphs, Ann. Discrete Math. 1, 185-204.
[1984] Total dual integrality of linear inequality systems, in: Progress in Combinatorial Optimization, ed.
W.R. Pulleyblank (Academic Press, Toronto) pp. 117-129.
Edmonds, J., and R.M. Karp
[1972] Theoretical improvements in algorithmic efficiency for network flow problems, J. Assoc. Comput.
Mach. 19, 248-264.
Egerváry, J.
[1931] Matrixok kombinatorius tulajdonságairól, Mat. Fiz. Lapok 38, 16-28.
Ford Jr, L.R., and D.R. Fulkerson
[1957] A simple algorithm for finding maximal network flows and an application to the Hitchcock problem,
Canad. J. Math. 9, 210-218.
Frank, A.
[1980] On the orientation of graphs, J. Combin. Theory B 28, 251-261.
Frank, A., and É. Tardos
[1984] Matroids from crossing families, in: Finite and Infinite Sets I, eds. A. Hajnal, R. Rado and V.T. Sós
(North-Holland, Amsterdam) pp. 295-304.
[1985] An application of simultaneous approximation in combinatorial optimization, in: 26th Annu. Symp.
on Foundations of Computer Science (IEEE Computer Society Press, New York) pp. 459-463.
Fulkerson, D.R.
[1970a] Blocking polyhedra, in: Graph Theory and its Applications, ed. B. Harris (Academic Press, New
York) pp. 93-112.
[1970b] The perfect graph conjecture and pluperfect graph theorem, in: Proc. 2nd Chapel Hill Conf. on
Combinatorial Mathematics and Its Applications, eds. R.C. Bose et al. (University of North Carolina,
Chapel Hill, NC) pp. 171-175.
[1971] Blocking and anti-blocking pairs of polyhedra, Math. Programming 1, 168-194.
[1972] Anti-blocking polyhedra, J. Combin. Theory B 12, 50-71.
[1973] On the perfect graph theorem, in: Mathematical Programming, eds. T.C. Hu and S.M. Robinson
(Academic Press, New York) pp. 69-76.
[1974] Packing rooted directed cuts in a weighted directed graph, Math. Programming 6, 1-13.
Gerards, A.M.H., and A. Schrijver
[1986] Matrices with the Edmonds-Johnson property, Combinatorica 6, 365-379.
Ghouila-Houri, A.
[1962] Caractérisation des matrices totalement unimodulaires, C.R. Acad. Sci. Paris 254, 1192-1194.
Giles, F.R.
[1975] Submodular functions, graphs and integer polyhedra, Ph.D. Thesis (University of Waterloo, Waterloo,
Ont.).
Giles, F.R., and W.R. Pulleyblank
[1979] Total dual integrality and integral polyhedra, Linear Algebra Appl. 25, 191-196.
Giles, R.
[1978] Facets and other faces of branching polyhedra, in: Combinatorics I, eds. A. Hajnal and V.T. Sós
(North-Holland, Amsterdam) pp. 401-418.
Giles, R., and L.E. Trotter Jr
[1981] On stable set polyhedra for $K_{1,3}$-free graphs, J. Combin. Theory B 31, 313-326.




Goldfarb, D.
[1985] Efficient dual simplex algorithms for the assignment problem, Math. Programming 33, 187-203.
Gomory, R.E.
[1960] Solving linear programs in integers, in: Combinatorial Analysis, eds. R. Bellman and M. Hall Jr
(American Mathematical Society, Providence, RI) pp. 211-215.
[1963] An algorithm for integer solutions to linear programs, in: Recent Advances in Mathematical
Programming, eds. R.L. Graves and P. Wolfe (McGraw-Hill, New York) pp. 269-302.
Griffin, V.
[1977] Polyhedral polarity, Ph.D. Thesis (University of Waterloo, Waterloo, Ont.).
Griffin, V., J. Araoz and J. Edmonds
[1982] Polyhedral polarity defined by a general bilinear inequality, Math. Programming 23, 117-137.
Grötschel, M.
[1977] Polyedrische Charakterisierungen kombinatorischer Optimierungsprobleme (Verlag Anton Hain,
Meisenheim am Glan).
[1980] On the symmetric travelling salesman problem: solution of a 120-city problem, Math. Programming
Stud. 12, 61-77.
[1982] Approaches to hard combinatorial optimization problems, in: Modern Applied Mathematics -
Optimization and Operations Research, ed. B. Korte (North-Holland, Amsterdam) pp. 437-515.
Grötschel, M., and M.W. Padberg
[1975] Partial linear characterization of the asymmetric travelling salesman polytope, Math. Programming 8,
378-381.
[1977] Lineare Charakterisierungen von Travelling Salesman Problemen, Z. Oper. Res. 21, 33-64.
[1979a] On the symmetric travelling salesman problem I: Inequalities, Math. Programming 16, 265-280.
[1979b] On the symmetric travelling salesman problem II: Lifting theorems and facets, Math.
Programming 16, 281-302.
[1985] Polyhedral theory, in: The Traveling Salesman Problem, A Guided Tour through Combinatorial
Optimization, eds. E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan and D.B. Shmoys (Wiley,
Chichester) pp. 251-305.
Grötschel, M., and W.R. Pulleyblank
[1981] Weakly bipartite graphs and the max-cut problem, Oper. Res. Lett. 1, 23-27.
[1986] Clique tree inequalities and the symmetric travelling salesman problem, Math. Oper. Res. 11,
537-569.
Grötschel, M., and Y. Wakabayashi
[1981a] On the structure of the monotone asymmetric travelling salesman polytope I: hypohamiltonian facets,
Discrete Math. 34, 43-59.
[1981b] On the structure of the monotone asymmetric travelling salesman polytope II: hypotraceable facets,
Math. Programming Stud. 14, 77-97.
Grötschel, M., L. Lovász and A. Schrijver
[1981] The ellipsoid method and its consequences in combinatorial optimization, Combinatorica 1,
169-197.
Grötschel, M., M. Jünger and G. Reinelt
[1984] A cutting plane algorithm for the linear ordering problem, Oper. Res. 32, 1195-1220.
[1985a] On the acyclic subgraph polytope, Math. Programming 33, 1-27.
[1985b] Facets of the linear ordering polytope, Math. Programming 33, 43-60.
Grötschel, M., L. Lovász and A. Schrijver
[1986] Relaxations of vertex packing, J. Combin. Theory B 40, 330-343.
[1988] Geometric Algorithms and Combinatorial Optimization (Springer, Heidelberg).
Grünbaum, B.
[1967] Convex Polytopes (Interscience-Wiley, London).
Gupta, R.P.
[1967] A decomposition theorem for bipartite graphs, in: Theory of Graphs, ed. P. Rosenstiehl (Gordon and
Breach, New York) pp. 135-138.




Hammer, P.L., E.L. Johnson and U.N. Peled
[1975] Facets of regular 0-1 polytopes, Math. Programming 8, 179-206.
Hausmann, D., and B. Korte
[1978] Colouring criteria for adjacency on 0-1 polyhedra, Math. Programming Stud. 8, 106-127.
Held, M., and R.M. Karp
[1970] The traveling-salesman problem and minimum spanning trees, Oper. Res. 18, 1138-1162.
[1971] The traveling-salesman problem and minimum spanning trees: Part II, Math. Programming 1, 6-25.
Hitchcock, F.L.
[1941] The distribution of a product from several sources to numerous localities, J. Math. Phys. 20, 224-230.
Hoffman, A.J.
[1960] Some recent applications of the theory of linear inequalities to extremal combinatorial analysis,
in: Combinatorial Analysis, eds. R. Bellman and M. Hall Jr (American Mathematical Society,
Providence, RI) pp. 113-127.
[1974] A generalization of max flow-min cut, Math. Programming 6, 352-359.
[1979] The role of unimodularity in applying linear inequalities to combinatorial theorems, Ann. Discrete
Math. 4, 73-84.
Hoffman, A.J., and J.B. Kruskal
[1956] Integral boundary points of convex polyhedra, in: Linear Inequalities and Related Systems, eds.
H.W. Kuhn and A.W. Tucker (Princeton University Press, Princeton, NJ) pp. 223-246.
Huang, H.-C., and L.E. Trotter Jr
[1980] A technique for determining blocking and anti-blocking polyhedral descriptions, Math. Programming
Stud. 12, 197-205.
Hung, M.S.
[1983] A polynomial simplex method for the assignment problem, Oper. Res. 31, 595-600.
Hurkens, C.A.J.
[1991] On the diameter of the edge cover polytope, J. Combin. Theory B 51, 271-276.
Ikura, Y., and G.L. Nemhauser
[1983] A Polynomial-time Dual Simplex Algorithm for the Transportation Problem, Tech. Report 602
(School of Operations Research and Information Engineering, Cornell University, Ithaca, NY).
[1985] Simplex pivots on the set packing polytope, Math. Programming 33, 123-138.
Jeroslow, R.
[1979] An introduction to the theory of cutting-planes, Ann. Discrete Math. 5, 71-95.
Jeroslow, R.G.
[1978] Cutting-plane theory: algebraic methods, Discrete Math. 23, 121-150.
Johnson, E.L.
[1978] Support functions, blocking pairs and anti-blocking pairs, Math. Programming Stud. 8, 167-196.
[1980] Subadditive lifting methods for partitioning and knapsack problems, J. Algorithms 1, 75-96.
Jünger, M.
[1985] Polyhedral Combinatorics and the Acyclic Subgraph Problem (Heldermann, Berlin).
Karmarkar, N.
[1984] A new polynomial-time algorithm for linear programming, Combinatorica 4, 373-395.
Karp, R.M., and C.H. Papadimitriou
[1982] On linear characterizations of combinatorial optimization problems, SIAM J. Comput. 11, 620-632.
Khachiyan, L.G.
[1979] A polynomial algorithm in linear programming, Soviet Math. Dokl. 20, 191-194.
Klee, V., and D.W. Walkup
[1967] The d-step conjecture for polyhedra of dimension d < 6, Acta Math. (Uppsala) 117, 53-78.
Klee, V., and C. Witzgall
[1968] Facets and vertices of transportation polyhedra, in: Mathematics of the Decision Sciences, Part I, eds.
G.B. Dantzig and A.F. Veinott (American Mathematical Society, Providence, RI) pp. 257-282.
Koopmans, Tj.C.
[1948] Optimum utilization of the transportation system, in: The Econometric Society Meeting, Washington,
DC, 1947, ed. D.H. Leavens, pp. 136-146.




Larman, D.G.
[1970] Paths on polytopes, Proc. London Math. Soc. (3) 20, 161-178.
Lehman, A.
[1965] On the width-length inequality, Mimeographic notes. See: 1979, Math. Programming 17, 403-417.
Lenstra, A.K., H.W. Lenstra Jr and L. Lovász
[1982] Factoring polynomials with rational coefficients, Math. Ann. 261, 515-534.
Lovász, L.
[1972] Normal hypergraphs and the perfect graph conjecture, Discrete Math. 2, 253-267.
[1976] On two minimax theorems in graphs, J. Combin. Theory B 21, 96-103.
[1977] Certain duality principles in integer programming, Ann. Discrete Math. 1, 363-374.
[1979] Graph theory and integer programming, Ann. Discrete Math. 4, 141-158.
Lovász, L., and M.D. Plummer
[1986] Matching Theory, Ann. Discrete Math. 29.
Lovász, L., and A. Schrijver
[1991] Cones of matrices and setfunctions and 0,1 optimization, SIAM J. Optim. 1, 166-190.
Lucchesi, C.L., and D.H. Younger
[1978] A minimax relation for directed graphs, J. London Math. Soc. (2) 17, 369-374.
Maurras, J.F.
[1975] Some results on the convex hull of Hamiltonian cycles of symmetric complete graphs, in:
Combinatorial Programming: Methods and Applications (Reidel, Dordrecht) pp. 179-190.
Meyer, R.R.
[1974] On the existence of optimal solutions to integer and mixed-integer programming problems, Math.
Programming 7, 223-235.
Miliotis, P.
[1978] Using cutting planes to solve the symmetric travelling salesman problem, Math. Programming 15,
177-188.
Minkowski, H.
[1896] Geometrie der Zahlen, Erste Lieferung (Teubner, Leipzig).
Minty, G.J.
[1980] On maximal independent sets of vertices in claw-free graphs, J. Combin. Theory B 28, 284-304.
Motzkin, T.S.
[1936] Beiträge zur Theorie der linearen Ungleichungen, Inaugural Dissertation, Basel (Azriel, Jerusalem).
Naddef, D.
[1989] The Hirsch conjecture is true for (0,1)-polytopes, Math. Programming 45, 109-110.
Nemhauser, G.L., and L.E. Trotter Jr
[1974] Properties of vertex packing and independence system polyhedra, Math. Programming 6, 48-61.
[1975] Vertex packings: structural properties and algorithms, Math. Programming 8, 232-248.
Padberg, M.W.
[1973] On the facial structure of set packing polyhedra, Math. Programming 5, 199-215.
[1974] Perfect zero-one matrices, Math. Programming 6, 180-196.
[1977] On the complexity of set packing polyhedra, Ann. Discrete Math. 1, 421-434.
[1979] Covering, packing and knapsack problems, Ann. Discrete Math. 4, 265-287.
Padberg, M.W., and M. Grötschel
[1985] Polyhedral computations, in: The Traveling Salesman Problem, A Guided Tour through
Combinatorial Optimization, eds. E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan and D.B. Shmoys
(Wiley, Chichester) pp. 307-360.
Padberg, M.W., and S. Hong
[1980] On the symmetric travelling salesman problem: a computational study, Math. Programming Stud. 12,
78-107.
Padberg, M.W., and M.R. Rao
[1974] The travelling salesman problem and a class of polyhedra of diameter two, Math. Programming 7,
32-45.




[1980] The Russian Method and Integer Programming, GBA Working Paper (New York University, New
York).
[1982] Odd minimum cut-sets and b-matchings, Math. Oper. Res. 7, 67-80.
Papadimitriou, C.H.
[1978] The adjacency relation on the traveling salesman polytope is NP-complete, Math. Programming 14,
312-324.
[1984] Polytopes and complexity, in: Progress in Combinatorial Optimization, ed. W.R. Pulleyblank
(Academic Press, Toronto) pp. 295-305.
Papadimitriou, C.H., and K. Steiglitz
[1982] Combinatorial Optimization: Algorithms and Complexity (Prentice-Hall, Englewood Cliffs, NJ).
Papadimitriou, C.H., and M. Yannakakis
[1982] The complexity of facets (and some facets of complexity), in: Proc. 14th Annu. ACM Symp. on
Theory of Computing (ACM, New York) pp. 255-260.
Pulleyblank, W.R.
[1983] Polyhedral combinatorics, in: Mathematical Programming - The State of the Art, eds. A. Bachem,
M. Grötschel and B. Korte (Springer, Berlin) pp. 312-345.
Rao, M.R.
[1976] Adjacency of the traveling salesman tours and 0-1 vertices, SIAM J. Appl. Math. 30, 191-198.
Roohy-Laleh, E.
[1981] Improvements to the theoretical efficiency of the network simplex method, Ph.D. Thesis (Carleton
University, Ottawa).
Saigal, R.
[1969] A proof of the Hirsch conjecture on the polyhedron of the shortest route problem, SIAM J. Appl.
Math. 17, 1232-1238.
Sbihi, N.
[1978] Etude des stables dans les graphes sans étoile, M.Sc. Thesis (Université Sci. et Méd. de Grenoble,
Grenoble).
[1980] Algorithme de recherche d'un stable de cardinalité maximum dans un graphe sans étoile, Discrete
Math. 29, 53-76.
Schrijver, A.
[1980a] On cutting planes, Ann. Discrete Math. 9, 291-296.
[1980b] A counterexample to a conjecture of Edmonds and Giles, Discrete Math. 33, 213-214.
[1981] On total dual integrality, Linear Algebra Appl. 38, 27-32.
[1982] Min-max relations for directed graphs, Ann. Discrete Math. 16, 261-280.
[1983a] Packing and covering of crossing families of cuts, J. Combin. Theory B 35, 104-128.
[1983b] Min-max results in combinatorial optimization, in: Mathematical Programming - The State of the
Art, eds. A. Bachem, M. Grötschel and B. Korte (Springer, Berlin) pp. 439-500.
[1986] Theory of Linear and Integer Programming (Wiley, Chichester).
Seymour, P.D.
[1977] The matroids with the max-flow min-cut property, J. Combin. Theory B 23, 189-222.
[1979] On multicolourings of cubic graphs and conjectures of Fulkerson and Tutte, Proc. London Math.
Soc. (3) 38, 423-460. 
[1980] Decomposition of regular matroids, 4 Combin. Theory B 28, 305 359. 
Shor, N.Z. 
[1970a] Utilization of the operation of space dilatation in the minimization of convex functions, 
Cybernetics 6, 7-15. 
{1970b] Convergence rate of the gradient decent method with dilatation of the space, Cybernetics 6, 102-108. 
(1977] Cut-olf method with space extension in convex programming problems, Cybernetics 13, 94-96. 
Steinitz, E. 
(1916] Bedingt konvergente Reihen und konvexe Systeme (Schluss), J. Reine Angew. Math. 146, 1-52. 
Stoer, |. and C. Witzgall 
(1970) 9 Cunvexity and Optimization in Finite Dimensions 1 (Springer, Berlin). 


1704 A. Schrijver 


Trotter Jr, L.E. 

{1975] A class of facet producing graphs for vertex packing polyhedra, Discrete Math, 12, 373-388. 
Truemper, K. 

[1982] On the efficiency of representability tests for matroids, European J. Combin. 3, 275-291. 


[1990] A decomposition theorem for matroids. V. Testing of matrix total unimodularity, J. Combin, Theory 
B 49, 241--281. 


Weyl, H. 
{1935] Elementare Theorie der konvexen Polyeder, Comm. Math. Helv. 7, 290-306. 
Wolsey, L.A. 
{1975] Faces for a linear inequality in 0 -1 variables, Math. Programming 8, 165-178. 
{1976a] Facets and strong valid inequalities for integer programs, Oper. Res. 24, 367-372. 
[1976b] Further facet generating procedures for vortex packing polytopes, Math. Programming 11, 158-163. 
{1977] Valid inequalities and superadditivity for 0-1 integer programs, Math. Oper. Res. 2, 66-77. 
Yudin, D.B., and A.S. Nemirovskii 
{1976/1977] Evaluation of the informational complexity of mathematical programming problems, Matekon 
13(2), 3-25. 
[1977] Informational complexity and efficient methods for the solution of convex extremal problems, 
Matekon 13(3), 25-45. 
Zemel, E. 
[1978] Lifting the facets of zero-one polytopes, Math. Programming 15, 268-277. 
Zhu, Y.-J. 
{1963] Maximum number of iterations in the dual algorithm of the Kantorovic- Hitchcock problem in linear 
programming, Chinese Math. 3, 307-313. 


CHAPTER 31 


Tools from Linear Algebra 


C.D. GODSIL 


Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Ont. N2L 3G1,
Canada


with an Appendix by 


L. LOVÁSZ


Department of Computer Science, Yale University, New Haven, CT 06250, USA 


Contents

1. Introduction
2. The rank argument
3. Designs and codes
4. Null designs
5. Walks in graphs
6. Eigenvalue methods
7. Appendix: Random walks, eigenvalues, and resistance (L. Lovász)
References


HANDBOOK OF COMBINATORICS 
Edited by R. Graham, M. Grötschel and L. Lovász
© 1995 Elsevier Science BV. All rights reserved 




1. Introduction 


Linear algebra provides an important collection of tools for the working combinatorialist.
These have often been used to obtain the first, the most elegant, or the only proof of many
significant results. Before I compiled this survey, my opinion was that this area consisted
of a large collection of disparate “tricks”. I have since come around to the view that there
is a small set of basic principles, perhaps not easily formalised, that underlie most of the
combinatorial applications of linear algebra.

In writing this survey I have made no attempt to be exhaustive; indeed I should
apologise in advance to each of my readers for leaving out their favourite example.
The references provided are also far from complete, but should at least form a
reasonable starting point for those wishing to learn more.

The reader is hereby warned that, unless explicitly mentioned otherwise, all
ranks, dimensions etc. are over the rationals. The letters $I$ and $J$ will denote the
identity matrix and the “all-ones” matrix respectively. Their order will be determined
by the context.


2. The rank argument 


The best known application of linear algebra to combinatorics is the now standard
proof of Fisher's inequality, namely that in any non-trivial 2-design the number
of blocks is at least as large as the number of points. This seems a good place
for us to begin. We first need to set up some notation. A hypergraph $H = (V, E)$
consists of a vertex set $V$ and a collection $E$ of subsets of $V$, which we call edges.
We call $H$ simple if there are no repeated edges and we say it is $k$-uniform, or just
uniform, if each edge contains exactly $k$ vertices. If each vertex of $H$ lies in exactly
$r$ edges then $H$ is $r$-regular, or simply regular. A simple 2-uniform hypergraph is
better known as a “graph”.

A $t$-design is a uniform hypergraph with the property that every subset of $t$
vertices is contained in exactly $\lambda$ edges, for some constant $\lambda$. Thus a 1-design is
a $\lambda$-regular uniform hypergraph. It is well known and simple to prove that any
$t$-design is also an $s$-design, for all $s$ less than or equal to $t$. A design is trivial
if each edge contains all the vertices. For further background see A. Brouwer’s
chapter on designs in this Handbook. Fisher’s inequality (Fisher 1940) asserts, in
our notation, that every non-trivial 2-design has at least as many edges as vertices.
To prove this using linear algebra requires the use of incidence matrices, and
consequently another definition.

The incidence matrix $B = B(H)$ of a hypergraph is the 01-matrix with rows
indexed by the vertices of $H$, columns indexed by the edges, and with $(B)_{ij} = 1$
if and only if vertex $i$ is contained in edge $j$. The rank of $B$ cannot be greater
than either the number of rows or the number of columns of $B$. Thus we have the
following.


Principle 2.1. Let $H = (V, E)$ be a hypergraph with incidence matrix $B$. If the rows
of $B$ are linearly independent then $|V| \le |E|$.


This result is simultaneously too important, and too useful, to be termed a
theorem. There is one problem remaining though: it is still up to us to determine
the rank of the incidence matrix $B$. For an arbitrary large hypergraph this would
normally be every bit as difficult as proving that $|V| \le |E|$ by any other means.
What saves us is that, in many interesting cases, the rank of $B(H)$ is more or less
obvious due, for example, to some regularity in the structure of $H$. Thus in the
case of 2-designs we find that the defining conditions imply that

$$BB^T = (r - \lambda)I + \lambda J, \tag{2.1}$$

where $r$ and $\lambda$ are as above. If the block size $k$ of the design is not equal to
$|V|$ then we must have $r > \lambda$. Hence the right-hand side of (2.1) is the sum of a
positive semi-definite matrix $\lambda J$ and a positive definite matrix $(r - \lambda)I$. It is therefore
positive definite and, with that, non-singular. Consequently the left-hand side of
(2.1) is non-singular, implying that the rank of $B$ is equal to the number of rows
in $B$. This proves Fisher's inequality. (Note that the use of positive-definiteness in
the above argument can be circumvented by explicitly computing the determinant
of $(r - \lambda)I + \lambda J$.)
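The rank argument can be checked by hand on a small case. The following is a minimal sketch, assuming the Fano plane (points 0..6, blocks the translates of {0, 1, 3} modulo 7) as the 2-design; the helper `rank` and all variable names are illustrative and not part of the text. It verifies the identity (2.1) entry by entry and confirms that the incidence matrix has full row rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank over the rationals, by Gaussian elimination on exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Assumed example: the Fano plane, a 2-design with r = 3 and lambda = 1.
points = range(7)
blocks = [{(j + d) % 7 for d in (0, 1, 3)} for j in range(7)]

# Incidence matrix B: rows indexed by points, columns indexed by blocks.
B = [[1 if p in blk else 0 for blk in blocks] for p in points]

r_deg, lam = 3, 1
BBt = [[sum(B[i][e] * B[j][e] for e in range(len(blocks))) for j in points]
       for i in points]
# Right-hand side of (2.1): (r - lambda) I + lambda J.
expected = [[(r_deg - lam) * (i == j) + lam for j in points] for i in points]

identity_holds = BBt == expected
rank_B = rank(B)
```

Here the rank equals the number of points, so Principle 2.1 gives 7 ≤ 7; the Fano plane meets Fisher's inequality with equality.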

We note further that, if $B$ has rank $|V|$ then it contains a $|V| \times |V|$ submatrix
with non-zero determinant. Given the definition of the determinant as a signed sum
of products of entries of a matrix, we deduce that there is an injection $\varphi : V \to E$
such that the edge $\varphi(i)$ contains the vertex $i$, for all vertices $i$ in $H$. This is a
strengthening of the bald statement that $v \le e$, where $v = |V|$ and $e = |E|$. If we
replace the non-zero elements of $B$ by distinct members from a set of algebraically
independent numbers, we obtain a “generic” incidence matrix for $H$. The existence of
an injection of the type described is equivalent to requiring that the rank of this
generic incidence matrix be equal to $|V|$. (For another, important, example of this
type of argument, see Stanley 1980.)

Fisher’s inequality can be generalised in many ways. If we weaken our definition
of 2-design by allowing the edges to contain differing numbers of vertices, we find
that $B$ satisfies the matrix equation

$$BB^T = \Delta + \lambda J, \tag{2.2}$$

where $\Delta$ is a diagonal matrix with non-negative entries. The diagonal entries of $\Delta$
will be positive if, for each pair of vertices in $H$, there is an edge containing one
but not the other. In this case the argument we used above still yields that $B$ has
rank equal to $|V|$, and hence that $v \le e$. (This result is due to Majindar 1962, and
De Caen and Gregory 1985 prove an even more general result using quadratic
forms.)

Another important generalisation of Fisher’s inequality arises if we introduce
automorphism groups. Suppose that $\Gamma$ is a group of automorphisms of our hypergraph
$H$. Then vertices and edges of $H$ are partitioned into orbits by $\Gamma$. If $H$ is
a 2-design or, more generally, if $B$ has rank $|V|$, then the number of edge orbits
of $\Gamma$ is at least as large as the number of vertex orbits. (If $\Gamma$ is the identity group
then this is just Fisher’s inequality again.) This claim can be proved as follows.
Let $C_1, \ldots, C_k$ denote the vertex orbits of $\Gamma$. Call two edges $\sigma$ and $\tau$ equivalent
if $|\sigma \cap C_i| = |\tau \cap C_i|$ for all $i$. Clearly any two edges in the same edge orbit of $\Gamma$
are equivalent. Let $P$ be the $k \times v$ matrix with $i$th row equal to the characteristic
vector of $C_i$ (viewed as a subset of $V(H)$). Then edges $\sigma$ and $\tau$ are equivalent if
and only if the corresponding columns of $PB$ are equal. Hence the number of edge
orbits of $\Gamma$ is at least as large as the rank of $PB$. If $B$ has rank $|V|$ then $x^T PB = 0$
if and only if $x^T P = 0$. As the rows of $P$ are linearly independent it follows that
$x^T P = 0$ if and only if $x = 0$, i.e., the rank of $PB$ is equal to the number of rows
of $P$. This proves our claim.

The argument used in the last paragraph is sufficiently important to be worth
formalising. Let $H$ be an arbitrary hypergraph, let $\pi$ be a partition of its vertex
set and let $\rho$ be a partition of its edge set. Define the characteristic matrix of a
partition to be the matrix with the $i$th row equal to the characteristic vector of the
$i$th cell (or component) of the partition. (Thus, a 01-matrix is the characteristic
matrix of a partition of its columns if and only if the sum of its rows is the vector
with all entries equal to 1.) Denote the characteristic matrices of $\pi$ and $\rho$ by $P$
and $R$ respectively. We call the pair $(\pi, \rho)$ of partitions equitable if:

(a) each edge in the $j$th cell of $\rho$ contains the same number of vertices from the
$i$th cell of $\pi$,

(b) each vertex in the $i$th cell of $\pi$ is contained in the same number of edges
from the $j$th cell of $\rho$.

We see that $(\pi, \rho)$ is an equitable partition of $H$ if and only if $(\rho, \pi)$ is an
equitable partition of the dual hypergraph. (This is obtained by swapping the roles
of the vertices and edges in $H$; its incidence matrix is the transpose of that of $H$.)


Lemma 2.2. Let $\pi$ and $\rho$ respectively be partitions of the vertices and edges of the
hypergraph $H$. Then $(\pi, \rho)$ is equitable if and only if there are matrices $\Phi$ and $\Psi$
such that $PB = \Phi R$ and $RB^T = \Psi P$.


Proof. This lemma is only a routine translation of the definition (into linear algebra
terms). □


If $\Phi$ and $\Psi$ exist as described then $\Phi RR^T = PBR^T$ and $\Psi PP^T = RB^T P^T$. Hence

$$\Phi RR^T = PP^T \Psi^T.$$

Thus $\Psi$ is determined by $\Phi$, and vice versa. Note that both $PP^T$ and $RR^T$ are
diagonal matrices. We call the matrix $\Phi$ the vertex quotient of $B$ with respect to
the given pair of partitions.


Lemma 2.3. Let $\Phi$ be a vertex quotient of the incidence matrix $B$ with respect to
the equitable pair of partitions $(\pi, \rho)$. If the rows of $B$ are linearly independent then
the rank of $\Phi$ is equal to the number of cells in $\pi$, and so the number of cells of $\pi$
is less than or equal to the number of cells of $\rho$.


Proof. We have

$$\operatorname{rank}(P) = \operatorname{rank}(PB) = \operatorname{rank}(\Phi R) = \operatorname{rank}(\Phi),$$

where the first and third equalities hold because the rows of $B$ and $R$ are linearly
independent, while the second equality follows from Lemma 2.2. □
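The quotient construction can be made concrete on a small example. The sketch below assumes the Fano plane with blocks given by the translates of {0, 1, 3} modulo 7, and takes as partitions the vertex and edge orbits of the cyclic group generated by x -> 2x (mod 7), which maps blocks to blocks. All names (`orbits`, `char_matrix`, `matmul`, `rank`) are illustrative helpers, not notation from the text. It builds P, R and a candidate quotient, then checks the relation PB = ΦR of Lemma 2.2 and the rank statement of Lemma 2.3.

```python
from fractions import Fraction

n = 7
# Assumed example: Fano plane, blocks are the translates of {0, 1, 3} modulo 7.
blocks = [frozenset((j + d) % n for d in (0, 1, 3)) for j in range(n)]

def sigma(x):
    """Multiplication by 2 modulo 7; this permutation maps blocks to blocks."""
    return (2 * x) % n

def orbits(items, act):
    """Orbits of the cyclic group generated by the permutation `act`."""
    seen, cells = set(), []
    for start in items:
        cell, cur = [], start
        while cur not in seen:
            seen.add(cur)
            cell.append(cur)
            cur = act(cur)
        if cell:
            cells.append(cell)
    return cells

def char_matrix(cells, universe):
    """Characteristic matrix: row i is the indicator vector of cell i."""
    return [[1 if u in cell else 0 for u in universe] for cell in cells]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rank(rows):
    """Rank over the rationals by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

vertex_cells = orbits(range(n), sigma)
edge_cells = orbits(blocks, lambda b: frozenset(sigma(x) for x in b))

B = [[1 if v in e else 0 for e in blocks] for v in range(n)]
P = char_matrix(vertex_cells, list(range(n)))
R = char_matrix(edge_cells, blocks)

# Candidate quotient: Phi[i][j] = number of vertices of cell i in the first edge of cell j.
Phi = [[len(set(vc) & ec[0]) for ec in edge_cells] for vc in vertex_cells]

equitable = matmul(P, B) == matmul(Phi, R)   # condition (a), i.e. PB = Phi R
rank_Phi = rank(Phi)
```

In this example there are three vertex orbits and three edge orbits, and the 3 × 3 quotient has full rank, as Lemma 2.3 predicts.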


Note that Lemma 2.3 is actually a generalisation of Principle 2.1, which we
can recover by taking $\pi$ and $\rho$ to be the partitions with all cells singletons. One
important consequence of this lemma is the fact that the number of point orbits of
a collineation group of a projective plane is always less than or equal to the number
of line orbits (Hughes and Piper 1973, Theorem 13.4). It is not difficult to extend
Lemma 2.3 to infinite structures. (See Cameron 1976.) The notion of quotient is
useful because it provides a means of arguing that a particular matrix $\Phi$ has rank
equal to the number of rows in it. (Namely, $\Phi$ has inherited this property from the
larger matrix $B$.) Thus quotients extend the applicability of the rank argument.
They will also play an important role in our section on eigenvalue methods. The
definitions above have been chosen with this later usage in mind as well.

I should also mention that it is often convenient to view $\Phi$ as a generalised
incidence matrix for the “quotient hypergraph” with the cells of $\pi$ and $\rho$ of $H$ as
its vertices and edges. (A cell of $\pi$ is incident with a cell of $\rho$ whenever some vertex
in the former is contained in some edge of the latter.) In the case when $\pi$ and $\rho$
are the vertex and edge orbits of a group of automorphisms of $H$, Lemma 2.3 is
well known and can be stated in a sharper form. See, e.g., Dembowski (1968, p.
22) and Stanley (1982, Lemma 9.1).

The next result is of fundamental importance, and underlies many combinatorial 
applications of linear algebra. 


Theorem 2.4. Let $\Omega$ be a set with cardinality $n$ and let $B$ be the incidence matrix
for the hypergraph $H$ with the $k$-sets of $\Omega$ as its vertices and the $l$-sets as its edges.
Then if $k \le \min\{l, n - l\}$, the rows of $B$ are linearly independent.


Here a $k$-set is incident with an $l$-set if it is contained in it. The earliest proof of this
known to the writer appears in Gottlieb (1966). Other proofs appear in Foody and
Hedayat (1977), Kantor (1972) and Graham et al. (1980). It can also be derived
by a quotient argument. For suppose that we have a non-zero vector $x$ such that
$x^T B = 0$. We may assume without loss that the first entry of $x$ is non-zero; in fact
we assume that it is equal to 1. Clearly $\mathrm{Sym}(n)$ acts as a group of automorphisms
of $H$. Let $\Gamma$ be the subgroup of $\mathrm{Sym}(n)$ fixing the first $k$-subset of $\Omega$. Thus $\Gamma$ is
isomorphic to $\mathrm{Sym}(k) \times \mathrm{Sym}(n - k)$. Let $P$ and $R$ respectively be the characteristic
matrices for the partitions determined by the orbits of $\Gamma$ on $k$- and $l$-subsets of $\Omega$.
Finally let $\Phi$ be the corresponding quotient of $B$. It is important to note that $\Phi$
is a triangular matrix of order $(k + 1) \times (k + 1)$ with non-zero diagonal entries. In
particular, it is invertible.

If $\gamma \in \Gamma$ let $x^\gamma$ be the vector with $(x^\gamma)_i = x_{i\gamma}$. We set

$$y = \frac{1}{|\Gamma|} \sum_{\gamma \in \Gamma} x^\gamma.$$

As $x_1 = 1$ and $1\gamma = 1$ for all elements $\gamma$ of $\Gamma$, it follows that $y \ne 0$. It is not too
hard to show that $(x^\gamma)^T B = 0$ if and only if $x^T B = 0$. This implies that $y^T B = 0$.
Now the entries of $y$ are constant on the orbits of $\Gamma$ and so there is a non-zero
vector $z$ such that $y^T = z^T P$. Then we have

$$0 = x^T B = y^T B = z^T PB = z^T \Phi R$$

and, since the rows of $R$ are linearly independent, this implies that $z^T \Phi = 0$. Since
$\Phi$ is invertible this implies that $z = 0$ and so we are forced to conclude that there
is no non-zero vector $x$ satisfying $x^T B = 0$, i.e., that the rows of $B$ are linearly
independent.
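For small parameters the statement of Theorem 2.4 can be confirmed by brute force. The following sketch is only an illustration, with `inclusion_matrix` and `rank` as hypothetical helper names: it builds the inclusion matrix of k-subsets versus l-subsets of an n-set and records, for every triple with k ≤ l ≤ n ≤ 6, whether the rows are linearly independent over the rationals.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank over the rationals by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def inclusion_matrix(n, k, l):
    """Rows: k-subsets of {0..n-1}; columns: l-subsets; entry 1 when row set ⊆ column set."""
    k_sets = list(combinations(range(n), k))
    l_sets = list(combinations(range(n), l))
    return [[1 if set(a) <= set(b) else 0 for b in l_sets] for a in k_sets]

# results[(n, k, l)] is True when the rows are linearly independent.
results = {}
for n in range(1, 7):
    for l in range(n + 1):
        for k in range(l + 1):
            B = inclusion_matrix(n, k, l)
            results[(n, k, l)] = rank(B) == len(B)

rows_independent_when_expected = all(
    ok for (n, k, l), ok in results.items() if k <= min(l, n - l)
)
```

The hypothesis cannot simply be dropped: for n = 4, k = 2, l = 3 there are six rows but only four columns, so the rows are necessarily dependent.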

We note some simple applications of Theorem 2.4. Suppose that $\Omega$ is the edge
set of a complete graph on $n$ vertices. Then a $k$-subset of $\Omega$ is a graph on $n$
vertices with $k$ edges. The symmetric group $\mathrm{Sym}(n)$ acts on the $\binom{n}{2}$ elements of
$\Omega$. The orbits of $k$-subsets correspond to the isomorphism classes of graphs on
$n$ vertices with $k$ edges. Since the incidence matrix for $k$- versus $l$-subsets of $\Omega$
has linearly independent rows, so does its quotient with respect to $\mathrm{Sym}(n)$. If $g_{n,k}$
denotes the number of isomorphism classes of graphs with $n$ vertices and $k$ edges,
it follows that $g_{n,k} \le g_{n,l}$ whenever $k \le \min\{l, \binom{n}{2} - l\}$. We deduce from this that
the sequence $g_{n,k}$, $k = 0, \ldots, \binom{n}{2}$, is unimodal. Perhaps a more significant application
is the following. Let $p_{kl}(n)$ denote the number of partitions of the integer $n$ into
at most $k$ parts, the largest of which is at most $l$.


Lemma 2.5. The sequence $p_{kl}(n)$, $n = 0, \ldots, kl$, is unimodal.


Proof. We can define the wreath product $\Gamma = \mathrm{Sym}(l) \wr \mathrm{Sym}(k)$ to be the group
acting on a $k \times l$ array $R$ of “squares” by permuting the $l$ squares in each row
independently, and by permuting the $k$ rows amongst themselves without changing
the order of the squares in the rows. (So the order of $\Gamma$ is $(l!)^k k!$.) Then $p_{kl}(n)$ is
the number of orbits under $\Gamma$ formed by the subsets of $n$ squares from $R$, i.e., it
counts the “$\Gamma$-isomorphism” classes of $n$-subsets of $R$. The lemma now follows as
above. □


Lemma 2.5 is quite important and has a quite interesting history. The details of
this, together with the above proof, will be found in Stanley (1982). The numbers
$p_{kl}(n)$ arise in a remarkable variety of situations, occurring in particular as the
coefficients in the expansion of the $q$-binomial coefficients $\binom{k+l}{k}_q$ in powers of $q$.
(For information on these see the chapter by Gessel and Stanley.) Although the
sequence they form is unimodal, it is not log-concave. This means that some of the
standard techniques for proving that a sequence is unimodal cannot be applied to
derive Lemma 2.5.
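Both properties are easy to observe numerically. The sketch below computes the number of partitions of n into at most k parts, each at most l, using a standard recurrence (split according to whether there are exactly k parts); the function names are illustrative, and the recurrence is a choice made here rather than something taken from the text. It then tests unimodality and log-concavity of the resulting sequences.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def p(k, l, n):
    """Number of partitions of n into at most k parts, each part at most l."""
    if n == 0:
        return 1
    if n < 0 or k == 0 or l == 0:
        return 0
    # Fewer than k parts, or exactly k parts (then subtract 1 from every part).
    return p(k - 1, l, n) + p(k, l - 1, n - k)

def box_sequence(k, l):
    """The sequence p_{kl}(n) for n = 0, ..., kl."""
    return [p(k, l, n) for n in range(k * l + 1)]

def is_unimodal(seq):
    """True when seq is weakly increasing and then weakly decreasing."""
    i = 0
    while i + 1 < len(seq) and seq[i] <= seq[i + 1]:
        i += 1
    while i + 1 < len(seq) and seq[i] >= seq[i + 1]:
        i += 1
    return i == len(seq) - 1

def is_log_concave(seq):
    return all(seq[i] ** 2 >= seq[i - 1] * seq[i + 1] for i in range(1, len(seq) - 1))

small_boxes = [(k, l) for k in range(1, 7) for l in range(1, 7)]
all_unimodal = all(is_unimodal(box_sequence(k, l)) for k, l in small_boxes)
# The terms sum to the ordinary binomial coefficient, the q = 1 value of the q-binomial.
sums_match = all(sum(box_sequence(k, l)) == comb(k + l, k) for k, l in small_boxes)
```

For k = l = 2 the sequence is 1, 1, 2, 1, 1, which is unimodal but fails log-concavity at its second term.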

Stanley (1985) has also used quotienting by orbits to re-derive Lovász’s proof
that any graph with more edges than its complement can be reconstructed from its
edge-deleted subgraphs. This will be discussed, along with some generalisations,
in the next section.

Recently, Wilson (1990) has determined the rank modulo $p$ of the incidence
matrix $B$ of $k$-sets versus $l$-sets. In this paper he also gives a diagonal form for $B$,
i.e., a diagonal matrix $D$ such that $B = EDF$ for suitable integral matrices $E$ and
$F$ with determinants equal to one. This is a very interesting result, but it seems
fair to say that we do not yet know what its combinatorial significance is.

The theory of posets is an area where linear algebra has been effectively applied,
and it would be remiss of us not to consider some examples. Let $P$ be a poset with
$n$ elements. The incidence matrix $B = B(P)$ is the 01-matrix with $ij$-entry equal
to 1 if and only if $i \le j$ in $P$. We can always assume that $B$ is an upper triangular
matrix; this is equivalent to the existence of linear extensions of $P$. Since $i \le i$,
each diagonal entry of $B$ is equal to 1, and so $B$ is an invertible matrix. This means
that there is little future in seeking to prove results about posets by rank arguments
based on $B$. In fact we are going to work with the inverse of $B$.

This brings us to the Möbius function of $P$. This is the function $\mu = \mu_P$ on
$P \times P$ defined by

$$\mu(i, j) = (B^{-1})_{ij}.$$

(For the basic theory of the Möbius function, expressed in a manner consistent
with our approach, see Lovász 1979a, chapter 2. Another convenient reference is
Aigner 1979.) Note that $\mu(i, i) = 1$ for all elements $i$ of $P$. It is also not difficult
to prove that $\mu_P(i, j) = 0$ unless $i \le j$ in $P$. However, the key fact that we need is
the following.


Lemma 2.6. Let $f$ be a function on the poset $P$. If the function $f^*$ is defined by

$$f^*(i) = \sum_{j \ge i} f(j)$$

then

$$f(i) = \sum_{j \ge i} \mu(i, j) f^*(j).$$


Proof. If we view $f$ and $f^*$ as column vectors with entries indexed by the elements
of $P$ then the first equation asserts that $f^* = Bf$. Hence $f = B^{-1} f^*$, which
is equivalent to the second equation. □
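The matrix point of view translates directly into a computation. The following sketch assumes, purely as an example, the poset of divisors of 12 ordered by divisibility; the helper `inverse` and the sample values of `f` are illustrative. It obtains the Möbius function by inverting the incidence matrix B exactly, and then checks the inversion formula of Lemma 2.6.

```python
from fractions import Fraction

def inverse(mat):
    """Exact inverse of a square rational matrix by Gauss-Jordan elimination."""
    n = len(mat)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

# Assumed example poset: divisors of 12 under divisibility, listed in a linear extension.
elements = [1, 2, 3, 4, 6, 12]

def leq(a, b):
    return b % a == 0

# Incidence matrix: B[i][j] = 1 iff elements[i] <= elements[j]; upper triangular here.
B = [[1 if leq(a, b) else 0 for b in elements] for a in elements]
mu = inverse(B)          # mu[i][j] is the Möbius function mu(elements[i], elements[j])

f = {1: 5, 2: -1, 3: 4, 4: 0, 6: 2, 12: 7}      # arbitrary sample values
f_star = {a: sum(f[b] for b in elements if leq(a, b)) for a in elements}
recovered = {
    a: sum(mu[i][j] * f_star[b] for j, b in enumerate(elements) if leq(a, b))
    for i, a in enumerate(elements)
}
```

For this poset the first row of the inverse reproduces the number-theoretic Möbius function on the divisors of 12.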


The theory of the Möbius function consists of an interesting mixture of combinatorics,
algebra and topology, and is very well developed. Explicit expressions for
$\mu_P$ are known for many important posets. We will be making use of the following
pretty result.


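Lemma 2.7 (Wilf 1968, Lindström 1969). Let $P$ be a lattice with $n$ elements, let $f$
be a function on $P$ and let $F$ be the $n \times n$ matrix such that $(F)_{ij} = f(i \vee j)$. Then

$$\det F = \prod_{i \in P} f^*(i), \quad \text{where } f^*(i) = \sum_{j \ge i} \mu(i, j) f(j).$$


Proof. Let $\Phi$ be the diagonal matrix with $i$th diagonal entry equal to $f^*(i)$. Then
it is easy to see that

$$(B \Phi B^T)_{ij} = \sum_{k \ge i,\ k \ge j} f^*(k) = \sum_{k \ge i \vee j} f^*(k)$$

and that, by Lemma 2.6, the last sum is equal to $f(i \vee j)$. Therefore $F = B \Phi B^T$ and

$$\det(F) = \det(\Phi) = \prod_{i \in P} f^*(i),$$

since $\det(B) = 1$. □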
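A determinant identity of this kind is easy to test on a small lattice. The sketch below assumes the lattice of divisors of 12 (join = least common multiple), an arbitrary test function f(i) = i² + 1, and a Möbius function computed from the standard recursion μ(a, a) = 1, μ(a, b) = −Σ_{a ≤ c < b} μ(a, c); the names `mobius`, `det`, `join` are illustrative. It compares det F with the product of the values f*(i) = Σ_{j ≥ i} μ(i, j) f(j).

```python
from fractions import Fraction
from math import gcd, prod

# Assumed example lattice: divisors of 12 under divisibility.
elements = [1, 2, 3, 4, 6, 12]

def leq(a, b):
    return b % a == 0

def join(a, b):
    """Join in the divisor lattice is the least common multiple."""
    return a * b // gcd(a, b)

def mobius(a, b):
    """Möbius function via mu(a,a) = 1 and mu(a,b) = -sum of mu(a,c) over a <= c < b."""
    if a == b:
        return 1
    if not leq(a, b):
        return 0
    return -sum(mobius(a, c) for c in elements if leq(a, c) and leq(c, b) and c != b)

def det(mat):
    """Exact determinant by Gaussian elimination on fractions."""
    m = [[Fraction(x) for x in row] for row in mat]
    n, d = len(m), Fraction(1)
    for c in range(n):
        pivot = next((i for i in range(c, n) if m[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for i in range(c + 1, n):
            factor = m[i][c] / m[c][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[c])]
    return d

def f(i):
    return i * i + 1          # arbitrary test function on the lattice

f_star = {i: sum(mobius(i, j) * f(j) for j in elements if leq(i, j)) for i in elements}
F = [[f(join(i, j)) for j in elements] for i in elements]

lhs = det(F)
rhs = prod(f_star.values())
```

Replacing `f` by any other integer-valued function should leave the equality `lhs == rhs` intact, which is a convenient way to experiment with the lemma.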
With the help of this [...]nants. For examples, [...] original paper of Wilf [...]tion
complexity, see the [...] to combinatorial optimisation [...] result shows yet another use.
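
Theorem 2.8. Let $P$ be a lattice such that $\mu(i, 1) \ne 0$ for all $i$ in $P$. Then there is
a permutation $\pi$ of the elements of $P$ such that $i \vee \pi(i) = 1$ for all elements $i$ of $P$.


Proof. Define a function $g$ on $P$ by

$$g(i) = \begin{cases} 1, & \text{if } i = 1, \\ 0, & \text{otherwise} \end{cases}$$

and let $G$ be the matrix with $ij$-entry equal to $g(i \vee j)$.
From Lemma 2.6 we see that $g^*(i) = \sum_{j \ge i} \mu(i, j) g(j) = \mu(i, 1)$, and from Lemma 2.7
we deduce that $\det G = \prod_{j \in P} \mu(j, 1)$. Our hypothesis concerning $\mu$ thus forces the
conclusion that $\det G \ne 0$. The assertion of the theorem now follows from the
definition of the determinant as a signed sum of products of elements of a matrix. □
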
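The permutations in question, called “complementing permutations” in the discussion that follows, pair each element i with an element whose join with i is the top of the lattice. As a small experiment, the sketch below searches for such a permutation in divisor lattices, using a standard augmenting-path bipartite matching; this search is a technique chosen here for illustration, not the determinant argument of the proof, and all function names are hypothetical. It also evaluates the Möbius hypothesis μ(d, top) ≠ 0 for comparison.

```python
from math import gcd

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def lcm(a, b):
    return a * b // gcd(a, b)

def mobius(a, b):
    """Möbius function of the divisor lattice via the defining recursion."""
    if a == b:
        return 1
    if b % a:
        return 0
    return -sum(mobius(a, c) for c in divisors(b) if c % a == 0 and c != b)

def complementing_permutation(n):
    """Permutation pi of the divisors of n with lcm(d, pi(d)) = n for all d, or None."""
    elems = divisors(n)
    used_by = {}                      # image e -> the element d with pi(d) = e

    def assign(d, seen):
        # Augmenting-path step: try to give d an image, re-routing earlier choices.
        for e in elems:
            if lcm(d, e) == n and e not in seen:
                seen.add(e)
                if e not in used_by or assign(used_by[e], seen):
                    used_by[e] = d
                    return True
        return False

    for d in elems:
        if not assign(d, set()):
            return None
    return {d: e for e, d in used_by.items()}

hypothesis_30 = all(mobius(d, 30) != 0 for d in divisors(30))
hypothesis_12 = all(mobius(d, 12) != 0 for d in divisors(12))
pi_30 = complementing_permutation(30)
pi_12 = complementing_permutation(12)
```

For 30 the hypothesis holds and a complementing permutation is found. For 12 the hypothesis fails (μ(1, 12) = 0) and no such permutation exists, since both 1 and 2 would need 12 as their image.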
Theorem 2.8 was first obtained by Dowling and Wilson (1975) using linear algebra,
but not Wilf’s lemma. (The above proof might even be new.) Many interesting
lattices have Möbius functions that satisfy the hypothesis of the theorem. In particular,
geometric lattices have this property and so have “complementing permutations”
as described. From this it follows very quickly that every finite geometric
lattice has at least as many hyperplanes as points. We cannot resist the following
remarks in this context. Let $H$ be a hypergraph with the property that any two
distinct vertices lie in exactly one edge. Then it can be shown that the vertices
and edges of $H$ form a geometric lattice. Consequently such a hypergraph has at
least as many edges as vertices. This indicates that there is a non-trivial connection
between Theorem 2.8 and Fisher’s inequality.

To complete this section we mention another important result in the theory of
posets which has been established using linear algebra, namely the proof of Ivan
Rival’s conjecture on modular lattices, by Kung (1985, 1987).


3. Designs and codes 


We introduce a framework which will enable us to derive some far-reaching
generalisations of Fisher’s inequality, and a number of other results. Our approach
follows the exposition in Godsil (1993, chapters 14-16).

A separation function $\rho$ on a set $\Omega$ is simply a function on $\Omega \times \Omega$ taking values
in some field, the reals unless otherwise notified. If $f$ is a real polynomial with
degree $r$ and $a \in \Omega$ then we call the mapping

$$x \mapsto f(\rho(a, x))$$

a zonal polynomial of degree at most $r$, and denote it by $f_a$. We inductively define
vector spaces $\mathrm{Pol}(\Omega, r)$ as follows. We set $\mathrm{Pol}(\Omega, 0)$ equal to the span of the
constant functions on $\Omega$ and $\mathrm{Pol}(\Omega, 1)$ equal to the span of the zonal polynomials
$f_a$, where $f$ ranges over all real polynomials of degree at most one and $a$ over the
points in $\Omega$. If $r > 1$ then $\mathrm{Pol}(\Omega, r)$ is the space spanned by

$$\{fg : f \in \mathrm{Pol}(\Omega, 1),\ g \in \mathrm{Pol}(\Omega, r - 1)\}.$$

We also define

$$\mathrm{Pol}(\Omega) = \bigcup_{r \ge 0} \mathrm{Pol}(\Omega, r).$$

We refer to the elements of $\mathrm{Pol}(\Omega)$ as polynomials on $\Omega$, and a polynomial which
lies in $\mathrm{Pol}(\Omega, r)$, but not in $\mathrm{Pol}(\Omega, r - 1)$, will be said to have degree $r$. Note that
if $f$ is a polynomial of degree $r$ on $\Omega$ and $g$ is a polynomial of degree $s$ then the
product $fg$ will be a polynomial of degree at most $r + s$. (Note also that $x^2 + y^2 + z^2$
is a polynomial of degree zero on the unit sphere in $\mathbb{R}^3$.)
A polynomial space consists of a set $\Omega$, a separation function $\rho$ on $\Omega$ and an
inner product $\langle \cdot, \cdot \rangle$ on $\mathrm{Pol}(\Omega)$ such that the following axioms hold:
(I) If $x, y \in \Omega$ then $\rho(x, y) = \rho(y, x)$.
(II) The dimension of $\mathrm{Pol}(\Omega, 1)$ is finite.
(III) $\langle f, g \rangle = \langle 1, fg \rangle$ for all $f$ and $g$ in $\mathrm{Pol}(\Omega)$.
(IV) If $f \in \mathrm{Pol}(\Omega)$ and $f(x) \ge 0$ for all $x$ in $\Omega$, then $\langle 1, f \rangle \ge 0$, with equality if
and only if $f = 0$.
These axioms are not very restrictive. Moreover, when $\Omega$ is finite, Axioms (II)
and (IV) are redundant. We now present a number of examples. In all of the cases
where $\Omega$ is finite the inner product is given by

$$\langle f, g \rangle = \frac{1}{|\Omega|} \sum_{x \in \Omega} f(x) g(x).$$

(a) The Johnson scheme $J(n, k)$.
Here $\Omega$ is the set of all $k$-subsets of a set of $n$ elements and $\rho(x, y) := |x \cap y|$. For
this scheme we will usually assume implicitly that $2k \le n$.

(b) The power set $2^n$.
In this case $\Omega$ is the power set of a finite set with $n$ elements, and $\rho(x, y) := |x \cap y|$
once again.

(c) The Hamming scheme $H(n, q)$.
Let $\Sigma$ be an alphabet of $q$ symbols $\{0, 1, \ldots, q - 1\}$. Define $\Omega$ to be the set $\Sigma^n$ of
all $n$-tuples of elements of $\Sigma$, and let $\rho(x, y)$ be the number of coordinate places
in which the $n$-tuples $x$ and $y$ agree. Thus $n - \rho(x, y)$ is the Hamming distance
between $x$ and $y$. (We note that $H(n, 2)$ and $2^n$ have the same underlying set,
but the functions $\rho$ are different.) We do not require $q$ to be a prime power. The
elements of $H(n, q)$ are usually called words over $\Sigma$.

(d) The symmetric group $\mathrm{Sym}(n)$.
We set $\Omega = \mathrm{Sym}(n)$. If $x$ and $y$ are elements of $\Omega$ then $\rho(x, y)$ is the number of
points left fixed by the permutation $x^{-1}y$. Note that we can view $\mathrm{Sym}(n)$ as a
subset of $H(n, n)$, and that the function $\rho$ on $\mathrm{Sym}(n)$ is then just the restriction of
the corresponding function in $H(n, n)$.

(e) The Grassmann scheme $J_q(n, k)$.
This time $\Omega$ is the set of all $k$-dimensional subspaces of an $n$-dimensional vector
space over a field with $q$ elements, and $\rho(U, V)$ is the number of 1-dimensional
subspaces of $U \cap V$.

(f) The unit sphere in $\mathbb{R}^n$.
The set $\Omega$ is formed by the unit vectors in $\mathbb{R}^n$ and $\rho(x, y)$ is the usual inner product
on $\mathbb{R}^n$. In this case the elements of $\mathrm{Pol}(\Omega)$ are precisely the polynomials in $n$
variables, restricted to the sphere. If $f$ and $g$ are two elements of $\mathrm{Pol}(\Omega)$ then
their inner product is

$$\langle f, g \rangle = \int f g \, d\mu,$$

where $\mu$ is the usual measure on the sphere in $\mathbb{R}^n$, normalised so that the sphere
has measure 1.

(g) Perfect matchings in $K_{2n}$.
If $x$ and $y$ are perfect matchings in $K_{2n}$, then $\rho(x, y)$ is the number of edges they
have in common.
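Several of these separation functions are one-liners in code. The sketch below is only illustrative: the function names are hypothetical, sets stand for elements of the Johnson scheme or power set, tuples for words, and a permutation is represented by its tuple of images. It also checks the remark in example (d) that the separation function on Sym(n) is the restriction of the one on H(n, n).

```python
from itertools import permutations

def rho_johnson(x, y):
    """Johnson scheme and power set: size of the intersection."""
    return len(set(x) & set(y))

def rho_hamming(x, y):
    """Hamming scheme: number of coordinate places in which the words agree."""
    return sum(a == b for a, b in zip(x, y))

def rho_sym(x, y):
    """Sym(n): number of fixed points of x^{-1} y (permutations as tuples of images)."""
    n = len(x)
    x_inv = [0] * n
    for i, xi in enumerate(x):
        x_inv[xi] = i
    return sum(x_inv[y[i]] == i for i in range(n))

# Viewing permutations of {0,1,2,3} as words of length 4 over a 4-letter alphabet.
perms = list(permutations(range(4)))
restriction_agrees = all(
    rho_sym(x, y) == rho_hamming(x, y) for x in perms for y in perms
)

u, w = (0, 1, 1, 0), (1, 1, 0, 0)
hamming_distance = len(u) - rho_hamming(u, w)
```

Working with agreement counts rather than distances keeps all the finite examples in the same form, a separation function evaluated on pairs.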


Let $(\Omega, \rho)$ be a polynomial space. If $\Phi$ is a finite subset of $\Omega$ and $f$ and $g$ are
polynomials on $\Omega$ then we define

$$\langle f, g \rangle_\Phi = \frac{1}{|\Phi|} \sum_{x \in \Phi} f(x) g(x).$$

We call $\Phi$ a $t$-design in $\Omega$ if

$$\langle 1, f \rangle_\Phi = \langle 1, f \rangle$$

for all $f$ in $\mathrm{Pol}(\Omega, t)$. A $t$-design in the Johnson scheme is a simple $t$-design, as
defined in section 2. A $t$-design in the Hamming scheme is the same thing as a
“simple” orthogonal array. (These claims are not trivial; a proof of the first and
an outline of a proof of the second can be found in Godsil 1988.) A $t$-design $\Phi$
in the power set of $X$ can be shown to be a collection of subsets of $X$ such that,
for all $s \le t$, each set of $s$ points lies in the same number of elements of $\Phi$. For
the unit sphere, our definition of a $t$-design is the usual one. (Delsarte et al. 1977
study $t$-designs on the unit sphere at some length.)
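The averaging definition can be tested directly in a small Johnson scheme. The sketch below assumes J(7, 3) with the Fano plane (translates of {0, 1, 3} modulo 7) as the candidate design. It relies on the observation that Pol(Ω, d) is spanned by products of at most d functions x -> |a ∩ x|, so it suffices to compare averages of such products; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement

points = range(7)
omega = [frozenset(c) for c in combinations(points, 3)]           # all 3-subsets: J(7, 3)
fano = [frozenset((j + d) % 7 for d in (0, 1, 3)) for j in range(7)]

def average(family, centres):
    """Average over `family` of x -> product of |a ∩ x| for a in `centres`."""
    total = 0
    for x in family:
        term = 1
        for a in centres:
            term *= len(a & x)
        total += term
    return Fraction(total, len(family))

def is_design_up_to_degree(family, degree):
    """Compare <1, f>_family with <1, f> on a spanning set of Pol(omega, degree)."""
    for d in range(degree + 1):
        for centres in combinations_with_replacement(omega, d):
            if average(family, centres) != average(omega, centres):
                return False
    return True

fano_is_2_design = is_design_up_to_degree(fano, 2)

# A single cubic witness showing that the Fano plane is not a 3-design.
cube = (fano[0],) * 3
avg_on_fano = average(fano, cube)
avg_on_omega = average(omega, cube)
```

The degree-2 averages agree, while the cubic witness separates the Fano plane from the full scheme, consistent with the Fano plane being a 2-design but not a 3-design.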

These examples show that $t$-designs in polynomial spaces are objects of interest,
and indicate the importance of the following result.

Theorem 3.1. Let $(\Omega, \rho)$ be a polynomial space. If $\Phi$ is a $t$-design in $\Omega$ then
$|\Phi| \ge \dim(\mathrm{Pol}(\Omega, \lfloor t/2 \rfloor))$.


Proof. Let $\delta_{ij}$ be the Kronecker delta function and let $h_1, \ldots, h_n$ be an orthonormal
basis for $\mathrm{Pol}(\Omega, \lfloor t/2 \rfloor)$. (Such a basis can always be found by Gram-Schmidt
orthogonalisation.) Then

$$\langle h_i, h_j \rangle_\Phi = \langle 1, h_i h_j \rangle_\Phi = \langle 1, h_i h_j \rangle = \langle h_i, h_j \rangle = \delta_{ij}.$$

(The middle equality holds because $h_i h_j$ has degree at most $t$ and $\Phi$ is a $t$-design.)
Therefore the restrictions to $\Phi$ of the polynomials $h_i$ form a linearly independent
set of functions on $\Phi$. Since the vector space of all functions on $\Phi$ has dimension
$|\Phi|$, it follows that $n \le |\Phi|$. □


For this result to be useful, we need to know the dimensions of the spaces
$\mathrm{Pol}(\Omega, r)$. This is a non-trivial task, but the answer is known in many cases.
(Again, see Godsil 1993 for the details.) For the Johnson scheme $J(n, k)$ we have
$\dim(\mathrm{Pol}(\Omega, r)) = \binom{n}{r}$ when $r \le k$.


Corollary 3.2 (Ray-Chaudhuri and Wilson 1975). Let $\Phi$ be a $2s$-design formed
from the $k$-subsets of an $n$-set, with $2k \le n$. Then $\Phi$ contains at least $\binom{n}{s}$ blocks.


If $(\Omega, \rho)$ is the Hamming scheme $H(n, q)$ then $\dim(\mathrm{Pol}(\Omega, r))$ is equal to

$$\sum_{i \le r} (q - 1)^i \binom{n}{i}.$$


Corollary 3.3. Let $\Phi$ be an orthogonal array of strength $2s$ in the Hamming scheme
$H(n, q)$. Then

$$|\Phi| \ge \sum_{i \le s} (q - 1)^i \binom{n}{i}.$$


The dimension of $\mathrm{Pol}(\Omega, r)$ is the same for the power set of an $n$-set as
it is for the Hamming scheme $H(n, 2)$. For the $q$-Johnson scheme $\mathrm{Pol}(\Omega, r)$ has
dimension $\binom{n}{r}_q$ and for the unit sphere in $\mathbb{R}^n$ it has dimension $\binom{n+r-1}{r} + \binom{n+r-2}{r-1}$.
(This lower bound on the size of a spherical $t$-design was derived by Delsarte et
al. 1977.)
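The dimension formulas quoted above turn Theorem 3.1 into explicit numerical bounds. The sketch below simply evaluates them; the function names are illustrative, and the Gaussian binomial is computed from its standard product formula, which is an assumption about notation rather than something spelled out in the text.

```python
from math import comb

def dim_johnson(n, r):
    """dim Pol(Omega, r) for the Johnson scheme J(n, k), valid for r <= k."""
    return comb(n, r)

def dim_hamming(n, q, r):
    """dim Pol(Omega, r) for the Hamming scheme H(n, q)."""
    return sum((q - 1) ** i * comb(n, i) for i in range(r + 1))

def gaussian_binomial(n, r, q):
    """q-binomial coefficient via prod (q^(n-i) - 1) / (q^(i+1) - 1), i = 0..r-1."""
    num = den = 1
    for i in range(r):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def dim_sphere(n, r):
    """dim Pol(Omega, r) for the unit sphere in R^n."""
    return comb(n + r - 1, r) + (comb(n + r - 2, r - 1) if r >= 1 else 0)

def design_lower_bound_johnson(n, s):
    """Lower bound of Corollary 3.2 on the number of blocks of a 2s-design."""
    return dim_johnson(n, s)

def orthogonal_array_lower_bound(n, q, s):
    """Lower bound of Corollary 3.3 for an orthogonal array of strength 2s."""
    return dim_hamming(n, q, s)
```

For instance, a 2-design on 7 points needs at least `design_lower_bound_johnson(7, 1)` = 7 blocks, a bound met by the Fano plane, which is therefore tight in the sense defined next.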

A $2s$-design realising the bound of Theorem 3.1 is called a tight design. A tight
2-design in the Johnson scheme is better known as a symmetric design; such designs
may be said to be rather plentiful. On the other hand it has been proved (Bannai
1977) that if $t = 2s \ge 10$ then there are only finitely many tight $t$-designs in the
Johnson scheme. There is also a close connection with the theory of association
schemes; we will discuss this briefly following Corollary 3.9.

Our definition of a design in a polynomial space can be extended. A weighted
$t$-design on a polynomial space $(\Omega, \rho)$ is a non-negative function $\varphi$ with finite
support, $S$ say, such that

$$\sum_{x \in S} \varphi(x) f(x) = \langle 1, f \rangle$$

for all polynomials $f$ in $\mathrm{Pol}(\Omega, t)$. For example, if $\Phi$ is a $t$-design we might take
$\varphi$ to be the function equal to $1/|\Phi|$ on the elements of $\Phi$ and zero elsewhere. A
weighted design in the Johnson scheme is equivalent to a design in the usual sense
of the word, with repeated blocks permitted. Theorem 3.1 can be easily extended to
show that, if $S$ is the support of a weighted $t$-design, then $|S| \ge \dim(\mathrm{Pol}(\Omega, \lfloor t/2 \rfloor))$.
It can also be shown, under fairly general conditions, that a polynomial space
contains weighted $t$-designs supported by at most $\dim(\mathrm{Pol}(\Omega, t))$ points. (See Godsil
1988, 1993.) We give a simple and direct proof of this fact for the Johnson scheme.


Lemma 3.4. For any integers $t$, $k$ and $v$ with $t \le k \le v-k$, there is a $k$-uniform
hypergraph $H$ with at most $\binom{v}{t}$ edges that is the support of a weighted $t$-design.


Proof. Let $X$ be a fixed set of $v$ elements, and let $B_{t,k}$ be the 01-matrix with rows
indexed by the $t$-subsets of $X$, columns indexed by the $k$-subsets and with $ij$-entry
equal to 1 if and only if the $i$th $t$-subset is contained in the $j$th $k$-subset. A weighted
$t$-design corresponds to a column vector $x$ of length $\binom{v}{k}$ with non-negative entries
such that

$$B_{t,k}\,x = j, \qquad (3.1)$$

where $j$ denotes the all-ones vector. We know that (3.1) does have non-negative
solutions: $\binom{v-t}{k-t}^{-1} j$, for example.
Hence, by standard results in the theory of linear programming, (3.1) has non-negative
basic solutions, i.e., solution vectors supported by linearly independent
sets of columns of $B = B_{t,k}$. Such a set of columns has cardinality at most $\binom{v}{t}$, since
this is the number of rows of $B$. □
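
To make the matrix in this proof concrete, here is a minimal Python sketch that builds the inclusion matrix of $t$-subsets versus $k$-subsets and checks that every row sum equals $\binom{v-t}{k-t}$, so that a suitably scaled constant vector is a non-negative solution of the linear system. The function name `inclusion_matrix` and the parameters $v=6$, $t=2$, $k=3$ are illustrative choices, not part of the source.

```python
from itertools import combinations
from math import comb


def inclusion_matrix(v, t, k):
    """Rows: t-subsets of {0..v-1}; columns: k-subsets; entry 1 iff row set is in column set."""
    t_sets = [frozenset(c) for c in combinations(range(v), t)]
    k_sets = [frozenset(c) for c in combinations(range(v), k)]
    return [[1 if a <= b else 0 for b in k_sets] for a in t_sets]


B = inclusion_matrix(6, 2, 3)
# Each t-subset lies in the same number of k-subsets, so B times the
# all-ones vector is a constant vector.
row_sums = {sum(row) for row in B}
```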


Here we should also mention Wilson’s well-known proof that weighted $t$-$(v,k,\lambda)$
designs exist whenever the obvious divisibility conditions are satisfied (Wilson 
1973), which also starts with eq. (3.1). 

There is another lower bound on the size of a t-design which, despite its simple 
proof, is very useful. 


Theorem 3.5. Let $\Phi$ be a $t$-design in the polynomial space $(\Omega,\rho)$. Then, for any
polynomial $p$ of degree at most $t$ which is non-negative on $\Phi$ and any point $a$ in $\Phi$,

$$|\Phi| \ge \frac{p(a)}{\langle 1, p \rangle},$$

and equality holds if and only if $p$ vanishes on $\Phi \setminus a$.


Proof. Let $\varphi$ be a weighted $t$-design and let $a$ be a point in its support. Suppose
that $p$ is a polynomial of degree at most $t$ on $\Omega$, and that $p$ is non-negative on the
support of $\varphi$. Then

$$\varphi(a)p(a) \le \sum_{x:\,\varphi(x) \ne 0} \varphi(x)p(x) = \langle 1, p \rangle,$$

from which our bound follows immediately. □

Theorem 3.5 is a form of Delsarte’s linear programming bound. (See, e.g., Delsarte
et al. 1977.) The name arises because this theorem suggests the following
optimization problem: choose $p$ in $\mathrm{Pol}(\Omega,t)$ non-negative on $\Phi$ so that $p(a)/\langle 1, p \rangle$
is maximal. This is easily expressed as a linear programming problem.

Let $A$ be a set of real numbers. (In all cases of interest, it will be finite.) An
$A$-code in a polynomial space $(\Omega,\rho)$ is a subset $\Phi$ such that

$$\{\rho(x,y) : x, y \in \Phi,\ x \ne y\} \subseteq A.$$


We will also refer simply to codes when the set $A$ is determined by the context,
or is not important. We say $\Phi$ has degree $d$ if it is an $A$-code for some set $A$
of cardinality $d$. Many interesting problems in combinatorics are equivalent to
questions concerning the maximum cardinality of $A$-codes. We have a general
upper bound on the cardinality of codes, but to state this we require another
definition. Suppose $\rho$ is a separation function on a set $\Omega$ and $\Phi \subseteq \Omega$. We say $\Phi$ is
orderable if there is a linear ordering “<” such that, whenever $a \in \Phi$,

$$\rho(a,a) \notin \{\rho(a,x) : x < a\}.$$


If $\Phi$ is an orderable subset then so is any subset of it. In all the examples of polynomial
spaces we listed, $\Omega$ itself was orderable. The following result is therefore
significant.


Theorem 3.6. Let $\rho$ be a separation function on the set $\Omega$ and let $\Phi$ be an orderable
subset of $\Omega$ with degree $s$. Then

$$|\Phi| \le \dim(\mathrm{Pol}(\Omega,s)).$$


Proof (We only give an outline, see Godsil 1993, Theorem 14.4.1, for more details).
For each $a$ in $\Phi$ let $A(a)$ be the set

$$\{\rho(a,x) : x \in \Phi,\ \rho(x,x) \le \rho(a,a),\ x \ne a\}$$

and let $F_a$ be the polynomial on $\Omega$ defined by

$$F_a(x) = \prod_{\lambda \in A(a)} (\rho(a,x) - \lambda).$$

Then $F_a(b) = 0$ if $b < a$ and $F_a(a) \ne 0$. Using this it is not difficult to show that
the functions $F_a$ are linearly independent. Since they also all lie in $\mathrm{Pol}(\Omega,s)$, the
result follows.


The basic technique used in proving Theorem 3.6 is due to Koornwinder (1976). 
We now list some of the consequences of Theorem 3.6. A set of degree s in the 
unit sphere is usually called an s-distance set. 


Corollary 3.7 (Delsarte et al. 1977). If $\Phi$ is an $s$-distance subset of the unit sphere
in $\mathbb{R}^n$ then $|\Phi| \le \binom{n+s-1}{s} + \binom{n+s-2}{s-1}$.


Corollary 3.8 (Ray-Chaudhuri and Wilson 1975). Let $H$ be a $k$-uniform hypergraph
on $v$ vertices and let $A$ be a set of positive integers with $|A| = d$. Then if
$H$ is an $A$-code, $|E(H)| \le \binom{v}{d}$.


Corollary 3.9 (Frankl and Wilson 1981). Let $H$ be a $k$-uniform hypergraph on $v$
vertices and let $A$ be a set of positive integers. Suppose that $A$ has $d'$ distinct elements
modulo the prime $p$, and none of these is congruent to $k$ modulo $p$. Then if $H$ is an
$A$-code, $|E(H)| \le \binom{v}{d'}$.


Corollary 3.10 (Frankl and Wilson 1981). Let $\mathcal{F}$ be a subset of the power set of $X$,
where $|X| = n$. If $\mathcal{F}$ has degree $s$ then $|\mathcal{F}| \le \sum_{i \le s} \binom{n}{i}$.
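
As a small sanity check of the bound for uniform hypergraphs, the sketch below computes the set of pairwise intersection sizes of a family and compares its size with $\binom{v}{d}$. The example family (the seven lines of the Fano plane, with points labelled 0 to 6) and the helper name `intersection_sizes` are our own illustrative choices.

```python
from itertools import combinations
from math import comb

# Lines of the Fano plane: a 3-uniform hypergraph on 7 points.
FANO = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]


def intersection_sizes(edges):
    """Set of sizes |a & b| over all pairs of distinct edges."""
    return {len(a & b) for a, b in combinations(edges, 2)}


sizes = intersection_sizes(FANO)   # the degree of the code is len(sizes)
bound = comb(7, len(sizes))        # Ray-Chaudhuri and Wilson bound C(v, d)
```

Here any two lines meet in exactly one point, so the degree is 1 and the seven lines meet the bound $\binom{7}{1}$ with equality.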


More information about the above results will be found in chapters 14 and 24. 
The paper by Frankl and Wilson (1981) contains many significant results, one of 
which was recently used in Kahn and Kalai (1992) to disprove Borsuk’s conjecture. 
(This asserted that a set of diameter one in $\mathbb{R}^d$ could always be partitioned into
$d+1$ sets of diameter smaller than one. Kahn and Kalai show that at least $(1.1)^{\sqrt{d}}$
such sets may be required.) Many of the polynomial spaces we have mentioned
are association schemes. Delsarte (1973) showed how to define designs and codes 
in association schemes; where these concepts overlap ours, they agree. Further 
information will be found in chapter 15. 

A number of interesting results of coding type have been proved using exterior 
algebra. The basic example is the following, which is a slight extension of a result 
due to Bollobás (1965). The version stated, and its proof, are due to Lovász (1977).


Theorem 3.11. Suppose that $A_1,\dots,A_m$ are $r$-element subsets of a set $X$, and
$B_1,\dots,B_m$ are $s$-element subsets of $X$. If $A_i \cap B_i = \emptyset$ for all $i$ and $A_i \cap B_j \ne \emptyset$
whenever $i < j$, then $m \le \binom{r+s}{r}$.


Proof. Let $f$ be a mapping from $X$ into $V = \mathbb{R}^{r+s}$ such that the image of any set of
$r+s$ distinct points from $X$ is linearly independent. (We could assume that $f$ maps
each element of $X$ to a vector of the form

$$(1, t, \dots, t^{r+s-1}).$$

It is a simple exercise to show that this works, provided only that we use distinct
values of the parameter $t$ for distinct elements of $X$.)
To any set $S$ of elements of $X$ we associate the wedge product

$$\bigwedge_{x \in S} f(x),$$

and we denote this by $\omega(S)$. (This product does depend on the order in which
the multiplication is performed, but a change of order only changes its
sign, and this will cause no problems.) Observe that $\omega(S)$ lies in a vector space of
dimension $\binom{r+s}{|S|}$ and it is non-zero if and only if the vectors $f(x)$, $x \in S$, are
linearly independent. If $T$ is a second subset of $X$ then $\omega(S) \wedge \omega(T) \ne 0$ if
and only if $f(S \cup T)$ spans a subspace of $V$ with dimension $|S| + |T|$.

The $m$ vectors $\omega(A_i)$ lie in a vector space of dimension $\binom{r+s}{r}$; if we can show they
are linearly independent then the theorem is proved. Suppose we have scalars $c_i$
such that

$$\sum_{i=1}^{m} c_i\,\omega(A_i) = 0. \qquad (3.2)$$

Let $j$ be the greatest index such that $c_j \ne 0$. Since $B_j \cap A_i$ is nonempty for all $i$
less than $j$, we have $\omega(A_i) \wedge \omega(B_j) = 0$ if $i < j$. Since $B_j \cap A_j = \emptyset$, it follows that
$f(A_j \cup B_j)$ is a linearly independent set. Hence $\omega(A_j) \wedge \omega(B_j) \ne 0$. Therefore (3.2)
yields

$$0 = \sum_{i=1}^{m} c_i\,\omega(A_i) \wedge \omega(B_j) = \sum_{i \le j} c_i\,\omega(A_i) \wedge \omega(B_j) = c_j\,\omega(A_j) \wedge \omega(B_j).$$

But this implies that $c_j = 0$, and this forces us to conclude that the vectors $\omega(A_i)$
are linearly independent. Hence $m \le \binom{r+s}{r}$. □
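
The bound of Theorem 3.11 is attained. A minimal Python sketch, assuming the standard extremal example (all $r$-subsets of an $(r+s)$-set paired with their complements), checks the hypotheses directly; the function names are illustrative only.

```python
from itertools import combinations
from math import comb


def satisfies_hypothesis(A, B):
    """A_i and B_i disjoint for all i, and A_i meets B_j whenever i < j."""
    m = len(A)
    diagonal = all(not (A[i] & B[i]) for i in range(m))
    above = all(A[i] & B[j] for i in range(m) for j in range(i + 1, m))
    return diagonal and above


def complementary_pairs(r, s):
    """All r-subsets of {0..r+s-1}, each paired with its complement (an s-subset)."""
    ground = frozenset(range(r + s))
    A = [frozenset(c) for c in combinations(range(r + s), r)]
    B = [ground - a for a in A]
    return A, B


A, B = complementary_pairs(2, 3)
```

Two distinct $r$-subsets of the same size cannot contain one another, so each $A_i$ meets the complement of every other $A_j$, which is why this family satisfies the hypotheses with $m = \binom{r+s}{r}$.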


A subspace $U$ of $V = \mathbb{R}^{r+s}$ with basis $v_1,\dots,v_k$ can be represented by the vector
$\bigwedge_i v_i$. Hence the argument used above yields the following result.


Lemma 3.12 (Lovász 1977). If we are given $r$-dimensional subspaces $U_1,\dots,U_m$
and $s$-dimensional subspaces $W_1,\dots,W_m$ of $V = \mathbb{R}^{r+s}$ such that $U_i \cap W_j \ne 0$ if $i < j$
and $U_i \cap W_i = 0$ then $m \le \binom{r+s}{r}$.


The theorem itself is a consequence of this lemma, together with the observation 
that there is an injection of X into V which maps all subsets with cardinality at 
most $r+s$ onto independent sets. In fact the lemma holds independently of the
dimension of $V$. For suppose we have subspaces $U_i$ and $W_j$ as described in a vector
space $V$, where $\dim(V) > r+s$. Since we can extend the field we are working
over if necessary, there is no loss in assuming it is infinite. Choose a subspace
$V_0$ of $V$ with codimension $r+s$ in general position with respect to the subspaces
$U_i$ and $W_j$, and let $\varphi$ denote the mapping onto the quotient space $V/V_0$. Then
$\dim(U_i \cap W_j) = \dim(\varphi(U_i) \cap \varphi(W_j))$ for all $i$ and $j$ and we can now apply the
lemma to the subspaces $\varphi(U_i)$ and $\varphi(W_j)$, $1 \le i,j \le m$, of the vector space $V/V_0$.
(One consequence of this is that Theorem 3.11 actually holds if the $A_i$ and $B_i$ are
flats of rank $r$ and $s$ respectively in a linear matroid.)

More examples of the use of exterior algebra will be found in Lovász (1977,
1979c) and Alon (1985). One possible source for background on exterior algebra
is Northcott (1984), but any book on multilinear algebra would suffice for what
we have used.


4. Null designs


Let V be a fixed set with v elements. A function f on the subsets of V is a null 
design of strength $t$ (or a null $t$-design) if, for each subset $\tau$ of $V$ with at most $t$
elements,

$$\sum_{\beta \supseteq \tau} f(\beta) = 0. \qquad (4.1)$$


If U is a subset of V then the restriction of f to the subsets of U is not, in general, 
a null design of strength $t$ on $U$. However, there is an easy way to construct such
a function from $f$, due to Frankl and Pach (1983), that we now describe.

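Given any function $f$ on the subsets of $V$, define the function $f^*$ by setting

$$f^*(\alpha) = \sum_{\beta \supseteq \alpha} f(\beta). \qquad (4.2)$$

Then $f$ is a null $t$-design if and only if $f^*$ vanishes on the subsets of $V$ with at most
$t$ elements. Also $f$ can be recovered from $f^*$ by Möbius inversion thus:

$$f(\beta) = \sum_{\gamma \supseteq \beta} (-1)^{|\gamma \setminus \beta|} f^*(\gamma). \qquad (4.3)$$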

Consequently we can construct a null $t$-design on the subset $U$ of $V$ as follows.

(a) Choose a null $t$-design $f$ on $V$.

(b) Compute the transform $f^*$ as in (4.2) above.

(c) Apply Möbius inversion on the subsets of $U$ (as in (4.3)) to the restriction
$f^*|_U$ of $f^*$ to $U$.

Let us denote the resulting function by $f_U$. We can view it as a null design on $V$
by the simple expedient of defining it to be zero on any subset of $V$ not contained
in $U$.

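There is a possibility that $f_U$ may be identically zero, but this will not happen
unless $f^*$ vanishes on all subsets of $U$. We have

$$\begin{aligned}
f_U(\alpha) &= \sum_{\alpha \subseteq \beta \subseteq U} (-1)^{|\beta \setminus \alpha|} f^*(\beta) \\
&= \sum_{\alpha \subseteq \beta \subseteq U} (-1)^{|\beta \setminus \alpha|} \sum_{\gamma \supseteq \beta} f(\gamma) \\
&= \sum_{\gamma \subseteq V} f(\gamma) \sum_{\alpha \subseteq \beta \subseteq \gamma \cap U} (-1)^{|\beta \setminus \alpha|} \\
&= \sum_{\gamma \cap U = \alpha} f(\gamma), \qquad (4.4)
\end{aligned}$$

which provides a useful alternative definition of $f_U$. One consequence of (4.4) is
that if $f_U(\alpha) \ne 0$ then $f(\gamma) \ne 0$ for some subset $\gamma$ of $V$ such that $U \cap \gamma = \alpha$. We
also obtain the following result.

Lemma 4.1. Let $f$ be a null design of strength $t$ on the set $V$ and let $U$ be a minimal
subset of $V$ such that $f^*(U) \ne 0$. Then if $\alpha \subseteq U$,

$$f_U(\alpha) = (-1)^{|U \setminus \alpha|} f^*(U).$$

Proof. This follows immediately from the definition of $f_U$. □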


Corollary 4.2. Any non-zero null design of strength $t$ on the set $V$ assumes a non-zero
value on at least $2^{t+1}$ subsets of $V$.


Proof. Let $U$ be a minimal subset of $V$ such that $f^*(U) \ne 0$. Since $f$ has strength
$t$, the cardinality of $U$ is at least $t+1$. By the lemma, $f_U$ is non-zero on each subset
of $U$ and so, by the remark above, for each subset $\alpha$ of $U$, there must be a subset $\gamma$
of $V$ such that $\gamma \cap U = \alpha$ and $f(\gamma) \ne 0$. This supplies us with at least $2^{t+1}$ distinct
subsets of $V$ on which $f$ is non-zero.
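
The transform, its inversion on a subset $U$, and the alternative description of $f_U$ as a sum over the sets $\gamma$ with $\gamma \cap U = \alpha$ can all be checked on a toy example. The Python sketch below is illustrative only: the helper names are ours, and the example function (supported on four 2-subsets of a 4-set, with signs $+,-,-,+$) is a hand-picked null design of strength 1.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of `ground`, as frozensets."""
    items = sorted(ground)
    for size in range(len(items) + 1):
        for c in combinations(items, size):
            yield frozenset(c)


def star(f, V):
    """f*(alpha) = sum of f(beta) over all beta containing alpha."""
    return {a: sum(val for b, val in f.items() if a <= b) for a in subsets(V)}


def restrict_and_invert(f_star, U):
    """Moebius inversion of f* restricted to the subsets of U; this gives f_U."""
    return {a: sum((-1) ** len(b - a) * f_star[b] for b in subsets(U) if a <= b)
            for a in subsets(U)}


def direct_sum(f, U):
    """f_U(alpha) computed as the sum of f(gamma) over gamma with gamma & U == alpha."""
    return {a: sum(val for g, val in f.items() if g & U == a) for a in subsets(U)}


V = frozenset(range(4))
f = {frozenset({0, 2}): 1, frozenset({0, 3}): -1,
     frozenset({1, 2}): -1, frozenset({1, 3}): 1}
f_star = star(f, V)
U = frozenset({0, 2})          # a minimal set on which f* is non-zero
f_U = restrict_and_invert(f_star, U)
```

In this example the support of `f` has exactly $2^{t+1} = 4$ elements, so the bound of Corollary 4.2 is attained.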


Let G be the incidence matrix for the subsets of a v-set with cardinality at most 
$t$, versus all subsets of the same $v$-set. Then a null $t$-design can be viewed as an
element of the null-space of $G$, and so Corollary 4.2 can be viewed as determining
the minimum distance of a code over the rationals. If we had worked modulo 2 we
would have obtained a Reed-Muller code. The minimum distance of these codes
has been determined, and is given in most textbooks on coding theory. (See chapter 16
in this Handbook or, for example, MacWilliams and Sloane 1978, chapter
13.) The arguments used to determine this minimum distance actually suffice to
determine the minimum distance over the rationals. Hence we may view the above
corollary as a translation of a known result. Corollary 4.2 is also derived, in another
context, in Anstee (1985, Proposition 2.5). We now present some applications of
this machinery.


Lemma 4.3 (Frankl and Pach 1983). If $H_1$ and $H_2$ are two distinct $t$-designs with
the same vertex set then the symmetric difference of their edge sets contains at least
$2^{t+1}$ edges.


Proof. Let $\chi_1$ and $\chi_2$ be the respective characteristic vectors of $H_1$ and $H_2$. Then it
is not difficult to check that $\chi_1 - \chi_2$ is a null design of strength $t$. By Corollary 4.2
it must have at least $2^{t+1}$ non-zero entries.


Our next application of Corollary 4.2 requires some further preliminaries. A 
hypergraph $H_1$ is an edge-reconstruction of the hypergraph $H_2$ if there is a bijection
$\sigma$ from $E(H_1)$ to $E(H_2)$ such that, for each edge $e$ in $H_1$, the edge-deleted
hypergraph $H_1 \setminus e$ is isomorphic to $H_2 \setminus \sigma(e)$. We say that a hypergraph $H$ is
edge-reconstructible if any hypergraph that is an edge-reconstruction of $H$ is isomorphic
to it. Thus we can say that a hypergraph is edge-reconstructible if it is determined by
the collection of its edge-deleted hypergraphs. The edge reconstruction conjecture
for graphs asserts that all graphs with at least four edges are edge-reconstructible.
Bondy and Hemminger (1977) provide an excellent, if slightly dated, survey of
progress on the reconstruction problem.

A hypergraph with $e$ edges is $s$-edge reconstructible if it is determined by the collection
of $\binom{e}{s}$ hypergraphs obtained by deleting, in turn, each set of $s$ edges from it. The next
result generalises the result of Müller (1977) on edge reconstruction of graphs.

Lemma 4.4. Let $H$ be a hypergraph with $v$ vertices and $e$ edges. If $2^{e-s} > v!$ then
$H$ is $s$-edge reconstructible.


Proof. Assume by way of contradiction that $H_1$ and $H_2$ are two non-isomorphic
hypergraphs with $e$ edges, and the same collection of $s$-edge deleted hypergraphs.
There is no loss of generality in assuming that $H_1$ and $H_2$ have the same vertex
set $V$. We view a hypergraph with vertex set $V$ as a subset of the power set $2^V$ of
$V$. If $i = 1$ or $2$, let $\chi_i$ be the function on the subsets of $2^V$ defined by

$$\chi_i(F) = \begin{cases} 1, & \text{if } F \cong H_i; \\ 0, & \text{otherwise.} \end{cases}$$

I claim that the function

$$\chi = |\mathrm{Aut}(H_1)|\,\chi_1 - |\mathrm{Aut}(H_2)|\,\chi_2$$

is a null design with strength $e-s$ on $2^V$. For if $L$ is any hypergraph with vertex
set $V$ and $i = 1$ or $2$ then

$$\sum_{F \supseteq L} |\mathrm{Aut}(H_i)|\,\chi_i(F)$$

is equal to the number of permutations $\pi$ of $V$ such that the image of $H_i$ under
$\pi$ contains $L$, and this is in turn equal to $|\mathrm{Aut}(L)|$ times the number of
sub-hypergraphs of $H_i$ isomorphic to $L$. The claim that $\chi$ is a null design with strength
$e-s$ is consequently a restatement of the hypothesis that $H_1$ and $H_2$ have the same
$s$-edge deleted sub-hypergraphs.

It follows that $\chi$ must take non-zero values on at least $2^{e-s+1}$ hypergraphs. But
$\chi_i$ is equal to 1 on each of the $|\mathrm{Sym}(V)|/|\mathrm{Aut}(H_i)|$ hypergraphs with vertex
set $V$ that are isomorphic to $H_i$ ($i = 1,2$), and is equal to zero on all others. Thus
$\chi$ takes non-zero values on at most $2|\mathrm{Sym}(V)| = 2\,v!$ hypergraphs. This means that
we must have $2^{e-s} \le v!$.


Let $B$ be the incidence matrix of hypergraphs with $e-s$ edges versus hypergraphs
with $e$ edges (and all having vertex set $V$). If $\chi$ is a non-zero null design
with strength $e-s$ then $B\chi = 0$. Hence the columns of $B$ must be linearly dependent.
From Theorem 2.4 it follows that in this case $B$ must have more columns than
rows. So if $\chi$ exists as described then

$$\binom{2^v}{e-s} < \binom{2^v}{e},$$

which implies that $e-s < 2^v - e$. Thus we have deduced the following.


Lemma 4.5. Let $H$ be a hypergraph with $v$ vertices and $e$ edges. If $2e > 2^v + s$ then
$H$ is $s$-edge reconstructible.


When $s = 1$ this result was first proved in Lovász (1972), using an inclusion-exclusion
argument. A proof using a form of quotient argument was subsequently
presented in Stanley (1985). The argument just used is easily modified to prove that
a $k$-uniform hypergraph on $v$ vertices with $e$ edges is $s$-edge reconstructible if
$2e > \binom{v}{k} + s$. On the other hand Lemma 4.4 holds as stated for $k$-uniform hypergraphs.
For graphs, the analogues of Lemmas 4.4 and 4.5 were first proved in Godsil et al.
(1987).

So far, all our applications of the theory of null designs have used only Corol- 
lary 4.2. We now give an example where Lemma 4.1 is used. A hypergraph is 
k-chromatic if we can partition its vertex set into k classes such that no edge is a 
subset of any one of the classes. It is critically k-chromatic if it is k-chromatic and 
each of the subgraphs obtained by deleting one edge from it is (k — 1)-chromatic. 
Thus the cycle on five vertices is an example of a critically 3-chromatic 2-uniform 
hypergraph. The result we are about to prove, due to Lovász (1976), asserts that
any critically 3-chromatic $k$-uniform hypergraph with $v$ vertices has at most $\binom{v}{k-1}$
edges. This is an immediate byproduct of the following.


Lemma 4.6 (Lovász 1976). Let $H$ be a critically 3-chromatic $k$-uniform hypergraph
with vertex set $V$ and let $B = B_{k-1}(H)$ be the incidence matrix for the $(k-1)$-subsets
of $V$ versus the edges of $H$. Then the columns of $B$ are linearly independent.


Proof. Assume by way of contradiction that the columns of $B$ are linearly dependent.
Then there is a null design $f$ of strength $k-1$ on $V$ that is supported by
the edges of $H$. Thus $f^*$, as defined by eq. (4.2) above, vanishes on all subsets of $V$
with fewer than $k$ elements. Since $f$ itself vanishes on all subsets of $V$ with more
than $k$ elements, it follows from (4.2) that $f = f^*$.

Now let $(X,Y)$ be any partition of $V$ into two classes. Then, from (4.4) we have

$$f_X(\emptyset) = \sum_{\gamma \cap X = \emptyset} f(\gamma).$$

Since $f = f^*$ it follows from this that the above sum is equal to $\sum_{\gamma \subseteq Y} f^*(\gamma)$ and,
given that $f^*(\gamma) \ne 0$ only when $\gamma \in E(H)$, we thus deduce that

$$f_X(\emptyset) = (-1)^k f_Y(\emptyset).$$

Using (4.4) once more we obtain

$$\sum_{\beta \cap X = \emptyset} f(\beta) = (-1)^k \sum_{\beta \cap Y = \emptyset} f(\beta). \qquad (4.5)$$

To complete the proof we choose an edge $\alpha$ of $H$ such that $f(\alpha) \ne 0$ and take
$(X,Y)$ to be a 2-colouring of $H \setminus \alpha$. Then $\alpha$ is the unique edge of $H$ contained in
one of the sets $X$ and $Y$. This implies that one side of (4.5) is zero, but the other
is not. Accordingly $f$ cannot exist as described, and so the columns of $B_{k-1}(H)$
are linearly independent. □


The above proof is no simpler than the original, and differs from it only in the ar- 
gument used to derive (4.5). However, it does show how the available information 
on null designs can be used. There is a closely related result due to Seymour.

Lemma 4.7 (Seymour 1974). The rows of the incidence matrix of a critically 3-chromatic
hypergraph are linearly independent over $\mathbb{R}$.


Proof. Let $H$ be a critically 3-chromatic hypergraph with incidence matrix $B$. Assume
by way of contradiction that there is a non-zero vector $y$ such that $y^T B = 0$.
The hypergraph induced by the vertices $i$ such that $y_i = 0$ is 2-colourable. Assume
that it has been coloured blue and red. Extend this to $H$ by colouring the vertices
$j$ such that $y_j > 0$ with blue, and the remaining vertices red. If $b$ is a column of $B$
then $y^T b = 0$. Hence either $y_i = 0$ for all vertices $i$ in the edge corresponding to $b$,
or else $y$ is positive on one vertex of this edge and negative on another. This shows
that our colouring of the vertices of $H$ is a proper 2-colouring, which contradicts
our choice of $H$. □


This proof is interesting in that it depends on the fact that R is an ordered field. 
No other example of this comes to mind. The Fano plane shows that the result is 
not valid over finite fields. 
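
The remark about the Fano plane can be checked by direct computation. The sketch below is a rough illustration with our own helper names and our own labelling of the seven lines: it confirms by brute force that the Fano plane is not 2-colourable while every line-deleted sub-hypergraph is, and then compares the rank of the point-line incidence matrix over the rationals with its rank over GF(2).

```python
from fractions import Fraction
from itertools import product

FANO = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]


def is_two_colourable(lines, n_points):
    """Brute force: is there a 2-colouring with no monochromatic line?"""
    return any(all(len({colours[p] for p in line}) == 2 for line in lines)
               for colours in product((0, 1), repeat=n_points))


def rank(rows, modulus=None):
    """Rank by Gaussian elimination over Q (modulus=None) or over GF(2) (modulus=2)."""
    if modulus is None:
        m = [[Fraction(x) for x in row] for row in rows]
    else:
        m = [[x % modulus for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[r][col] if modulus is None else 1
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
                if modulus is not None:
                    m[i] = [a % modulus for a in m[i]]
        r += 1
    return r


# Rows indexed by points, columns by lines.
INCIDENCE = [[1 if p in line else 0 for line in FANO] for p in range(7)]
```

Over the rationals the seven rows are independent, in agreement with Lemma 4.7; over GF(2) the rank drops, so the rows are dependent there.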

We remark finally that there is a close connection between the theory of null 
designs and the representation theory of the symmetric group. The key to this is 
that we may identify a k-subset of a v-set with a “tabloid” having two rows, of size 
v —k and k. (As ever, we assume 2k < v.) Then the null designs with minimum 
support constructed in Frankl and Pach (1983) can be viewed as “polytabloids”, 
which span a Specht module for the symmetric group. For more information on 
the latter see, e.g., James (1978, chapter 4). 


5. Walks in graphs 


In the previous sections our emphasis has been on design theory, but from now it 
will be on graphs (and directed graphs). We begin by establishing some notation. 
An edge {u,v} in a graph will be regarded as being formed from the two arcs (u, v) 
and (v,u). (This usage of the term “arc” is also standard in other situations, e.g.,
when discussing automorphism groups of graphs.) Hence we may, when convenient, 
view a graph as simply a special type of directed graph. If D is a directed graph 
with vertex set V then its adjacency matrix A(D) is the matrix with rows and 
columns indexed by V, and with uv-entry equal to the number of arcs in D from
u to v. (Our directed graphs may have loops and/or parallel arcs, however our 
graphs will always be simple.) Note that isomorphic directed graphs will not in 
general have the same adjacency matrices but, as will become apparent, this is 
never the source of any problems. 
A walk in a directed graph is a sequence 


$$v_0, e_1, v_1, \dots, v_{n-1}, e_n, v_n$$

formed alternately of vertices and arcs, such that $e_i$ is the arc $(v_{i-1}, v_i)$. The length
of the above walk is $n$. We explicitly permit walks of length zero; there is one such
walk for each vertex. A walk that starts and finishes at the same vertex is called
closed. All walks, even in undirected graphs, are directed objects. The basic result
concerning walks can now be stated. 


Lemma 5.1. Let $D$ be a directed graph with adjacency matrix $A$. If $u$ and $v$ are
vertices of $D$ then $(A^k)_{uv}$ is equal to the number of walks of length $k$ in $D$ that start
at $u$ and finish at $v$.


The proof of this result is a routine induction argument, based on the observation
that $A^k = A A^{k-1}$. One consequence of this result is that $\operatorname{tr} A^k$ is equal to the
number of closed walks in $D$ with length $k$. (And since $A^0 = I$, we thus reconfirm
that there is one closed walk of length zero on each vertex of $D$.) We note also that
if $D$ is a graph then $\operatorname{tr} A = 0$, $\operatorname{tr} A^2$ equals twice the number of edges in $D$ and $\operatorname{tr} A^3$
is equal to six times the number of 3-cycles. Given the existence of fast algorithms
for matrix multiplication, the last observation leads to the most efficient known 
algorithm for detecting a triangle. This also works when D is directed, provided we 
first delete all the loops from it. (This approach to finding 3-cycles has occurred 
independently to a number of people, so I remain silent on the question of its 
attribution. The efficiency of such a “non-combinatorial” algorithm is undoubtedly 
a source of annoyance in many quarters.) 
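
The walk-counting lemma and the trace identities can be tried out on a small graph. The following Python sketch uses naive cubic-time matrix multiplication (not the fast algorithms alluded to above), and the complete graph $K_4$ is our own choice of example.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(A, k):
    """A^k by repeated multiplication, with A^0 = I."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, A)
    return result


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


# Adjacency matrix of the complete graph K4.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

edges = trace(mat_pow(K4, 2)) // 2       # tr A^2 is twice the number of edges
triangles = trace(mat_pow(K4, 3)) // 6   # tr A^3 is six times the number of 3-cycles
walks_0_to_1 = mat_pow(K4, 2)[0][1]      # walks of length 2 from vertex 0 to vertex 1
```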

The most effective way to study walks in graphs is by using generating functions. 
To describe this we first need another round of definitions. Let D be a directed 
graph with adjacency matrix A. The walk generating function of D is 


$$W(D,x) := (I - xA)^{-1} = \sum_{k \ge 0} x^k A^k.$$

Thus $W(D,x)$ is a formal power series with coefficients in a ring of matrices. The
$uv$-entry of $W(D,x)$ will be written as $W_{uv}(D,x)$ and the trace of $W(D,x)$ will be
denoted by $C(D,x)$. (As we have no intention of ever setting $x$ equal to a real
or complex number in one of these series, the reader should put all thoughts of
convergence from her or his mind.) The characteristic polynomial $\det(xI - A)$ of
$A$ will be denoted by $\phi(D,x)$ and referred to as the characteristic polynomial of
$D$. If $u \in V(D)$ then $D \setminus u$ is the directed graph obtained by removing $u$, together
with all its attendant arcs. Convenient references for background information on
adjacency matrices and related topics are Biggs (1993) and Cvetković et al. (1980).
Walk generating functions are studied at some length in Godsil (1993, chapter 4). 


Lemma 5.2. Let $u$ be a vertex in the directed graph $D$. Then

$$x^{-1} W_{uu}(D, x^{-1}) = \phi(D \setminus u, x)/\phi(D,x).$$


Proof. Let $B$ be the adjacency matrix of $D \setminus u$. From Cramer’s rule and the definition
of $W(D,x)$, we see that $W_{uu}(D,x) = \det(I - xB)/\det(I - xA)$. (Remark: the
two identity matrices $I$ in this quotient have different orders. We will frequently
be found guilty of this abuse of notation.) If $n = |V(D)|$ then

$$\det(I - xA) = x^n \det(x^{-1}I - A) = x^n \phi(D, x^{-1})$$

and similarly $\det(I - xB) = x^{n-1}\phi(D \setminus u, x^{-1})$. The lemma follows immediately. □


The above lemma provides an explicit expression for the diagonal entries of 
W(D, x). We derive some analogous formulas for the off-diagonal elements later. 
We note one simple but useful property of the characteristic polynomial. For the 
proof see, for example, Cvetković et al. (1980, Theorem 2.14) or Godsil (1993,
Theorem 2.1.5(c)). 


Lemma 5.3. For any directed graph $D$,

$$\phi'(D,x) = \sum_{u \in V(D)} \phi(D \setminus u, x).$$


As an immediate consequence of Lemmas 5.2 and 5.3, we infer that

$$x^{-1} C(D, x^{-1}) = \phi'(D,x)/\phi(D,x). \qquad (5.1)$$

This shows that the characteristic polynomial and the closed walk generating function
of a directed graph provide the same information. If we multiply both sides
of (5.1) by $\phi(D,x)$ and then equate coefficients, we recover a system of equations
connecting the sums of the powers of the zeros of $\phi(D,x)$ with its coefficients.
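
The link between traces of powers and the coefficients of the characteristic polynomial is exactly what the Faddeev-LeVerrier recursion exploits. The sketch below uses that recursion (our choice of method, not one named in the text) to compute $\det(xI - A)$ exactly and then checks the derivative identity of Lemma 5.3 on two small examples of our own choosing.

```python
from fractions import Fraction


def char_poly(A):
    """Coefficients of det(xI - A), lowest degree first, via Faddeev-LeVerrier."""
    n = len(A)
    c = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        AM = [[sum(A[i][l] * M[l][j] for l in range(n)) for j in range(n)]
              for i in range(n)]
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -tr(A * M_k) / k
        tr = sum(sum(A[i][l] * M[l][i] for l in range(n)) for i in range(n))
        c[n - k] = -tr / k
    return c


def delete_vertex(A, u):
    """Adjacency matrix of the (di)graph with vertex u removed."""
    keep = [i for i in range(len(A)) if i != u]
    return [[A[i][j] for j in keep] for i in keep]


def derivative(c):
    return [i * c[i] for i in range(1, len(c))]


def sum_of_deleted(A):
    """Coefficientwise sum of the characteristic polynomials of all vertex-deleted subgraphs."""
    polys = [char_poly(delete_vertex(A, u)) for u in range(len(A))]
    return [sum(p[i] for p in polys) for i in range(len(A))]


PATH3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]    # path on three vertices
DICYCLE3 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # directed 3-cycle
```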

The concept of quotients, as introduced in section 2, can be applied very usefully 
to graphs and directed graphs. It was first studied by H. Sachs; a discussion of it 
from his point of view is presented in Cvetković et al. (1980, chapter 4). Here we
will only consider quotients of graphs; a more extensive treatment of this topic is
given in Godsil (1993, chapter 5). One definition is necessary. If $G$ is a graph then
a partition $\pi$ of $V(G)$ will be called equitable if the pair of partitions $(\pi,\pi)$ is
equitable in the sense used in section 2. We have the following.


Lemma 5.4. Let $G$ be a graph and let $\pi$ be a partition of $V(G)$ with characteristic
matrix $P$. Then $\pi$ is equitable if and only if there is a matrix $\Phi$ such that
$P A(G) = \Phi P$.


Here $\Phi$ is a square matrix with rows and columns indexed by the cells of $\pi$ and
with $(\Phi)_{ij}$ equal to the number of arcs that start at a vertex in cell $i$ and finish on
a given vertex in cell $j$. Thus if $G$ is Petersen’s graph, $u$ is a fixed vertex in $G$ and
$\pi$ is the partition of $V(G)$ induced by the distance in $G$ from $u$ then

$$\Phi = \begin{pmatrix} 0 & 1 & 0 \\ 3 & 0 & 1 \\ 0 & 2 & 2 \end{pmatrix},$$

which illustrates that $\Phi$ need not be symmetric. We shall find it convenient to view
$\Phi$ as the adjacency matrix of a directed graph with the cells of $\pi$ as its vertices.
This directed graph will be denoted by $G/\pi$. The following result can now be
derived in a routine fashion.

Lemma 5.5. Let $\pi$ be an equitable partition of the graph $G$ and set $\Phi = A(G/\pi)$.
Then $(\Phi^k)_{ij}$ is equal to the number of walks of length $k$ in $G$ that start in cell $i$
and finish at a specified vertex in cell $j$.


The discrete partition, with each cell a singleton, is always equitable. Consequently 
Lemma 5.5 is a generalisation of the better known Lemma 5.1. The last two results 
provide all the information on quotients that we need. 
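
The Petersen example can be reproduced mechanically. In the sketch below we model Petersen's graph as the 2-subsets of a 5-set with disjoint pairs adjacent (a standard description, but our choice here), compute the distance partition from one vertex, and read off the quotient matrix; the function names are illustrative only.

```python
from itertools import combinations


def petersen():
    """Vertices: 2-subsets of {0..4}; two vertices are adjacent iff they are disjoint."""
    verts = [frozenset(c) for c in combinations(range(5), 2)]
    adj = {v: [w for w in verts if not (v & w)] for v in verts}
    return verts, adj


def distance_partition(adj, u):
    """Cells of vertices at distance 0, 1, 2, ... from u (breadth-first search)."""
    dist = {u: 0}
    frontier = [u]
    while frontier:
        nxt = []
        for v in frontier:
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    nxt.append(w)
        frontier = nxt
    cells = [[] for _ in range(max(dist.values()) + 1)]
    for v, d in dist.items():
        cells[d].append(v)
    return cells


def quotient_matrix(adj, cells):
    """Entry (i, j): number of neighbours in cell i of any vertex in cell j.

    Returns None if that number depends on the vertex, i.e. the partition is not equitable.
    """
    phi = []
    for ci in cells:
        row = []
        for cj in cells:
            counts = {sum(1 for w in adj[v] if w in ci) for v in cj}
            if len(counts) != 1:
                return None
            row.append(counts.pop())
        phi.append(row)
    return phi


verts, adj = petersen()
cells = distance_partition(adj, verts[0])
PHI = quotient_matrix(adj, cells)
```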

One consequence of Lemma 5.4 is that the characteristic polynomial of $G/\pi$
divides that of $G$. To see this note first that if $U$ is an invariant subspace for $A$
then we have $PU = PAU = \Phi PU$, showing that $PU$ is an invariant subspace for
$\Phi$. From this, and the fact that the rows of $P$ are linearly independent, it can be
shown that the characteristic polynomial of $\Phi$ divides that of $A$. In one important
case we can compute $\phi(G,x)$ from $\phi(G/\pi,x)$.


Theorem 5.6. Let $G$ be a graph with $n$ vertices, let $u$ be a vertex in $G$ and let $\pi$ be
an equitable partition of $G$ in which $\{u\}$ is a cell. Then if $\phi(G \setminus v, x)$ is the same
for all vertices $v$ in $G$ and $H = G/\pi$,

$$\phi'(G,x)/\phi(G,x) = n\,\phi(H \setminus \{u\}, x)/\phi(H,x). \qquad (5.2)$$


Proof. Let $C_1,\dots,C_r$ be the cells of $\pi$ and denote the corresponding vertices of $H$
by $1,\dots,r$. Assume that $C_1 = \{u\}$. From Lemma 5.5 we see that if the vertex $v$ of $G$
is in cell $C_i$ then $W_{uv}(G,x) = W_{1i}(H,x)$. The result now follows from Lemmas 5.2
and 5.3. □


It is not difficult to show that, when $\{u\}$ is a cell of $\pi$, $(\Phi^k)_{i1}/(\Phi^k)_{1i} = |C_i|$. Thus,
under the hypotheses of Theorem 5.6, we can compute $\phi(G,x)$ given $\Phi = A(H)$.
The most obvious case where this result can be applied is when $\mathrm{Aut}(G)$ is vertex
transitive and $\pi$ is the partition of $V(G)$ formed by the orbits of a subgroup
of $\mathrm{Aut}(G)$ that fixes the vertex $u$. The next result is one of the most important
applications of the theory we have described.


Corollary 5.7. Under the hypotheses of Theorem 5.6, the numerators in the partial
fraction expansion of $n\,\phi(H \setminus \{u\},x)/\phi(H,x)$ are the multiplicities of the zeros of
$\phi(G,x)$.


Proof. This is a well-known property of the partial fraction expansion of
$p'(x)/p(x)$, for any polynomial $p(x)$.
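
As a worked instance, take the quotient of Petersen's graph by the distance partition from a vertex. The two polynomials hard-coded below were expanded by hand from the 3 x 3 quotient matrix given earlier, so they should be read as our own computation rather than as data from the source. For a simple zero $\theta$ of the denominator $p$, the partial fraction numerator of $n\,q/p$ is $n\,q(\theta)/p'(\theta)$, which is what the sketch evaluates.

```python
from fractions import Fraction


def poly_eval(coeffs, x):
    """Evaluate a polynomial given highest-degree-first coefficients (Horner's rule)."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


# Assumed by hand computation from the quotient matrix:
PHI_H = [1, -2, -5, 6]        # x^3 - 2x^2 - 5x + 6 = (x - 3)(x - 1)(x + 2)
PHI_H_MINUS_U = [1, -2, -2]   # x^2 - 2x - 2, after deleting the cell {u}


def multiplicities(n, numerator, denominator, roots):
    """Numerators of the partial fraction expansion of n * numerator / denominator."""
    d = derivative(denominator)
    return {r: Fraction(n * poly_eval(numerator, r), poly_eval(d, r)) for r in roots}


MULT = multiplicities(10, PHI_H_MINUS_U, PHI_H, [3, 1, -2])
```

The numerators come out as integers summing to 10, the number of vertices, as they must if they are eigenvalue multiplicities.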


Corollary 5.7 thus provides a feasibility condition that a digraph $H$ must satisfy
to occur as the quotient with respect to an equitable partition $\pi$ of a graph $G$, for
which the conditions of Theorem 5.6 hold. This condition can be formulated in
a number of different ways, and is often referred to as the “eigenvalue method”. 
The key idea is that the multiplicities of the eigenvalues of A(G) can be deter- 
mined from a fairly limited amount of information. There are surprisingly many 
situations where this is useful. The “classical” application is in demonstrating the 


ib iat sla ate sa cpanned 


Tools from linear algebra ~ : 1729 


non-existence of classes of, or individual, distance-regular graphs. The most well- 
known, and earliest example, is provided by the work of Hoffman and Singleton 
(1960) on Moore graphs of diameter two and three. (A convenient description 
of their work, and more recent generalisations, will be found in Biggs 1993.) For 
another application we mention the proof of the fact that finite projective planes 
cannot have a null polarity, as presented in Hughes and Piper (1973), and the gen- 
eralisation of this result to the so-called “friendship theorem”. (For more details 
and further references, see Cameron and Van Lint 1991, p. 45.) This method has 
also recently been applied in model theory (Evans 1986), albeit at a point where 
the distinction between this subject and finite geometry is hard to discern. Finally, 
McKay (1979) has used Theorem 5.6 and Corollary 5.7 to determine, with the aid 
of a computer, all vertex-transitive graphs with fewer than 20 vertices. 

Our approach to Corollary 5.7 is not the standard one, which is based on computations
with the eigenvectors of $\Phi = A(H)$, and places much more restrictive
conditions on $G$ (namely that it should be a distance-regular graph). An accessible
discussion from this viewpoint is presented in Biggs (1993). A detailed exposition 
along the lines taken above will be found in Godsil and McKay (1980). 

We are now going to derive some information about the off-diagonal elements
of $W(D,x)$. The adjugate of $xI - A$, i.e., the transpose of its matrix of cofactors,
will be denoted by $\Psi(A,x)$. The most important property of $\Psi$ is that

$$\Psi(A,x)(xI - A) = \det(xI - A)\,I.$$

If $A$ is the adjacency matrix of the directed graph $D$ then $(\Psi(A,x))_{ii}$ is equal to
$\phi(D \setminus i, x)$. In this case we denote the $ij$-entry of $\Psi(A,x)$ by $\phi_{ij}(D,x)$. It is easy
to show that

$$x^{-1} W_{ij}(D, x^{-1}) = \phi_{ij}(D,x)/\phi(D,x).$$


If $A$ is an $n \times n$ matrix and $U \subseteq \{1,\dots,n\}$, we denote by $\Psi_U(A,x)$ the (square)
submatrix of $\Psi(A,x)$ with rows and columns indexed by the elements of $U$. We use
$A \setminus U$ to denote the matrix obtained by deleting the rows and columns indexed by
$U$. We need the following crucial result, the combinatorial significance of which
first seems to have been noted by Tutte (1947, 1979).


Lemma 5.8 (Jacobi 1833). If $A$ is an $n \times n$ matrix and $U$ is a subset of $\{1,\dots,n\}$
with $m$ elements then

$$\det \Psi_U(A,x) = (\det(xI - A))^{m-1} \det(xI - (A \setminus U)).$$


Proof. We may assume without loss that $U = \{1,\dots,m\}$. Let $M$ be the matrix
obtained by replacing the first $m$ columns of the $n \times n$ identity matrix with the
corresponding columns of $\Psi(A,x)$. Then the product $(xI - A)M$ has the form

$$\begin{pmatrix} \det(xI - A)\, I_m & * \\ 0 & xI - (A \setminus U) \end{pmatrix}, \tag{5.3}$$

where the diagonal blocks are square (and the details of the off-diagonal block
marked $*$ are irrelevant). Now $\det M = \det \Psi_U(A,x)$ and so we have

$$\det(xI - A)\, \det \Psi_U(A,x) = \det((xI - A)M) = (\det(xI - A))^{m} \det(xI - (A \setminus U)).$$

(The last term is just the determinant of the matrix in (5.3).) This equation
immediately yields the lemma. □


Lemma 5.8 is in fact a classical result, best described as well forgotten. It is 
sometimes referred to as “Jacobi’s identity”, which is not a particularly useful 
identifier. We will only be using it when |U| = 2. For ease of reference we restate 
this case in a modified form. 


Corollary 5.9. Let $G$ be a directed graph with vertices $i$ and $j$. Then

$$\phi_{ij}(G,x)\,\phi_{ji}(G,x) = \phi(G \setminus i, x)\,\phi(G \setminus j, x) - \phi(G,x)\,\phi(G \setminus \{i,j\}, x).$$

When $G$ is a graph, $\phi_{ij}(G,x) = \phi_{ji}(G,x)$ and so Corollary 5.9 implies that

$$\phi_{ij}(G,x) = \sqrt{\phi(G \setminus i,x)\,\phi(G \setminus j,x) - \phi(G,x)\,\phi(G \setminus \{i,j\},x)}. \tag{5.4}$$

It might appear that the sign of $\phi_{ij}(G,x)$ is not determined by this expression, but
we know that the rational function $\phi_{ij}(G,x)/\phi(G,x)$ has non-negative coefficients
when expanded as a series in $x^{-1}$. This implies that the leading term of $\phi_{ij}(G,x)$
is always positive.
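The identity in Corollary 5.9 can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic and evaluation at a few integer values of $x$ rather than symbolic polynomials; the helper names (`det`, `x_minus_a`, `phi`, `phi_ij`) and the small directed graph are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant over the rationals by Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, value = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            value = -value
        value *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return value

def x_minus_a(A, x, drop_rows=(), drop_cols=()):
    """The matrix xI - A with the listed rows and columns removed."""
    n = len(A)
    return [[(x if r == c else 0) - A[r][c] for c in range(n) if c not in drop_cols]
            for r in range(n) if r not in drop_rows]

def phi(A, x, deleted=()):
    """Characteristic polynomial of G minus the deleted vertices, evaluated at x."""
    return det(x_minus_a(A, x, deleted, deleted))

def phi_ij(A, x, i, j):
    """ij-entry of the adjugate of xI - A, i.e. the (j, i) cofactor."""
    return (-1) ** (i + j) * det(x_minus_a(A, x, drop_rows=(j,), drop_cols=(i,)))

def corollary_5_9_holds(A, x):
    """Check phi_ij * phi_ji = phi(G-i) phi(G-j) - phi(G) phi(G-{i,j}) at the point x."""
    return all(
        phi_ij(A, x, i, j) * phi_ij(A, x, j, i)
        == phi(A, x, (i,)) * phi(A, x, (j,)) - phi(A, x) * phi(A, x, (i, j))
        for i, j in combinations(range(len(A)), 2)
    )

# Hypothetical directed graph on four vertices (arcs chosen arbitrarily).
A = [[0, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 0, 0, 1],
     [0, 1, 0, 0]]
print(all(corollary_5_9_holds(A, x) for x in range(-3, 4)))
```

Agreement at finitely many points is only a sanity check, not a proof; a symbolic computation would be needed to compare the polynomials themselves.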


A very nice application of eq. (5.4) to graph reconstruction was found by Tutte. 


Theorem 5.10 (Tutte 1979). If the characteristic polynomial of the graph G is irre- 
ducible over the rationals then G is vertex-reconstructible. 


Proof. Let the vertex set of $G$ be $\{1,\dots,n\}$ and suppose $\phi(G,x)$ is irreducible.
We prove that for any two distinct vertices $i$ and $j$ of $G$, the polynomial $\phi(G \setminus ij, x)$
is determined by $\phi(G,x)$, $\phi(G \setminus i,x)$ and $\phi(G \setminus j,x)$. We have

$$\phi(G \setminus i,x)\,\phi(G \setminus j,x) - \phi(G,x)\,\phi(G \setminus ij,x) = \phi_{ij}(G,x)^2. \tag{5.5}$$

Now suppose that $\eta$ is a polynomial such that

$$\phi(G \setminus i,x)\,\phi(G \setminus j,x) - \phi(G,x)\,\eta = \sigma^2 \tag{5.6}$$

for some polynomial $\sigma$ of degree at most $n - 2$. Then, subtracting (5.5) from (5.6),
we obtain

$$\phi(G,x)\,(\phi(G \setminus ij,x) - \eta) = \sigma^2 - \phi_{ij}(G,x)^2.$$

The right side of this equation is the product of the two polynomials $\sigma - \phi_{ij}(G,x)$
and $\sigma + \phi_{ij}(G,x)$, each of degree at most $n - 2$. Since this product is divisible by
$\phi(G,x)$, which is irreducible of degree $n$, it must be zero, and we are forced to
conclude that $\eta = \phi(G \setminus ij,x)$. This proves our claim.


As noted in the proof of Lemma 5.3, if $H$ has $m$ vertices then the coefficient of
$x^{m-2}$ in $\phi(H,x)$ is equal to $-1$ times the number of edges in $H$. So, given $\phi(G)$,
$\phi(G \setminus i)$, $\phi(G \setminus j)$ and $\phi(G \setminus \{i,j\})$ we can determine the number of edges joining
$i$ to $j$, i.e., whether or not $i$ and $j$ are adjacent. Therefore when $\phi(G)$ is irreducible,
the first three of these polynomials determine whether $i$ and $j$ are adjacent.
To complete the proof we now recall that in Tutte (1979) it is shown that the
characteristic polynomial of a graph $G$ is determined by the collection of vertex-deleted
subgraphs of $G$. Hence $G$ is vertex-reconstructible when $\phi(G)$ is irreducible. □


The above proof still works if $\phi(G)$ is not irreducible, but instead has an
irreducible factor of degree $n - 1$. For another variation, suppose that $\phi(G \setminus 1)$ is
irreducible. An argument similar to the one above shows then that $\phi(G)$, $\phi(G \setminus 1)$
and $\phi(G \setminus \{1,i\})$ determine $\phi(G \setminus i)$. From this it follows again that $G$ is
vertex-reconstructible. This result was first proved, in apparently greater generality, in
Yuan (1982). See also Godsil and McKay (1981).

There are close connections between the theory of matchings in graphs and the
topics we are discussing. To describe this we require some more notation. A
$k$-matching in a graph is a set of $k$ edges, no two of which have a vertex in
common. The number of $k$-matchings in the graph $G$ will be denoted by $p(G,k)$.
If $G$ has $n$ vertices, we call

$$\mu(G,x) = \sum_{k} (-1)^k p(G,k)\, x^{n-2k}$$

the matchings polynomial of $G$. The task of computing this polynomial for a given
graph is NP-hard (or, more precisely, #P-complete), since the constant term of
$\mu(G,x)$ counts, up to sign, the number of perfect matchings in $G$ and counting the
number of perfect matchings in bipartite graphs is equivalent in complexity to
determining the permanent of 01-matrices. From Valiant (1979), we know that the
latter is NP-hard. One consequence of this is that, unless P=NP, there is no easy
way of computing $\mu(G,x)$.

Thus the matchings polynomial is in this regard a more intractable object than 
the characteristic polynomial of a graph. Nonetheless, it is known that G is a 
forest if and only if $\mu(G,x) = \phi(G,x)$ and there are also some simple recurrences
that enable us to compute the matchings polynomials of small graphs with some 
facility. The matchings polynomials of bipartite graphs are essentially the same as 
“rook polynomials”. (For information on rook polynomials see Riordan 1958. For 
the matchings polynomial see Heilmann and Lieb 1972, Farrell 1979, Godsil and 
Gutman 1981, and Godsil 1981b, 1993, chapters 1 and 6.) 
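For small graphs the matchings polynomial can be computed directly from its definition and compared with the characteristic polynomial. The sketch below is illustrative only: it counts $k$-matchings by brute force (exponential time, in line with the hardness remarks above) and obtains $\phi(G,x)$ by the Faddeev-LeVerrier recursion, which is a standard method but not one taken from the text. Function names are hypothetical.

```python
from itertools import combinations

def matchings_poly(n, edges):
    """Coefficients of mu(G, x), highest power first, straight from the definition."""
    coeffs = [0] * (n + 1)
    for k in range(n // 2 + 1):
        # p(G, k): sets of k edges that cover 2k distinct vertices
        p_k = sum(1 for m in combinations(edges, k)
                  if len({v for e in m for v in e}) == 2 * k)
        coeffs[2 * k] = (-1) ** k * p_k      # coefficient of x^(n - 2k)
    return coeffs

def char_poly(n, edges):
    """Coefficients of phi(G, x) = det(xI - A), highest power first
    (Faddeev-LeVerrier recursion in exact integer arithmetic)."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    M = [[int(i == j) for j in range(n)] for i in range(n)]
    coeffs = [1]
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -(sum(AM[i][i] for i in range(n)) // k)   # the division is exact
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs

path4 = [(0, 1), (1, 2), (2, 3)]            # a forest
cycle4 = path4 + [(3, 0)]                   # not a forest
print(matchings_poly(4, path4), char_poly(4, path4))
print(matchings_poly(4, cycle4), char_poly(4, cycle4))
```

For the path the two coefficient lists coincide, as the forest criterion predicts, while for the 4-cycle they differ in the constant term.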

An unexpected property of the matchings polynomial is that all its zeros are real.
The first, second and third proofs of this are to be found in the above-mentioned
paper of Heilmann and Lieb. (For a combinatorialist this is perhaps not the easiest
paper to read, and it is probably a non-trivial task even to locate all three of the
proofs just referred to.) A fourth proof will follow from the next result. The fact
that the zeros are real is not without combinatorial significance. It implies, for
example, that the sequence formed by the numbers $p(G,k)$ ($k = 0,1,\dots$) is
log-concave. (This was noted by Heilmann and Lieb.) Another consequence is that,
in many cases of interest, the number of edges in a randomly chosen matching
is asymptotically normally distributed. (See Godsil 1981a.)


Theorem 5.11 (Godsil 1981b). Let $G$ be a graph and let $u$ be a vertex in $G$. Let
$T = T(G,u)$ be the tree with the paths in $G$ that start at $u$ as its vertices, and with
two such paths adjacent if and only if one is a maximal subpath of the other. Then

$$\frac{\mu(G \setminus u,x)}{\mu(G,x)} = \frac{\mu(T \setminus u,x)}{\mu(T,x)}.$$

(In the right side of the above identity, $u$ denotes the one-vertex path consisting
of $u$ itself.) As we remarked above, when $H$ is a forest we have $\mu(H,x) = \phi(H,x)$.
So from Theorem 5.11 we deduce that all zeros and poles of the rational function
$\mu(G \setminus u,x)/\mu(G,x)$ are real. A trivial induction argument on the number of
vertices in $G$ now yields the conclusion that all the zeros of $\mu(G,x)$ are real. Another
consequence of Theorem 5.11 is that $\mu(G \setminus u,x)/\mu(G,x)$ is essentially a generating
function for a class of walks in $G$. (This is because the right-hand side can be
written as $\phi(T \setminus u,x)/\phi(T,x)$ and this is "essentially" a generating function, by
Lemma 5.2.)

Another connection between linear algebra and the theory of matchings is
provided by Pfaffians. We discuss this briefly. Let $A = (a_{ij})$ be a skew-symmetric $n \times n$
matrix, i.e., $A^T = -A$. Let $\mathcal{F}(n)$ be the set of permutations $\pi$ of $\{1,\dots,n\}$ such
that all cycles of $\pi$ have even length. (So $\mathcal{F}(n)$ is empty if $n$ is odd.) Then it is
known that

$$\det A = \Bigl( \sum_{\pi \in \mathcal{F}(n)} \operatorname{sig}(\pi)\, \operatorname{wt}(\pi) \Bigr)^{2}. \tag{5.7}$$

Here wt(7) = J[/_, 4; i)” and sig(a) = +1. (The exact definition of sig(a) will not 
be needed.) The sum here is known as the Pfaffian of A. For more informa- 
tion about the Pfaffian, the reader is referred to Godsil (1993, chapter 7), Lovasz 
(1979a), Stembridge (1990), or Northcott (1984). 

Suppose now that we are given a graph $G$, and that we wish to determine
whether it has a perfect matching. This can be done as follows. Let $A = (a_{ij})$ be
a skew-symmetric matrix such that $a_{ij} = 0$ if $i$ and $j$ are not adjacent in $G$ and,
moreover, the numbers $\{a_{ij} : i < j,\ ij \in E(G)\}$ are algebraically independent over
the rationals. Then from (5.7) we see that $\det A$ is non-zero if and only if $G$ has a
perfect matching. This fact, together with Lemma 5.8, was used by Tutte to derive
his characterisation of graphs with no perfect matchings.

Instead of choosing the entries of $A$ to be algebraically independent, we can
also choose them at random. If $\det A \ne 0$ then $G$ must have a perfect matching.
If $\det A = 0$ then we are left uncertain, but by repeating the experiment a number
of times we can reduce the uncertainty to any desired level. This strategy was first
suggested in Edmonds (1967), for bipartite graphs. For an elegant implementation
of this idea and some related background information, see Mulmuley et al. (1987).
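The random strategy just described can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation of any of the cited papers: the entries are drawn uniformly from the non-zero residues modulo a fixed prime (an assumption of this sketch, made to keep the arithmetic exact), and the function names are hypothetical. A non-zero determinant certifies a perfect matching; a zero determinant only suggests that there is none.

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime; working modulo PRIME is this sketch's own choice

def det_mod_p(M, p=PRIME):
    """Determinant of an integer matrix modulo the prime p (Gaussian elimination)."""
    M = [[v % p for v in row] for row in M]
    n, det = len(M), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c]), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det = det * M[c][c] % p
        inv = pow(M[c][c], -1, p)          # modular inverse (Python 3.8+)
        for r in range(c + 1, n):
            f = M[r][c] * inv % p
            if f:
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[c])]
    return det % p

def random_skew_matrix(n, edges, rng, p=PRIME):
    """Skew-symmetric matrix mod p with random non-zero entries on the edges of G."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        r = rng.randrange(1, p)
        A[i][j], A[j][i] = r, p - r        # p - r represents -r modulo p
    return A

def probably_has_perfect_matching(n, edges, trials=3, seed=0):
    """True certifies a perfect matching; False means 'probably none'."""
    rng = random.Random(seed)
    return any(det_mod_p(random_skew_matrix(n, edges, rng)) for _ in range(trials))

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]   # has a perfect matching
star = [(0, 1), (0, 2), (0, 3)]             # K_{1,3}: no perfect matching
print(probably_has_perfect_matching(4, cycle4), probably_has_perfect_matching(4, star))
```

The error is one-sided: for a graph without a perfect matching the determinant vanishes identically, so the answer False is never wrong in that case.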
6. Eigenvalue methods


In this section our study of adjacency matrices is continued, but now our emphasis
will be on their eigenvalues, rather than on walks. We confine ourselves almost
entirely to graphs, which means that our adjacency matrices will be symmetric and
their eigenvalues real. A great deal of effort has been devoted to the study of the
relation between the structure of a graph $G$ and the eigenvalues of $A(G)$. Although
this subject has considerable independent interest, we confine ourselves almost
entirely to its applications. We begin by introducing two fundamental results from
matrix theory, the first of which is a version of the well-known Perron-Frobenius
theorems. (See, e.g., Cvetković et al. 1980, Theorem 0.3.)


Theorem 6.1. Let $G$ be a connected graph. Then the largest eigenvalue $\rho$ of $A(G)$
is simple, and the entries of the corresponding eigenvector are all positive. If $\lambda$ is
any other eigenvalue of $A(G)$ then $\lambda \ge -\rho$, with equality holding if and only if $G$
is bipartite. The largest eigenvalue of any proper subgraph of $G$ is less than $\rho$.


(The most general, and most natural, form of the Perron-Frobenius theorem is
concerned with non-negative matrices; the above version suffices for most of what
we need.) If $G$ has maximum degree $\Delta$ and largest eigenvalue $\rho$ then $\sqrt{\Delta} \le \rho \le \Delta$.
The first inequality holds because the complete bipartite graph $K_{1,\Delta}$ is a subgraph
of $G$ and the second because $G$ can be realised as a subgraph of a $\Delta$-regular graph.
(This also shows that we can have $\rho = \Delta$ if and only if $G$ is regular.)


Theorem 6.2. Let $u$ be a vertex in the graph $G$. Then the eigenvalues of $G \setminus u$
interlace those of $G$ (i.e., between any two eigenvalues of $G \setminus u$ there lies an eigenvalue
of $G$).


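Proof. Assume that $G$ has $n$ vertices and let $A = A(G)$. If $U$ is a subspace of $\mathbb{R}^n$,
define $\lambda_U(A)$ to be the minimum value of $x^TAx$ as $x$ ranges over the unit vectors
in $U$. Denote the $k$th largest eigenvalue of $A$ by $\lambda_k(A)$. It is known that

$$\lambda_k(A) = \max_{\dim(U) = k}\ \min_{x \in U,\ x \ne 0} \frac{x^TAx}{x^Tx}.$$

Let $S$ be an $m \times n$ matrix with orthonormal rows, i.e., $SS^T = I_m$. Then we have

$$\lambda_k(SAS^T) = \max_{\dim(U) = k} \lambda_U(SAS^T) = \max_{\dim(U) = k} \lambda_{S^TU}(A),$$

where $U$ ranges over the $k$-dimensional subspaces of $\mathbb{R}^m$, so that each $S^TU$ is a
$k$-dimensional subspace of $\mathbb{R}^n$; whence it follows that

$$\lambda_k(SAS^T) \le \lambda_k(A). \tag{6.1}$$

Applying the same argument to $-A$, we further deduce that for $k = 0,\dots,m-1$,

$$\lambda_{m-k}(SAS^T) \ge \lambda_{n-k}(A). \tag{6.2}$$

If we now choose $S$ to consist of $n - 1$ rows of the identity matrix $I_n$ then we
obtain the theorem. □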
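A quick numerical illustration of Theorem 6.2 uses paths, since deleting an end vertex of the path on $n$ vertices leaves the path on $n - 1$ vertices. The sketch below assumes the standard closed form $2\cos(k\pi/(n+1))$, $k = 1,\dots,n$, for the eigenvalues of a path; this formula is not derived in the text, and the function names are hypothetical.

```python
import math

def path_spectrum(n):
    """Eigenvalues of the path on n vertices, largest first (standard closed form)."""
    return [2 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]

def interlaces(small, big, tol=1e-12):
    """True if big[k] >= small[k] >= big[k+1] for every k (both lists sorted descending)."""
    return all(big[k] + tol >= small[k] >= big[k + 1] - tol
               for k in range(len(small)))

n = 7
print(interlaces(path_spectrum(n - 1), path_spectrum(n)))
```

The same `interlaces` check applies to any pair of spectra once the eigenvalues have been computed by other means.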
The interlacing property of the eigenvalues of symmetric matrices was first noted
in mechanics, arising in the study of the behaviour of a (mechanical!) system as new
constraints are imposed on its parameters. The proof we have given is based on
Haemers (1979). Haemers has used eqs. (6.1) and (6.2) above to obtain a number
of interesting results in graph theory and design theory. It is worth noting that
there is a connection here to the theory of quotients. Suppose that, in our usual
notation, we have $PA = \Phi P$ where $P$ is the characteristic matrix of an equitable
partition. Choose $\Lambda$ to be the non-negative diagonal matrix such that $\Lambda^2 = PP^T$.
Then $\Lambda^{-1}PA = \Lambda^{-1}\Phi\Lambda \cdot \Lambda^{-1}P$ and so, setting $S = \Lambda^{-1}P$, we
obtain $SAS^T = \Lambda^{-1}\Phi\Lambda$. The rows of $S$ are orthonormal and thus the inequalities
(6.1) and (6.2) follow.

Theorem 6.2 implies that any eigenvalue of $G$ with multiplicity greater than
one must also be an eigenvalue of any vertex-deleted subgraph $G \setminus u$. Another
consequence is that the least eigenvalue of $G \setminus u$ is bounded below by the least
eigenvalue of $G$. Thus, the class of all graphs with least eigenvalue greater than a
fixed number $\alpha$ is closed under the operation of taking induced subgraphs. The study of
these classes turns out to be quite interesting, so we discuss it briefly.

Denote the least eigenvalue of $G$ by $\lambda_{\min}(G)$. Since the eigenvalues of $K_2$ are
$-1$ and $1$, it follows that $\lambda_{\min}(G) \le -1$ for any graph $G$ with at least one edge. The
eigenvalues of $K_{1,2}$ are $-\sqrt{2}$, $0$ and $\sqrt{2}$, whence we deduce that if $G$ is connected
and not complete then $\lambda_{\min}(G) \le -\sqrt{2}$. A more interesting case is the class of
graphs with $\lambda_{\min} \ge -2$. It can be shown that all line graphs have this property,
along with the so-called "generalised line graphs". Considerable effort was devoted
to characterising the remaining graphs in this class before Cameron et al. (1976)
produced a short, ingenious and elegant solution.

Their work was all the more interesting in that it was based on a connection with
the theory of root systems. We outline the way this arises. Let $G$ be a graph with
vertex set $V(G) = \{1,\dots,n\}$ such that $A(G) + 2I$ is a positive semidefinite matrix.
There is a matrix $X$, with linearly independent columns, such that $A(G) + 2I = XX^T$.
Let $x_i$ be the $i$th row of $X$. Then

$$(x_i, x_j) = \begin{cases} 2, & \text{if } i = j; \\ 1, & \text{if } i \sim j; \\ 0, & \text{otherwise.} \end{cases}$$

Let $\mathcal{L}$ be the lattice formed by the set of all integral combinations of the rows
of $X$. If $x$ is a row of $X$ then the mapping

$$a \mapsto a - (a,x)\,x$$

fixes $\mathcal{L}$. (Note that this mapping represents reflection in the hyperplane
perpendicular to $x$.) From this it follows that the vectors $x_i$, for $i$ in $V(G)$, are
a subset of a root system. (For an elementary and pleasant introduction to root
systems, see Grove and Benson 1985.)

It would appear that this topic is far from being exhausted. Neumaier (1979)
showed that, with finitely many exceptions, the strongly regular graphs with
$\lambda_{\min} = -k$ (for some positive integer $k$) belong to one of two infinite families.
(The strongly regular graphs $G$ with $\lambda_{\min}(G)$ not an integer fall into a third
infinite family.) Hoffman (1977) shows that a graph with $\lambda_{\min} > -1 - \sqrt{2}$ and having
"large" valency is a generalised line graph, and consequently has least eigenvalue
at least $-2$. (Here "large" is determined by Ramsey theory, and is thus only
technically finite.) This is an intriguing result.

The eigenvalues of a graph also give information about its chromatic number,
and related quantities. Let $\lambda_{\max}(G)$ denote the largest eigenvalue of $G$. We denote
the chromatic number of $G$ by $\chi(G)$.


Theorem 6.3 (Hoffman 1970). The chromatic number of a graph $G$ is bounded
below by $1 - \lambda_{\max}(G)/\lambda_{\min}(G)$.


Proof. Let $z$ be a unit eigenvector of $G$ with eigenvalue $\lambda_{\max}(G)$. Assume
that $G$ can be properly coloured with $c$ colours. Such a colouring determines a
partition of $V(G)$ with $c$ cells and characteristic matrix $P$. Let $\hat{P}$ be the matrix
constructed from $P$ by replacing the non-zero entry in column $i$ of $P$ by the
corresponding entry of $z$, and then deleting all zero rows. The rows of $\hat{P}$ are not
orthonormal, but there is a unique non-negative diagonal matrix $\Lambda$ such that the
rows of $S := \Lambda\hat{P}$ are. There is also a vector $y$ such that $y^TS = z^T$. Consequently

$$y^TSAS^Ty = z^TAz = \lambda_{\max}(G),$$

which implies that $\lambda_{\max}(G) \le \lambda_{\max}(SAS^T)$. On the other hand, since the rows of $S$
are orthonormal, inequalities (6.1) and (6.2) apply. Thus we deduce that
$\lambda_{\max}(G) = \lambda_{\max}(SAS^T)$ and accordingly that

$$(c - 1)\lambda_{\min}(SAS^T) + \lambda_{\max}(G) = (c - 1)\lambda_{\min}(SAS^T) + \lambda_{\max}(SAS^T).$$

By (6.2), the left-hand side is bounded below by $(c - 1)\lambda_{\min}(G) + \lambda_{\max}(G)$. The
right-hand side is bounded above by $\operatorname{tr} SAS^T$. It is easy to see that the diagonal
entries of $SAS^T$ are all zero, hence the sum of its eigenvalues is zero. This implies
that

$$(c - 1)\lambda_{\min}(G) + \lambda_{\max}(G) \le 0$$

and this yields the theorem. □
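Theorem 6.3 is easy to evaluate once a spectrum is known. The sketch below is illustrative: it assumes the standard closed form $2\cos(2\pi k/n)$ for the eigenvalues of the cycle $C_n$ (not derived in the text) and uses the eigenvalues $-2$, $1$, $3$ of Petersen's graph; the function names are hypothetical.

```python
import math

def cycle_spectrum(n):
    """Eigenvalues of the cycle C_n, largest first (standard closed form)."""
    return sorted((2 * math.cos(2 * math.pi * k / n) for k in range(n)), reverse=True)

def hoffman_bound(spectrum):
    """Lower bound 1 - lambda_max / lambda_min on the chromatic number."""
    return 1 - max(spectrum) / min(spectrum)

print(hoffman_bound(cycle_spectrum(6)))   # even cycle: the bound is 2 up to rounding
print(hoffman_bound(cycle_spectrum(5)))   # odd cycle: the bound exceeds 2
print(hoffman_bound([3, 1, -2]))          # Petersen's graph: 2.5
```

Since the chromatic number is an integer, a bound strictly between 2 and 3 already forces at least three colours for $C_5$ and for Petersen's graph.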


In deriving Theorem 6.3 we did not use the fact that the non-zero entries of
$A$ are all equal to 1; in fact a careful reading will show that we have actually
proved that if $B$ is a symmetric matrix such that $(B)_{ij} = 0$ whenever $i$ and $j$ are
non-adjacent vertices in $G$ then

$$\chi(G) \ge 1 - \lambda_{\max}(B)/\lambda_{\min}(B).$$

If $\lambda_{\min}(B) = -\tau$ then $C := I + \tau^{-1}B$ is a positive semidefinite matrix with diagonal
entries equal to 1 and with $(C)_{ij} = 0$ whenever $i$ and $j$ are distinct non-adjacent
vertices in $G$. This leads us to the following.
Corollary 6.4. Let $G$ be a graph on $n$ vertices and let $\Omega(G)$ be the set of all positive
semidefinite matrices $C$ such that $(C)_{ii} = 1$ for all vertices $i$ of $G$, and $(C)_{ij} = 0$
whenever $i$ and $j$ are distinct non-adjacent vertices. Then

$$\chi(G) \ge \max_{C \in \Omega(G)} \lambda_{\max}(C).$$


The complement of the graph $G$ will be denoted by $\overline{G}$. The quantity

$$\max\{\lambda_{\max}(C) \mid C \in \Omega(\overline{G})\}$$

is usually denoted by $\theta(G)$. Thus Corollary 6.4 asserts that $\chi(\overline{G}) \ge \theta(G)$. Now
suppose that the vertices in the subset $S$ of $V(G)$ induce a complete subgraph of
$G$. Let $C_S$ be the 01-matrix with $ij$-entry equal to 1 when $i$ and $j$ both lie in $S$,
and equal to zero otherwise. Then $C_S \in \Omega(G)$ and $\lambda_{\max}(C_S) = |S|$. This shows that
$\theta(\overline{G}) \ge \alpha(\overline{G})$, or equally that $\theta(G) \ge \alpha(G)$. (Here $\alpha(G)$ is the maximum number
of vertices in an independent set from $G$.)

The quantity $\theta(G)$ was first introduced in Lovász (1979b), where he established
that it provides an upper bound on the "Shannon capacity" of $G$. We discuss this
briefly. If $G$ and $H$ are graphs, let us denote by $G \times H$ their strong product. This
can be defined as the graph with

$$(A(G) + I) \otimes (A(H) + I) - I$$

as its adjacency matrix. (Thus the vertex set of $G \times H$ is the Cartesian product
of $V(G)$ and $V(H)$, and the distinct pairs $(u,v)$ and $(u',v')$ are adjacent if and only if $u$
is equal or adjacent to $u'$ in $G$ and $v$ is equal or adjacent to $v'$ in $H$.) The strong
product of $n$ copies of $G$ will be denoted by $G^n$. It is not hard to show that
$\alpha(G \times H) \ge \alpha(G)\alpha(H)$ and from this one can deduce that the Shannon capacity

$$\Theta(G) := \limsup_{n \to \infty}\, \alpha(G^n)^{1/n}$$

exists. The significance of $\theta(G)$ stems from the facts that it is an upper bound
for $\alpha(G)$, and that it is multiplicative, i.e., $\theta(G \times H) = \theta(G)\theta(H)$. Together these
imply that $\Theta(G) \le \theta(G)$. (For the proof that $\theta(G)$ is multiplicative we refer the
reader to Lovász 1979b.) Note that it is not difficult to verify that $\Omega(\overline{G \times H})$
contains $\Omega(\overline{G}) \otimes \Omega(\overline{H})$, and this implies that $\theta(G \times H) \ge \theta(G)\theta(H)$. It is proved
in Grötschel et al. (1981) that $\theta(G)$ can be computed in polynomial time. Lovász
found a number of different expressions for $\theta(G)$. One of these is, in a sense, dual
to our definition.
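The strong product and the inequality $\alpha(G \times H) \ge \alpha(G)\alpha(H)$ can be explored on the 5-cycle, where the inequality is strict. The sketch below builds the adjacency matrix from the Kronecker-product formula above and checks Shannon's classical independent set $\{(i, 2i \bmod 5)\}$ in $C_5 \times C_5$; that particular set is a well-known example and is not taken from the text, and the function names are hypothetical.

```python
from itertools import combinations

def cycle_adjacency(n):
    return [[int((i - j) % n in (1, n - 1)) for j in range(n)] for i in range(n)]

def strong_product(A, B):
    """Adjacency matrix (A + I) (x) (B + I) - I; the vertex (u, v) has index u * len(B) + v."""
    m, n = len(A), len(B)
    a = [[A[i][j] + (i == j) for j in range(m)] for i in range(m)]
    b = [[B[i][j] + (i == j) for j in range(n)] for i in range(n)]
    K = [[a[u][u2] * b[v][v2] for u2 in range(m) for v2 in range(n)]
         for u in range(m) for v in range(n)]
    for i in range(m * n):
        K[i][i] -= 1
    return K

def is_independent(A, S):
    return all(A[u][v] == 0 for u, v in combinations(S, 2))

def independence_number(A):
    """Brute force; only sensible for very small graphs."""
    n = len(A)
    return max(size for size in range(n + 1)
               if any(is_independent(A, S) for S in combinations(range(n), size)))

C5 = cycle_adjacency(5)
C5xC5 = strong_product(C5, C5)
shannon_set = [5 * i + (2 * i) % 5 for i in range(5)]   # the vertices (i, 2i mod 5)
print(independence_number(C5), is_independent(C5xC5, shannon_set))
```

So $\alpha(C_5)^2 = 4$ while $\alpha(C_5 \times C_5) \ge 5$, which gives $\Theta(C_5) \ge \sqrt{5}$.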


Theorem 6.5 (Lovasz 1979b). For any graph G, let M(G) denote the set of all pos- 
itive semidefinite matrices such that ty B = \ and (B);; = 0 if i and j are distinct 
vertices of G. Then 


é6(G) = min trVB). 
Bed(G) 
Using the theory he developed, Lovász was able to deduce the value of $\Theta(G)$ in
many new cases. (The smallest of these was $C_5$, the cycle on five vertices, while
$\Theta(C_7)$ is still unknown. This gives some idea of the difficulty of this problem.)
Haemers found a simple argument which sometimes provides a better bound on
$\Theta(G)$ than $\theta(G)$ does. He observed that, if $\lambda \ne 0$, then the submatrix of $A(G) + \lambda I$
corresponding to an independent set on $s$ vertices is just $\lambda I_s$. Hence it is
non-singular and so we deduce that

$$\alpha(G) \le \operatorname{rank}(A + \lambda I).$$

From this it can be shown that $\operatorname{rank}(A + \lambda I)$ is an upper bound on $\Theta(G)$. For more
information, and examples where this bound is better than $\theta(G)$, see Haemers
(1981).

Eigenvalue methods have also been applied to graph factorisation problems. 
The next example is possibly the best known of these. 


Lemma 6.6 (Graham and Pollak 1972). The edge set of $K_n$ cannot be partitioned
into fewer than $n - 1$ complete bipartite subgraphs.


Proof. Let $G$ be a graph on $n$ vertices that is the edge-disjoint union of subgraphs
$H_1,\dots,H_r$. Assume that each of these subgraphs $H_i$ is a spanning subgraph of $G$
consisting of a complete bipartite graph, together with some isolated vertices. We
assume without proof the easily established fact that if $H$ is a complete bipartite
graph on $m$ vertices then there is an $(m-1)$-dimensional subspace $U$ of $\mathbb{R}^m$ such that
the inner product $(u, A(H)u)$ is non-negative for all $u$ in $U$. (In fact $U$ is spanned
by the eigenvectors of $A(H)$ with non-negative eigenvalues.) We say that $U$ is
non-negative for $A(H)$. It follows that we can associate to each subgraph $H_i$ an
$(n - 1)$-dimensional subspace $U_i$ of $\mathbb{R}^n$ that is non-negative for $A(H_i)$.

The intersection of the $r$ subspaces $U_1,\dots,U_r$ has dimension at least $n - r$ and
so, if $r \le n - 2$, there is a 2-dimensional subspace $U'$ of $\mathbb{R}^n$ that is non-negative
for $A(G)$. In $U'$ we can find a non-zero vector $z$ orthogonal to the "all-ones"
vector $j$ such that $(z, A(G)z) \ge 0$. Now suppose that $G = K_n$. Then $A(G) = J - I$
and so, if $z$ is a non-zero vector orthogonal to $j$, then $(z, A(G)z) = -(z,z) < 0$.
This shows that $r > n - 2$. □


The argument just used can be rephrased in terms of real quadratic forms, and
in this setting even shorter proofs of Lemma 6.6 can be found. One corollary of the
above proof is that a graph on $n$ vertices with exactly $m$ non-negative eigenvalues
cannot be expressed as the edge-disjoint union of fewer than $n - m$ complete
bipartite graphs. We note another result that can be proved with the method at
hand.


Lemma 6.7 (Schwenk 1983, 1987). The complete graph on 10 vertices cannot be 
expressed as the edge-disjoint union of three copies of Petersen's graph. 


Proof. Assume that we have

$$J_{10} - I_{10} = A + B + C,$$

where $A$, $B$ and $C$ are 01-matrices and $A$ and $B$ are both adjacency matrices of
copies of Petersen's graph. It is known that the eigenvalues of Petersen's graph
are $-2$, $1$ and $3$, and that the eigenvalue 1 has multiplicity five. Let $T$ and $U$ be
the eigenspaces associated to the eigenvalue 1 of $A$ and $B$ respectively. Since $j$ is
an eigenvector with eigenvalue 3 for both $A$ and $B$, it follows that $T$ and $U$ both
lie in the 9-dimensional subspace of $\mathbb{R}^{10}$ formed by the vectors orthogonal to $j$.
Consequently, being 5-dimensional, they must have a non-zero vector $z$ in common.
Then $(J - I)z = -z$ and so $Cz = (-3)z$. Thus $C$ has $-3$
as an eigenvalue, and so cannot be the adjacency matrix of (a copy of) Petersen's
graph. □
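The spectral facts about Petersen's graph used in this proof can be confirmed with integer arithmetic alone. The sketch below uses one common labelling of the graph (outer 5-cycle, inner pentagram, five spokes), which is a choice made here for illustration; it checks that $(A - 3I)(A - I)(A + 2I) = 0$ and then recovers the multiplicities from the trace, using the fact that the eigenvalue 3 is simple by Theorem 6.1. Function names are hypothetical.

```python
def petersen_adjacency():
    """Petersen's graph: outer cycle 0..4, inner pentagram 5..9, spokes i -- i+5."""
    A = [[0] * 10 for _ in range(10)]
    for i in range(5):
        for u, v in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)):
            A[u][v] = A[v][u] = 1
    return A

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def shifted(A, c):
    """The matrix A + cI."""
    return [[A[i][j] + (c if i == j else 0) for j in range(len(A))] for i in range(len(A))]

A = petersen_adjacency()
product = matmul(matmul(shifted(A, -3), shifted(A, -1)), shifted(A, 2))
annihilated = all(v == 0 for row in product for v in row)

# Eigenvalue 3 is simple, so the multiplicities a (of 1) and b (of -2) satisfy
# a + b = 9 and 3 + a - 2b = trace(A) = 0.
trace = sum(A[i][i] for i in range(10))
mult_of_one = next(a for a in range(10) if 3 + a - 2 * (9 - a) == trace)
print(annihilated, mult_of_one, 9 - mult_of_one)
```

Two 5-dimensional eigenspaces inside a 9-dimensional space must intersect non-trivially, which is the dimension count the proof relies on.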


Note that the matrix C must be the adjacency matrix of a cubic graph and that, 
by Theorem 6.1, a cubic graph with least eigenvalue equal to —3 is bipartite. Thus 
the above method is providing more information than is contained in the statement 
of the lemma, and it also can easily be applied to other situations. It could, for 
example, be used to study the possibility of partitioning the edges of $K_n$ into three
copies of some given strongly regular graph (on $n$ vertices).

Mohar (1992) develops a relation between graph eigenvalues and Hamiltonicity.
One consequence of this theory is a proof that the Petersen graph does not contain
a Hamilton cycle. There is an amusing direct proof of this using interlacing, which
we now describe. Suppose by way of contradiction that there was a Hamilton
cycle in the Petersen graph $P$. Then the line graph $L(P)$ of the Petersen graph
would contain an induced copy of $C_{10}$, and so, by interlacing, $\theta_i(C_{10}) \le \theta_i(L(P))$
for $i = 1,\dots,10$, where $\theta_i$ denotes the $i$th largest eigenvalue. But in fact
$\theta_7(C_{10}) > \theta_7(L(P))$, so the Hamilton cycle cannot exist.
(This argument fails to prove that the Coxeter graph has no Hamilton cycle; it
would be very interesting to find an extension of this argument which would work
for the Coxeter graph.)
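The eigenvalue comparison behind this argument can be reproduced numerically. The sketch below is a rough illustration: it uses a plain cyclic Jacobi iteration for symmetric matrices (chosen here only to avoid external libraries), one common labelling of the Petersen graph, and hypothetical function names. It compares the seventh largest eigenvalues of $C_{10}$ and of the line graph of the Petersen graph.

```python
import math

def jacobi_eigenvalues(A, sweeps=100, tol=1e-18):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [[float(v) for v in row] for row in A]
    n = len(a)
    for _ in range(sweeps):
        if sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n)) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                if a[p][p] == a[q][q]:
                    t = math.pi / 4
                else:
                    t = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(t), math.sin(t)
                for k in range(n):      # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):      # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Petersen graph: outer cycle 0..4, inner pentagram 5..9, spokes i -- i+5.
edges = [e for i in range(5)
         for e in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5))]
line_graph = [[int(i != j and bool(set(e) & set(f))) for j, f in enumerate(edges)]
              for i, e in enumerate(edges)]
cycle10 = [[int((i - j) % 10 in (1, 9)) for j in range(10)] for i in range(10)]

lp = jacobi_eigenvalues(line_graph)
c10 = jacobi_eigenvalues(cycle10)
print(round(c10[6], 3), round(lp[6], 3))   # seventh largest eigenvalue of each graph
```

Interlacing would require the first printed value to be at most the second; the computation shows the opposite, in agreement with the argument above.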

Our next topic is the connection between graph eigenvalues and connectivity. 
For this it is sometimes convenient to use modified forms of adjacency matrices. 
We discuss them briefly. 

If $G$ is a graph on $n$ vertices, let $\Delta = \Delta(G)$ be the $n \times n$ diagonal matrix with $\Delta_{ii}$
equal to the valency of the $i$th vertex of $G$. The incidence matrix $B = B(G)$ of
$G$ is the 01-matrix with rows indexed by the vertices of $G$, columns by the edges
and with $(B)_{ij}$ equal to 1 if and only if vertex $i$ is in edge $j$. Then we have

$$BB^T = \Delta(G) + A(G), \qquad B^TB = 2I + A(L(G)),$$

where $L(G)$ denotes the line graph of $G$. (Remark: since $B^TB$ is positive
semidefinite, it follows that $\lambda_{\min}(L(G)) \ge -2$, as we mentioned in the discussion following
Theorem 6.2.)
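Both identities are easy to confirm on a small example. The following sketch builds the incidence matrix, the valency matrix and the line graph for a hypothetical 4-vertex graph and compares the two sides entry by entry; the function names are illustrative.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def incidence_identities_hold(n, edges):
    """Check B B^T = Delta + A and B^T B = 2I + A(L(G)) for a simple graph."""
    m = len(edges)
    B = [[int(v in e) for e in edges] for v in range(n)]          # vertex-edge incidence
    A = [[int((u, v) in edges or (v, u) in edges) for v in range(n)] for u in range(n)]
    Delta = [[sum(A[u]) if u == v else 0 for v in range(n)] for u in range(n)]
    L = [[int(i != j and bool(set(edges[i]) & set(edges[j]))) for j in range(m)]
         for i in range(m)]                                        # line graph adjacency
    first = matmul(B, transpose(B)) == [[Delta[u][v] + A[u][v] for v in range(n)]
                                        for u in range(n)]
    second = matmul(transpose(B), B) == [[2 * (i == j) + L[i][j] for j in range(m)]
                                         for i in range(m)]
    return first, second

# Hypothetical example: a triangle with a pendant edge.
print(incidence_identities_hold(4, [(0, 1), (0, 2), (1, 2), (2, 3)]))
```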

An orientation of $G$ can be defined to be a function $\sigma$ on $V \times V$ such that
$\sigma(u,v) = -\sigma(v,u)$, and is zero if $u$ and $v$ are not adjacent. If $\sigma(u,v) = 1$ we call
$v$ the head and $u$ the tail of the edge $\{u,v\}$. The pair $(G,\sigma)$ is an oriented graph.
The incidence matrix $B^\sigma$ of $(G,\sigma)$ is defined by

$$(B^\sigma)_{x,e} = \begin{cases} 1, & \text{if $x$ is the head of $e$;} \\ -1, & \text{if $x$ is the tail of $e$;} \\ 0, & \text{otherwise.} \end{cases}$$

The pertinent property of $B^\sigma$ is that

$$B^\sigma (B^\sigma)^T = \Delta(G) - A(G). \tag{6.3}$$

Much of our notational effort has gone to waste, since the right-hand side of (6.3)
is clearly independent of the orientation $\sigma$. We do deduce, however, that $\Delta - A$
is a positive semidefinite matrix. The multiplicity of 0 as an eigenvalue of $\Delta - A$
is equal to the dimension of the null-space of $(B^\sigma)^T$. This in turn is known to equal
$c$, where $c$ is the number of connected components of $G$. (One reference for
the unproved assertions here is Biggs 1993.) (If $G$ is bipartite then $\Delta - A$ and
$\Delta + A$ are similar matrices. I know of no reference for this. However, in this case it
is easy enough to find an orientation $\sigma$ and a diagonal matrix $\Lambda$, with diagonal
entries equal to $\pm 1$, such that $B^\sigma = \Lambda B$. Then $\Delta - A = \Lambda(\Delta + A)\Lambda$ and, since
$\Lambda = \Lambda^{-1}$, this proves the claim.)

Let $\lambda_2(G)$ denote the second smallest of the $n$ eigenvalues of $\Delta - A$. From our
remarks above we see that $\lambda_2(G) \ne 0$ if and only if $G$ is connected. A study of
the relation between $\lambda_2$ and connectivity has been made by Fiedler (1973). We
observe that if we delete the first row and column from $\Delta - A$ we obtain a matrix,
$D$ say, differing from $\Delta(G \setminus 1) - A(G \setminus 1)$ by the addition of some non-negative
terms, each at most 1, to its diagonal. From this it can be deduced that the $i$th
eigenvalue of $D$ exceeds the $i$th eigenvalue of $\Delta(G \setminus 1) - A(G \setminus 1)$ by at most 1.
Since the eigenvalues of $D$ interlace those of $\Delta - A$, we conclude that
$\lambda_2(G) \le \lambda_2(G \setminus 1) + 1$.
This implies, as noted by Fiedler, that $\lambda_2(G)$ is a lower bound on the
vertex-connectivity of $G$ when $G$ is not complete. In fact it can be argued that it is
more natural here to consider edge-deleted subgraphs, rather than vertex-deleted
subgraphs of $G$. For if $e \in E(G)$ and $H := G \setminus e$ then the difference between
$\Delta(G) - A(G)$ and $\Delta(H) - A(H)$ is a matrix with rank one. This implies that the
eigenvalues of $\Delta(H) - A(H)$ interlace those of $\Delta(G) - A(G)$.

If $X \subseteq V(G)$, let $\partial X$ denote the set of edges of $G$ with one end in $X$ and
the other not in $X$. We have the following.


Lemma 6.8. Let $G$ be a graph with $n$ vertices and let $X$ be a subset of $V(G)$. Then

$$|\partial X| \ge \lambda_2(G)\,|X|\,|V \setminus X|/n.$$

Proof. Let $j$ be the vector with all entries equal to 1. Since the rows and columns
of $\Delta - A$ all sum to 0, we always have $(\Delta - A)j = 0$. This implies that

$$\lambda_2(G) = \min\{(z, (\Delta - A)z) \mid (z,j) = 0,\ \|z\| = 1\}.$$

We also have

$$(z, (\Delta - A)z) = \sum_{ij \in E(G)} (z_i - z_j)^2.$$

Now define $z$ by setting $z_i$ equal to $\alpha$ when $i \in X$, and to $\beta$ otherwise. Choose
$\alpha$ and $\beta$ so that $(z,j) = 0$ and $\|z\| = 1$. Then $(z, (\Delta - A)z) = |\partial X|(\alpha - \beta)^2$. The
two conditions on $\alpha$ and $\beta$ give $(\alpha - \beta)^2 = n/(|X|\,|V \setminus X|)$, and we arrive at the
statement of the lemma. □
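Lemma 6.8 can be checked exhaustively on small graphs. The sketch below assumes two standard spectra that are not derived in the text: the second smallest eigenvalue of $\Delta - A$ is $2 - 2\cos(2\pi/n)$ for the cycle $C_n$ and $n$ for the complete graph $K_n$. The function names are hypothetical.

```python
import math
from itertools import combinations

def boundary_size(edges, X):
    """Number of edges with exactly one end in X."""
    X = set(X)
    return sum(1 for u, v in edges if (u in X) != (v in X))

def lemma_6_8_holds(n, edges, lam2, tol=1e-9):
    """Check |boundary(X)| >= lam2 |X| |V - X| / n for every non-empty proper subset X."""
    return all(boundary_size(edges, X) >= lam2 * size * (n - size) / n - tol
               for size in range(1, n)
               for X in combinations(range(n), size))

n = 8
cycle_edges = [(i, (i + 1) % n) for i in range(n)]
complete_edges = list(combinations(range(5), 2))
print(lemma_6_8_holds(n, cycle_edges, 2 - 2 * math.cos(2 * math.pi / n)))
print(lemma_6_8_holds(5, complete_edges, 5))   # equality holds for every X in K_5
```

For the complete graph the bound is attained by every subset, so the inequality cannot be improved in general.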


A more general result, using the same basic approach of "guessing" a trial
eigenvector $z$ for $\lambda_2$, can be found in Alon and Milman (1985, Lemma 2.1). Their work
is devoted to a study of “expanders”. We will not discuss these further, but instead 
refer the reader to chapter 32. This subject is perhaps the most important recent 
application of graph eigenvalues to combinatorics. 
7. Appendix: Random walks, eigenvalues, and resistance (L. Lovász)


The results of sections 5 and 6 concerning the walk generating functions of graphs 
are closely related to random walks on graphs and to the theory of finite Markov 
chains, and also to the electrical resistance of the graph. For more on this topic, 
see Doyle and Snell (1984), and Lovász (1979a, second edition, chapter 11).

Let $G$ be a $d$-regular connected graph on $n$ vertices with adjacency matrix $A$.
(Most of the results below extend to non-regular graphs, but the formulations are
much simpler for regular graphs. We can reduce the general case to this by adding
a sufficient number of loops at each vertex; here, a loop adds only 1 to the degree.)

Consider a random walk on $G$: starting at a node $v_0$, at each step we are at a
vertex $v_t$, and move to each neighbor with probability $1/d$. Thus $v_t$ is the random vertex
we are at after $t$ steps. Clearly, the sequence of random vertices $(v_t : t = 0,1,\dots)$ is
a symmetric Markov chain, and $P = d^{-1}A$ is the matrix of transition probabilities.
(In fact, every symmetric Markov chain can be viewed as random walk on a graph, 
if we allow weighted edges. Most facts mentioned below extend quite naturally to 
all symmetric Markov chains; many extend even to non-symmetric ones.) 

Random walks arise in many models in mathematics and physics. For example, 
consider the shuffling of a deck of cards. Construct a graph whose vertices are all 
permutations of the deck, and two of them are adjacent if they come by one shuffle 
move, depending on how we shuffle. Then repeated shuffle moves correspond 
to a random walk on this graph (see Diaconis 1988). Many models in statistical 
mechanics can be viewed as a random walk on the set of states. 

Random walks have important algorithmic applications. They can be used to 
reach “obscure” parts of large sets, and also to generate random elements in large 
and complicated sets, such as the set of lattice points in a convex body, elements of 
finite groups (see chapter 27) or the set of perfect matchings in a graph (which, in
turn, can be used for the asymptotic enumeration of these objects). See Aleliunas
et al. (1979), Jerrum and Sinclair (1989), Dyer et al. (1989) for some of these
applications.


The probability $p_{ij}^{(t)}$ that, starting at $i$, we reach $j$ in $t$ steps is the $ij$-entry of $P^t$.
We define the probability generating function for the random walks on $G$ to be

$$P(G,x) = \sum_{t=0}^{\infty} x^t P^t = (I - xP)^{-1}. \tag{7.1}$$


This is of the same form as the walk generating functions studied earlier, and one 
can apply much of the theory described in the last two sections. 

Since $P$ is symmetric, its eigenvalues are real. A trivial eigenvalue of $P$ is 1, with
the corresponding eigenvector $(1,\dots,1)^T$. It follows from the Frobenius-Perron
theory that this eigenvalue is simple and that $P$ has spectral radius 1. The value
$-1$ is an eigenvalue of $P$ iff $G$ is bipartite.

Let $1 = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$ be the eigenvalues of $P$ (these are just the eigenvalues of
$A$ divided by $d$), and let $v_1,\dots,v_n$ be corresponding pairwise orthogonal eigenvectors
(normed to unit length). Let $v_k = (v_{k1},\dots,v_{kn})^T$. Clearly we can take $v_{1i} = 1/\sqrt{n}$.

Expressing $P$ in terms of its eigenvectors, we get

$$P = \sum_{k=1}^{n} \lambda_k v_k v_k^T$$

and hence

$$p_{ij}^{(t)} = \sum_{k=1}^{n} \lambda_k^t v_{ki} v_{kj} = \frac{1}{n} + \sum_{k=2}^{n} \lambda_k^t v_{ki} v_{kj}. \tag{7.2}$$


We shall see how this basic formula can be applied in the analysis of random walks; 
but first let us introduce some parameters that are significant in the algorithmic 
applications mentioned above. 

(a) The mean access time $\tau_{ij}$ is the expected number of steps required to reach a
vertex $j$, starting from a vertex $i$. The sum $\gamma_{ij} = \tau_{ij} + \tau_{ji}$ is called the mean commute
time.

(b) The mean cover time is the expected number of steps to reach every vertex 
(starting at the vertex for which this is maximum). 

(c) The mixing rate is a measure of how fast the random walk converges to its
limiting distribution. (How long should we shuffle a pack of cards?) This can be
defined as follows. If the graph is non-bipartite, then $p_{ij}^{(t)} \to 1/n$ as $t \to \infty$, and the
mixing rate is

$$\mu = \limsup_{t \to \infty} \max_{i,j} \left| p_{ij}^{(t)} - \frac{1}{n} \right|^{1/t}.$$


(For a bipartite graph with bipartition $\{V_1,V_2\}$, the distribution of $v_t$ oscillates
between "almost uniform on $V_1$" and "almost uniform on $V_2$". The results for
bipartite graphs are similar, just a bit more complicated to state, so we ignore this
case.)

We have to walk about $(\log n)(1 - \mu)^{-1}$ steps before the distribution of $v_t$ will
be close to uniform. The surprising fact, allowing the algorithmic applications mentioned
above, is that this number may be much less than the number of nodes; for
an expander, for example, this takes only $O(\log n)$ steps.

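An algebraic formula for the mixing rate is easily obtained. Let $\lambda = \max\{|\lambda_2|, |\lambda_n|\}$;
then it follows by (7.2) that

$$\left| p_{ij}^{(t)} - \frac{1}{n} \right| = \left| \sum_{k=2}^{n} \lambda_k^t v_{ki} v_{kj} \right| \le \lambda^t.$$

So $\mu \le \lambda$; it is not difficult to argue that equality must hold here.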


Theorem 7.1. The mixing rate of a random walk on a non-bipartite graph $G$ is
$\max\{|\lambda_2|, |\lambda_n|\}$.
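As a numerical illustration of Theorem 7.1, the following Python sketch computes $\max_{i,j} |p_{ij}^{(t)} - 1/n|$ exactly for one small example. The choice of the Petersen graph and the helper names (`petersen`, `transition_matrix`, `max_deviation`) are assumptions made for illustration only. The Petersen graph is 3-regular and non-bipartite, and its adjacency eigenvalues are known to be 3, 1 and -2, so $P$ has eigenvalues 1, 1/3 and -2/3 and the mixing rate should be 2/3.

```python
from fractions import Fraction


def petersen():
    """Adjacency lists of the Petersen graph (outer 5-cycle, spokes, inner pentagram)."""
    adj = [[] for _ in range(10)]
    for i in range(5):
        for u, v in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)):
            adj[u].append(v)
            adj[v].append(u)
    return adj


def transition_matrix(adj):
    """Transition matrix P of the simple random walk, with exact rational entries."""
    n = len(adj)
    return [[Fraction(adj[i].count(j), len(adj[i])) for j in range(n)]
            for i in range(n)]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_deviation(p, t):
    """max over i, j of |p_ij^(t) - 1/n|, computed from the t-th power of P."""
    n = len(p)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(t):
        power = mat_mul(power, p)
    return max(abs(power[i][j] - Fraction(1, n))
               for i in range(n) for j in range(n))


P = transition_matrix(petersen())
lam = Fraction(2, 3)  # max(|lambda_2|, |lambda_n|) for the Petersen graph
for t in (1, 2, 5, 10, 20):
    dev = max_deviation(P, t)
    # columns: t, deviation, lambda^t, t-th root of the deviation
    print(t, float(dev), float(lam ** t), float(dev) ** (1 / t))
```

The deviation stays below $\lambda^t$, and its $t$-th root drifts towards 2/3 as $t$ grows, in line with the definition of the mixing rate as a limsup.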


Lemma 6.8 has established a connection between the second-largest eigenvalue
of $A$ (equivalently, of $P$) and a certain edge-connectivity property of the
graph. We define the conductance $\Phi = \Phi(G)$ of the graph $G$ as the minimum of
$n|\partial X|/(d|X||V \setminus X|)$ over all non-empty sets $X \subset V$. Combining Lemma 6.8 with
results of Jerrum and Sinclair (1989) we obtain the following (cf. Alon 1986, Diaconis
and Stroock 1991, and also chapter 32, Theorems 3.1 and 3.2).


Theorem 7.2. $\Phi^2/4 \le 1 - \lambda_2 \le \Phi$.


Corollary 7.3.

$$\left| p_{ij}^{(t)} - \frac{1}{n} \right| \le \left( 1 - \frac{\Phi^2}{4} \right)^{t}.$$


The mean access time and the mean commute time can be estimated by elementary
means (but, as we shall see later, eigenvalues provide more powerful
formulas). We remark first that in a very long random walk, every vertex is visited
on the average in every $n$th step and every edge is traversed in each direction on
the average in every $(2m)$th step, where $m$ is the number of edges. (This second
assertion remains valid also for random walks over non-regular graphs.) Hence it
follows that if we start from node $i$, and $j$ is an adjacent node, then within $2m$
steps we can expect to pass through the edge $ji$; hence the mean commute time
for two adjacent nodes is bounded by $2m$. It follows that the mean commute time
between two nodes at distance $r$ is at most $2mr < n^3$. A similar bound follows for
the mean cover time.

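Let $q_{ij}^{(t)}$ denote the probability that the random walk starting at $i$ hits vertex $j$
the first time in the $t$th step. Then we have the following identity by easy case
distinction:

$$p_{ij}^{(t)} = \sum_{s=0}^{t} q_{ij}^{(s)} p_{jj}^{(t-s)}.$$

Hence we get for the generating functions $f_{ij}(x) = \sum_{t=0}^{\infty} p_{ij}^{(t)} x^t$ and $g_{ij}(x) = \sum_{t=0}^{\infty} q_{ij}^{(t)} x^t$ that

$$f_{ij}(x) = g_{ij}(x) f_{jj}(x).$$

Now here

$$f_{ij}(x) = \sum_{t=0}^{\infty} \sum_{k=1}^{n} \lambda_k^t v_{ki} v_{kj} x^t = \sum_{k=1}^{n} \frac{v_{ki} v_{kj}}{1 - \lambda_k x},$$

and so

$$g_{ij}(x) = \left( \sum_{k=1}^{n} \frac{v_{ki} v_{kj}}{1 - \lambda_k x} \right) \Big/ \left( \sum_{k=1}^{n} \frac{v_{kj}^2}{1 - \lambda_k x} \right).$$

Now $\tau_{ij} = g_{ij}'(1)$; from this explicit formula we get the following.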


Theorem 7.4. The mean access time is given by

$$\tau_{ij} = n \sum_{k=2}^{n} \frac{v_{kj}^2 - v_{ki} v_{kj}}{1 - \lambda_k}.$$

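Since the vectors $u_i = (v_{ki})_{k=1}^{n}$ are mutually orthogonal unit vectors, we can
derive the following bound on the mean commute time between any pair of nodes:

$$\gamma_{ij} = n \sum_{k=2}^{n} \frac{(v_{ki} - v_{kj})^2}{1 - \lambda_k} \le \frac{n}{1 - \lambda_2} \sum_{k=2}^{n} (v_{ki} - v_{kj})^2 = \frac{n}{1 - \lambda_2} (u_i - u_j)^2 = \frac{2n}{1 - \lambda_2}.$$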
Using Theorem 7.2, we get

$$\gamma_{ij} \le \frac{8n}{\Phi^2},$$

which is better than the elementary bound if, e.g., the graph is an expander: in
this case we obtain that $\gamma_{ij} = O(n)$. It also follows from Corollary 7.5 that the
mean commute time between any two vertices of any regular graph on $n$ nodes
is at least $n$, so this is best possible for expanders. The best known bound for the
mean commute time in a general regular graph is $O(n^2)$, which follows from the
analogous bound for the mean cover time below (see Brightwell and Winkler 1990
for the best possible bound in the case of general graphs).

No eigenvalue formula for the mean cover time is known, but a rather good
bound follows by elementary probability theory (Matthews 1988).

Proposition 7.6. The mean cover time of a random walk on a graph with $n$ vertices
is at most $(1 + \frac{1}{2} + \cdots + \frac{1}{n})$ times the maximum of the mean access times between
all pairs of vertices.
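The bound of Proposition 7.6 can be checked exactly on very small graphs. The sketch below is an illustration only: the function names (`solve`, `access_time`, `cover_time`, `matthews_check`) and the brute-force method over all sets of visited vertices are assumptions of this example, not a method taken from the text, and the running time is exponential in the number of vertices.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square linear system a x = b exactly (Gauss-Jordan over the rationals)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


def access_time(adj, i, j):
    """Expected number of steps to reach j from i (0 if i == j)."""
    n = len(adj)
    a, b = [], []
    for v in range(n):
        row = [Fraction(0)] * n
        row[v] = Fraction(1)
        if v == j:
            b.append(0)                      # h(j) = 0
        else:
            for w in adj[v]:
                row[w] -= Fraction(1, len(adj[v]))
            b.append(1)                      # h(v) = 1 + average of h over neighbours
        a.append(row)
    return solve(a, b)[i]


def cover_time(adj, start):
    """Exact expected number of steps needed to visit every vertex from `start`."""
    n = len(adj)
    full = (1 << n) - 1
    expect = {(full, v): Fraction(0) for v in range(n)}
    # Handle visited-sets from large to small, so that larger sets are already solved.
    for mask in sorted(range(1, full), key=lambda s: -bin(s).count("1")):
        verts = [v for v in range(n) if mask >> v & 1]
        index = {v: k for k, v in enumerate(verts)}
        a, b = [], []
        for v in verts:
            row = [Fraction(0)] * len(verts)
            row[index[v]] += 1
            rhs = Fraction(1)
            step = Fraction(1, len(adj[v]))
            for w in adj[v]:
                if w in index:
                    row[index[w]] -= step                    # stay inside the visited set
                else:
                    rhs += step * expect[mask | 1 << w, w]   # a new vertex is visited
            a.append(row)
            b.append(rhs)
        for v, value in zip(verts, solve(a, b)):
            expect[mask, v] = value
    return expect[1 << start, start]


def matthews_check(adj):
    """Return (worst-case mean cover time, bound of Proposition 7.6)."""
    n = len(adj)
    cover = max(cover_time(adj, s) for s in range(n))
    worst_access = max(access_time(adj, i, j) for i in range(n) for j in range(n))
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return cover, harmonic * worst_access


cycle5 = [[(v - 1) % 5, (v + 1) % 5] for v in range(5)]
print(matthews_check(cycle5))
```

For larger graphs one would estimate the cover time by simulation instead; the exact computation is used here only so that the comparison with the bound involves no sampling error.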


The mean cover time of a regular graph is $O(n^2)$ (Kahn et al. 1989; this issue of
J. Theoret. Probab. contains many other interesting papers on this problem). This
gives a surprisingly narrow range for cover times. It is conjectured that the graph
with smallest cover time is the complete graph (whose cover time is $\approx n \log n$).
This was recently proved in the asymptotic sense by Feige (1993).

There is an interesting connection between random walks on graphs and elec- 
trical networks. We may consider a graph G on n vertices as an electrical network, 
every edge corresponding to unit resistance. The network has some resistance $R_{ij}$
between any pair of vertices. A whole book has been written on this connection 
(Doyle and Snell 1984, see also Spitzer 1976); here we only formulate one surpris- 
ing identity (Nash-Williams 1959, Chandra et al. 1989): 


Theorem 7.7. The mean commute time between vertices $i$ and $j$ is $nd R_{ij}$.


The proof (which is only sketched) is connected to yet another interesting notion. 
We call a function $\phi: V(G) \to \mathbb{R}$ harmonic with poles $s$ and $t$ if

$$\sum_{i \in N(j)} \phi(i) = d\,\phi(j)$$

for every $j \ne s,t$. It is easy to see that if we normalize so that $\phi(s) = 1$ and
$\phi(t) = 0$, then the harmonic function with given poles is uniquely determined.

There are (at least) two rather natural ways to construct such harmonic functions. 

(1) Consider the graph as an electrical network as above. Give voltage 1 to $s$
and voltage 0 to $t$. Then the voltage $\phi(i)$ of vertex $i$ defines a harmonic function.

(2) Let $\phi(i)$ denote the probability that a random walk starting at $i$ hits $s$ before
it hits $t$. It is trivial that this defines a harmonic function.

Now the resistance $R_{st}$ is $1/(\text{total current}) = 1/\sum_{i \in N(t)} \phi(i)$. On the other hand,
consider a very long random walk, with $K$ steps, say. This hits $t$ about $K/n$ times.
Call a hit interesting if after it the random walk hits $s$ before it hits $t$ again. Between
two interesting hits, the average number of steps is $\gamma_{st}$. Now the probability that
a given hit is interesting is $\frac{1}{d} \sum_{i \in N(t)} \phi(i)$, by interpretation (2) of the harmonic
function. Hence the number of interesting hits is about $\frac{1}{d} \sum_{i \in N(t)} \phi(i) \cdot (K/n)$, and
so the average number of steps between them is $nd / \sum_{i \in N(t)} \phi(i) = nd R_{st}$.
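The identity of Theorem 7.7 lends itself to an exact check on a small regular graph. The following sketch is an illustration under stated assumptions: the helper names (`solve`, `access_time`, `resistance`) and the choice of the 6-cycle are not from the text. Access times are obtained from the linear equations $h(v) = 1 + \frac{1}{d}\sum_{w \in N(v)} h(w)$ with $h(j) = 0$, and the resistance from the harmonic function of construction (1).

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square linear system a x = b exactly (Gauss-Jordan over the rationals)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


def access_time(adj, i, j):
    """Mean access time from i to j."""
    n = len(adj)
    a, b = [], []
    for v in range(n):
        row = [Fraction(0)] * n
        row[v] = Fraction(1)
        if v == j:
            b.append(0)                      # h(j) = 0
        else:
            for w in adj[v]:
                row[w] -= Fraction(1, len(adj[v]))
            b.append(1)                      # h(v) = 1 + average of h over neighbours
        a.append(row)
    return solve(a, b)[i]


def resistance(adj, s, t):
    """Effective resistance between distinct s and t, unit resistance on every edge."""
    n = len(adj)
    a, b = [], []
    for v in range(n):
        row = [Fraction(0)] * n
        if v == s or v == t:
            row[v] = Fraction(1)
            b.append(1 if v == s else 0)     # voltages phi(s) = 1, phi(t) = 0
        else:
            row[v] = Fraction(len(adj[v]))
            for w in adj[v]:
                row[w] -= 1                  # d * phi(v) = sum of phi over neighbours
            b.append(0)
        a.append(row)
    phi = solve(a, b)
    current = sum(phi[w] for w in adj[t])    # total current flowing into t
    return 1 / current


n, d = 6, 2
cycle6 = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
commute = access_time(cycle6, 0, 2) + access_time(cycle6, 2, 0)
print(commute, n * d * resistance(cycle6, 0, 2))
```

On the 6-cycle the two printed values agree; the same comparison can be run for any pair of vertices of any small connected regular graph given as adjacency lists.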
1. Introduction

Tools from many areas of mathematics are standard in certain branches of combinatorics and are 
described in detail in some of the chapters of this Handbook. Examples are the use of linear and 
multilinear algebra in the theory of designs and in extremal set theory, the use of finite groups in 
coding theory, application of representation theory of the symmetric group for deriving 
combinatorial identities, and application of probability theory for obtaining asymptotic existence 
proofs of combinatorial structures; the use of convexity and linear programming in combinatorial 
optimization, and the use of topological methods in the study of posets, convex polytopes and 
various extremal problems. 

The objective of this chapter is to survey some sporadic results from several areas of mathematics 
which were used successfully in solving certain combinatorial problems. It is believed that these 
results will soon be integrated into the mathematical machinery commonly used in combinatorics. 
We fully realize the arbitrariness of any such selection and do not claim that these are the most 
important examples that could be listed. We have, however, no doubt that they merit to be 
mentioned in this chapter. 

The combinatorial applications described here apply various tools from several areas of 
mathematics. It is natural to wonder whether the use of all these powerful tools is necessary. After 
all, it is reasonable to believe that combinatorial statements can be proved using combinatorial 
arguments. Pure combinatorial proofs are desirable, since they might shed more light on the 
corresponding problems. 

No such combinatorial proofs are known for any of the main results discussed in this chapter. It 
would be nice to try and obtain such proofs. 

One of the major consumers of powerful mathematical tools in combinatorics is the area of explicit 
constructions. The existence of many combinatorial structures with certain properties can be 
established using the probabilistic method. It is natural to ask for an explicit description of such a 
structure. Such a construction is particularly valuable when the required structure is needed for 
solving a certain algorithmic problem. In sections 2-4 we describe several mathematical tools used 
for explicit constructions. These include the use of group theory for constructing graphs without 
short cycles, the use of the theory of representations of Lie groups for constructing expanders, and 
the application of certain results from analytic number theory for constructing pseudo random 
tournaments. We note that an exact definition of the notion “explicit” (or “uniform”) construction 
can be given, but we prefer its intuitive meaning here. In sections 5-9 we survey some 
combinatorial applications of results from other mathematical areas, including real and complex 
algebraic geometry, algebraic and analytic number theory and hyperbolic geometry. 

Most of the mathematical background for the material in this chapter can be found in the books 
and surveys given in the following list: section 2: Magnus et al. (1966); section 3: Newman (1972); 
section 4: Schmidt (1976); section 5: Hocking and Young (1961); section 6: Borevich and 
Shafarevich (1966); section 7: Redei (1973), Van der Waerden (1931); section 8: Coxeter (1956); 
section 9: Stanley (1983).

2. Group theory and graphs with large girth


The girth of a graph $G$ is the length of the shortest cycle in $G$. If $G = (V,E)$ is a
$d$-regular graph with $n$ nodes and girth $g > 2k$, then

$$d \cdot (1 + (d-1) + \cdots + (d-1)^{k-1}) < n,$$

since the left-hand side of the last inequality is precisely the number of nodes
within distance $k$ from a given node $v$ of $G$. Therefore,

$$g < 2 + 2 \log n / \log(d-1).$$
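The upper bound on the girth can be compared with actual girths of small regular graphs. The sketch below is illustrative only; the breadth-first-search routine `girth`, the helper `girth_upper_bound`, and the two test graphs (the Petersen graph and the 3-dimensional cube) are choices made for this example and do not come from the text.

```python
import math
from collections import deque


def girth(adj):
    """Length of a shortest cycle of a simple undirected graph (math.inf for a forest)."""
    best = math.inf
    for root in range(len(adj)):
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # a non-tree edge closes a walk through the root of this length
                    best = min(best, dist[u] + dist[w] + 1)
    return best


def girth_upper_bound(n, d):
    """The bound 2 + 2 log n / log(d - 1) for d-regular graphs on n nodes, d >= 3."""
    return 2 + 2 * math.log(n) / math.log(d - 1)


petersen = [[] for _ in range(10)]
for i in range(5):
    for u, v in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)):
        petersen[u].append(v)
        petersen[v].append(u)

cube = [[v ^ (1 << b) for b in range(3)] for v in range(8)]

for name, adj in (("Petersen", petersen), ("cube", cube)):
    print(name, girth(adj), girth_upper_bound(len(adj), 3))
```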


Thus, for any fixed $d \ge 3$, the girth of a family of $d$-regular graphs can grow at
most at the rate of the logarithm of the number of nodes. Erdős, Sachs, Sauer
and Walther (cf. Bollobás (1978), pp. 103-110) proved the existence of $d$-regular
graphs with girth $g$ and $n$ nodes, where

$$g > \log n / \log(d-1). \tag{2.1}$$


Although their proof does supply a polynomial time algorithm for constructing 
such graphs, their graphs are not really explicit in the sense that it is not clear how 
to decide efficiently if two vertices of such a graph are adjacent, given their names. 

It seems more difficult to construct explicitly, for some fixed $d \ge 3$, a family of
d-regular graphs whose girth grows at the rate of the logarithm of the number of 
nodes. Such a construction was first given by Margulis (1982) who used Cayley 
graphs of factor groups of free subgroups of the modular group. His construction, 
together with some related results, is outlined below. 


2.1. Cayley graphs 


Let $H$ be a finite group with a generating set $\delta$ satisfying $\delta = \delta^{-1}$, $1 \notin \delta$. The
Cayley graph $G = G(H,\delta)$ is a graph whose nodes are the elements of $H$ in which
$u$ and $v$ are adjacent iff $u = sv$ for some $s \in \delta$. Clearly, $G$ is $|\delta|$-regular and a
cycle in G corresponds to a reduced word in the generators which represents the 
identity of H. Cayley graphs are fairly obvious candidates for regular graphs with 
large girth, since it is not too difficult to see that for every d and g there exists 
a d-regular Cayley graph with girth at least g. This is equivalent to the group 
theoretical property of residual finiteness and is proved as follows (see, e.g., Biggs 
1989). 

Let $T$ be a finite $d$-regular tree of radius $r$ with center $w$, whose edges are
properly $d$-colored. Define $d$ permutations $\pi_1,\ldots,\pi_d$ on the nodes of $T$ by putting
$\pi_i(u) = v$ if $\{u,v\}$ is an edge of $T$ colored $i$, and $\pi_i(u) = u$ if $u$ is a leaf of $T$ and the
color $i$ is not represented at $u$. Clearly $\pi_1,\ldots,\pi_d$ are involutions and they generate
a group of permutations $H$. Put $\delta = \{\pi_1,\ldots,\pi_d\}$ and consider the Cayley graph
$G = G(H,\delta)$. Consider the effect of a reduced word in the $\pi_i$ on the central node
$w$. Initially, each element of the first $r$ elements of the word moves $w$ one step
towards the boundary. To return the image of $w$ to itself another $r+1$ elements
are required. Hence, the girth of $G$ is at least $2r+1$. We note that a more careful
analysis will show that the girth of $G$ is, in fact, at least $4r+2$.

The last construction is explicit but gives a much weaker lower bound for g 
than the one given in (2.1). More efficient solutions can be obtained using familiar 
groups.


2.2. The construction of Margulis 


For a commutative ring $K$ with identity, let $SL(2,K)$ denote the group of all two-by-two
matrices over $K$ with determinant 1. Consider the integral matrices

$$A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$$

and put $\delta = \{A, B, A^{-1}, B^{-1}\}$. For a prime $p$, let $f_p$ be the natural homomorphism
of $SL(2,Z)$ onto $SL(2,Z_p)$ given by $f_p((a_{ij})) = (a_{ij} \pmod p)$. Put $A_p = f_p(A)$, $B_p = f_p(B)$
and let $G_p$ be the Cayley graph $G(SL(2,Z_p), f_p(\delta))$.


Theorem 2.1 (Margulis 1982). $G_p$ has $n_p = p(p^2 - 1)$ nodes and is 4-regular. Its
girth $g_p$ is at least $2 \log_\alpha(p/2) - 1$, where $\alpha = 1 + \sqrt{2}$. Hence $g_p \ge 0.83 \log n_p / \log 3 - 3$.


Proof. The first statement is trivial. To bound $g = g_p$, we estimate $k$, defined as
the largest integer such that any two distinct paths in $G_p$ of lengths $< k$ starting
at the identity matrix $I$ end at different vertices. Clearly $g_p \ge 2k - 1$. By the
maximality of $k$, there are two distinct paths
$P = (p_0, p_1,\ldots,p_r)$ and $Q = (q_0, q_1,\ldots,q_l)$ with $\max(r,l) = k$, starting at $p_0 = q_0 = I$ and ending at
$p_r = q_l$. Let $V = (v_1,\ldots,v_r)$ and $W = (w_1,\ldots,w_l)$ be the corresponding reduced
words over $\delta_p = f_p(\delta)$. Clearly $v_1 \cdots v_r = w_1 \cdots w_l$. Define $\tilde{v}_i$ to be $A$ if $v_i = A_p$, $B$ if $v_i = B_p$,
$A^{-1}$ if $v_i = A_p^{-1}$, and $B^{-1}$ if $v_i = B_p^{-1}$, and define $\tilde{w}_i$ analogously. The crucial fact
(cf., e.g., Magnus et al. 1966) is that $\{A,B\}$ generate a free subgroup of $SL(2,Z)$.
Thus $\tilde{v}_1 \cdots \tilde{v}_r \ne \tilde{w}_1 \cdots \tilde{w}_l$ and since $f_p(\tilde{v}_1 \cdots \tilde{v}_r) = f_p(\tilde{w}_1 \cdots \tilde{w}_l)$ we conclude that all
elements of the non-zero matrix $\tilde{v}_1 \cdots \tilde{v}_r - \tilde{w}_1 \cdots \tilde{w}_l$ are divisible by $p$ and hence its
norm is at least $p$. Here the norm $\|L\|$ of a matrix $L$ is $\sup_{x \ne 0} \|Lx\|/\|x\|$. This implies
that $\max(\|\tilde{v}_1 \cdots \tilde{v}_r\|, \|\tilde{w}_1 \cdots \tilde{w}_l\|) \ge p/2$, and since the norms of $A, B, A^{-1}, B^{-1}$
are all $\alpha = 1 + \sqrt{2}$, as is easily checked, we conclude that $\alpha^k \ge \alpha^{\max(r,l)} \ge p/2$ and
$k \ge \log_\alpha(p/2)$, so that $g_p \ge 2k - 1 \ge 2\log_\alpha(p/2) - 1$, as needed.
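For small primes the graph $G_p$ can be built and its girth measured directly. The following sketch is illustrative; the function names and the tuple encoding `(a, b, c, d)` of a matrix with rows $(a, b)$ and $(c, d)$ are assumptions of this example. Because a Cayley graph is vertex-transitive, a single breadth-first search from the identity suffices to find the girth.

```python
import math
from collections import deque


def mat_mul(x, y, p):
    """Product of two 2x2 matrices modulo p, each stored as (a, b, c, d)."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)


def margulis_graph_girth(p):
    """Return (number of nodes, girth) of G_p for an odd prime p."""
    # A, A^{-1}, B, B^{-1} reduced modulo p
    gens = [(1, 2, 0, 1), (1, p - 2, 0, 1), (1, 0, 2, 1), (1, 0, p - 2, 1)]
    identity = (1, 0, 0, 1)
    dist = {identity: 0}
    parent = {identity: None}
    queue = deque([identity])
    best = math.inf
    while queue:
        u = queue.popleft()
        for s in gens:
            w = mat_mul(s, u, p)             # neighbours of u are the products s * u
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
            elif parent[u] != w:
                best = min(best, dist[u] + dist[w] + 1)
    return len(dist), best


alpha = 1 + math.sqrt(2)
for p in (5, 7, 11, 13):
    n, g = margulis_graph_girth(p)
    # columns: p, number of nodes, girth, lower bound of Theorem 2.1
    print(p, n, g, 2 * math.log(p / 2, alpha) - 1)
```

Since $A_p$ has order $p$, the powers of $A_p$ form a cycle of length $p$, so the measured girth always lies between the bound of Theorem 2.1 and $p$.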


2.3. Other constructions 


Modifying the methods of Margulis, Imrich (1984) constructed, for every integer
$d \ge 3$, infinitely many $d$-regular Cayley graphs $G$ whose girth $g(G)$ and number of
nodes $n(G)$ satisfy

$$g(G) > 0.48 \log n(G) / \log(d-1) - 2.$$

For $d = 3$ he obtained

$$g(G) > 0.9602 \log n(G) / \log 2 - 5,$$

which is very marginally worse than the bound given in (2.1), produced by non-explicit
methods.

The best estimate, for infinitely many degrees, was finally obtained by explicit 
constructions. Using certain algebraic computations in an appropriate algebra of 
quaternions, Weiss (1984), showed that the members of a certain family of bipartite 
cubic graphs, explicitly constructed by Biggs and Hoare (1983), have very large 
girth. The order $n$ and the girth $g$ of each of these graphs satisfy

$$g \ge \tfrac{4}{3} \log n / \log 2 - 4.$$


More generally, Margulis (1984) and, independently, Lubotzky et al. (1986, 1988),
constructed, for any prime $p \equiv 1 \pmod 4$, a family of $d = (p+1)$-regular graphs $G$
with

$$g(G) \ge \tfrac{4}{3} \log n(G) / \log(d-1) - \log 4 / \log(d-1).$$


Their graphs are Cayley graphs of factor groups of the modular groups, and
they have several other interesting properties. These properties are summarized
in Theorem 3.4 in the next section. Both constructions improve the bound given
in (2.1) and supply one of the rare examples in which an explicit construction
improves a non-explicit one.


3. Expanders and superconcentrators 


One of the best examples of the use of powerful mathematical tools for explicit 
constructions is the construction of expanders. For our purposes here, we call a 
graph G = (V, E) an (n,d,c)-expander if it has n nodes, the maximum degree of a 
node is $d$, and for every set of nodes $W \subset V$ of cardinality $|W| \le n/2$, the inequality
$|N(W)| \ge c \cdot |W|$ holds, where $N(W)$ denotes the set of all nodes in $V \setminus W$ adjacent
to some node in $W$. We note that the common definition of an expander is slightly
different (see, e.g., Gabber and Galil 1981), but the difference is not essential. A
family of linear expanders of density $d$ and expansion $c$ is a set $\{G_i\}_{i=1}^{\infty}$, where $G_i$
is an $(n_i,d,c)$-expander, $n_i \to \infty$ and $n_{i+1}/n_i \to 1$ as $i \to \infty$.

Such a family is the main component of the parallel sorting network of Ajtai 
et al. (1983), and in the construction of certain fault tolerant linear arrays. It also 
forms the basic building block used in the construction of graphs with special 
connectivity properties and small number of edges (see, e.g., Chung 1978). 

An example of a graph of this type is an $n$-superconcentrator which is a directed
acyclic graph with $n$ inputs and $n$ outputs such that for every $1 \le r \le n$ and every
two sets $A$ of $r$ inputs and $B$ of $r$ outputs there are $r$ vertex disjoint paths from the
vertices of $A$ to the vertices of $B$. A family of linear superconcentrators of density
$d$ is a set $\{G_n\}_{n=1}^{\infty}$, where $G_n$ is an $n$-superconcentrator with $\le (d + o(1))n$ edges.
Superconcentrators, which are the subject of an extensive literature, are relevant
to computer science in several ways. They have been used in the construction
of graphs that are hard to pebble (see, e.g., Paul et al. 1977), in the study of
lower bounds (Valiant 1976), and in the establishment of time space tradeoffs for
computing various functions (see, e.g., Tompa 1980).

It is not too difficult to prove the existence of a family of linear expanders (and 
hence a family of linear superconcentrators) using probabilistic arguments. This 
was first done by Pinsker (1973) (see also Pippenger 1977 and Chung 1978). How- 
ever, for applications, an explicit construction is desirable. Such a construction is 
much more difficult and was first given in the elegant paper of Margulis (1973) who 
used, surprisingly, some results of Kazhdan on group representations, to construct 
explicitly a family of linear expanders of density 5 and expansion c, for some c > 0. 
An outline of his method is given below. However, Margulis was not able to bound 
c strictly away from 0. Gabber and Galil (1981) modified Margulis’ construction 
and were able to give, using Fourier analysis, an effective estimate for c. Better 
expanders were found later, by several authors, using various methods that are 
discussed briefly at the end of this section. 


3.1. Eigenvalues and expanders 


There is a close correspondence between the expansion properties of a graph and 
the eigenvalues of a certain matrix associated with it. Specifically, let G = (V, E) 
be a graph and let $A_G = (a_{uv})_{u,v \in V}$ be its adjacency matrix given by $a_{uv} = 1$ if
$uv \in E$ and $a_{uv} = 0$ otherwise. Put $Q_G = \mathrm{diag}(\deg(v))_{v \in V} - A_G$, where $\deg(v)$ is
the degree of the node $v \in V$, and let $\lambda(G)$ be the second smallest eigenvalue of
$Q_G$. The following simple result is proved in Alon and Milman (1984, 1985). The
proof uses elementary linear algebra (Rayleigh’s principle). Similar results appear 
in Tanner (1984), Jimbo and Maruoka (1985) and Buck (1986). 


Theorem 3.1. If $G$ is a graph with $n$ nodes, maximum degree $d$ and $\lambda = \lambda(G)$, then
$G$ is an $(n,d,c)$-expander, where $c = 2\lambda/(d + 2\lambda)$.


Therefore, if A(G) is large then G is a good expander. The converse is also true, 
though less obvious, and is given in the following result which is in some sense the 
discrete analogue of a theorem of Cheeger on Riemannian manifolds. 


Theorem 3.2 (Alon 1986a). If $G$ is an $(n,d,c)$-expander, then $\lambda(G) \ge c^2/(4 + 2c^2)$.


These two theorems supply an efficient algorithm to approximate the expanding 
properties of a graph and show that it is enough to estimate A(G) in order to get 
bounds on the expansion coefficient of G. 
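Both inequalities can be tried out on a family whose spectrum is known in closed form. The sketch below uses the cycle $C_n$, for which the second smallest eigenvalue of $Q_G$ is $2 - 2\cos(2\pi/n)$ (a standard fact, not taken from the text), and finds the best expansion constant by brute force over all sets $W$ with $|W| \le n/2$. The function names are hypothetical, and the enumeration is only feasible for small $n$.

```python
import math
from itertools import combinations


def cycle(n):
    """Adjacency lists of the cycle on n nodes."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]


def expansion(adj):
    """Largest c with |N(W)| >= c|W| for every non-empty W with |W| <= n/2."""
    n = len(adj)
    best = math.inf
    for size in range(1, n // 2 + 1):
        for w in combinations(range(n), size):
            inside = set(w)
            boundary = {u for v in w for u in adj[v]} - inside
            best = min(best, len(boundary) / size)
    return best


def cycle_lambda(n):
    """Second smallest eigenvalue of Q_G for the cycle C_n (known closed form)."""
    return 2 - 2 * math.cos(2 * math.pi / n)


d = 2
for n in (6, 8, 10, 12):
    c = expansion(cycle(n))
    lam = cycle_lambda(n)
    # Theorem 3.1 guarantees expansion at least 2*lam/(d + 2*lam);
    # Theorem 3.2 guarantees lam at least c^2/(4 + 2*c^2).
    print(n, c, 2 * lam / (d + 2 * lam), lam, c * c / (4 + 2 * c * c))
```

Cycles are poor expanders, so both guarantees hold with plenty of room here; the point of the example is only to make the two quantities concrete.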


3.2. Constructing expanders using Kazhdan’s property (T) 


Definition. A discrete group $H$ has property (T) if for every set $S$ of generators
of $H$ there exists an $\varepsilon > 0$ such that for every unitary representation $\pi$ of $H$ in
$V = V_\pi$ that does not contain the trivial representation, and for every unit vector
$y \in V$, there exists an $s \in S$ such that $|(\pi(s)y, y)| \le 1 - \varepsilon$.


Kazhdan (1967) defined property (T) for the more general class of locally compact
groups. For our purposes the definition for discrete groups suffices.

Margulis, in a beautiful paper, used some of Kazhdan's results on property (T) to
construct a family of linear expanders. A somewhat simpler proof for the expansion
properties of graphs constructed by a slightly more general construction is given in
Alon and Milman (1985). We outline this construction below. Recall the definition,
given in section 2, of a Cayley graph $G = G(H,\delta)$, where $H$ is a finite group and
$\delta$ is a set of generators of $H$, $\delta = \delta^{-1}$, $1 \notin \delta$.

For $n \ge 3$, let $SL(n,Z)$ denote the group of all $n$ by $n$ matrices over the integers
$Z$ with determinant 1. There is a well-known explicit set $B_n$ of two generators of
$SL(n,Z)$ (see, e.g., Newman 1972). Put $S_n = B_n \cup B_n^{-1}$ ($|S_n| = 4$). Let $SL(n,Z_i)$ be
the group of all $n$ by $n$ matrices over the ring of integers modulo $i$ with determinant
1, and let $\phi_i^{(n)}: SL(n,Z) \to SL(n,Z_i)$ be the group homomorphism defined by
$\phi_i^{(n)}((a_{rs})) = (a_{rs} \pmod i)$. Also define $G_i^{(n)} = G(SL(n,Z_i), \phi_i^{(n)}(S_n))$.


Theorem 3.3 (Kazhdan 1967). For each $n \ge 3$, $SL(n,Z)$ has property (T).


It is not too difficult to check that the adjacency matrix $A_i^{(n)}$ of $G_i^{(n)}$ is
$\sum_{s \in S_n} \pi \circ \phi_i^{(n)}(s)$, where $\pi$ is the left regular representation of $SL(n,Z_i)$. By
Rayleigh's principle, $\lambda(G_i^{(n)})$ is precisely the minimum of $|S_n| - (A_i^{(n)} y, y)$, where
$y$ ranges over all unit vectors in $W$, which is the space of all vectors whose coordinates
sum to zero. Combining these two facts with Theorem 3.3 and the fact
that $\pi \circ \phi_i^{(n)}$ is a unitary representation of $SL(n,Z)$ in $W$ that does not contain
the trivial representation, we conclude that for every fixed $n \ge 3$ there is an $\varepsilon > 0$
such that $\lambda(G_i^{(n)}) \ge \varepsilon$ for every $i$. Hence $\{G_i^{(n)}\}_{i=1}^{\infty}$ is a family of linear expanders
of density 4.


3.3. Improved constructions 


Various authors have modified and improved Margulis’ first construction. Angluin 
(1979) showed how to construct a family of linear expanders of density 3. Gabber 
and Galil were the first to obtain a family of linear expanders with an effective 
estimate on their expansion coefficient. This enabled them to construct superconcentrators
of density 271.8. Other constructions appeared in Schmidt (1980), Alon
and Milman (1984, 1985), Jimbo and Maruoka (1985) and Buck (1986). The Jimbo- 
Maruoka method uses only elementary but rather complicated tools from linear 
algebra. The other authors apply either results from group representations or from 
harmonic analysis. Some of these constructions supplied better superconcentrators 
of densities 261.5 (Chung 1978), 218 (Jimbo and Maruoka 1985), and 122.7 (Alon 
et al. 1987). 

More recently, Lubotzky et al. (1986, 1988) and independently Margulis (1988),
applied some results of Eichler and Igusa on the Ramanujan conjectures and constructed,
for every prime $p \equiv 1 \pmod 4$, an infinite family of $d = (p+1)$-regular
graphs $G_i$, with $\lambda(G_i) \ge d - 2\sqrt{d-1}$. It is not difficult to show (see Alon 1986a or
Lubotzky et al. 1988), that this is best possible. Let us describe some of these strong 
expanders, called Ramanujan Graphs, summarize their properties and discuss, very 
briefly, their connection to the Ramanujan conjectures.

Let $p$ and $q$ be unequal primes, both congruent to 1 modulo 4. As usual, denote
by $PGL(2,Z_q)$ the factor group of the group of all two by two invertible matrices
over $GF(q)$ modulo its normal subgroup consisting of all scalar matrices. Similarly,
denote by $PSL(2,Z_q)$ the factor group of the group of all two by two matrices
over $GF(q)$ with determinant 1 modulo its normal subgroup consisting of the two
scalar matrices $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$. The graphs we describe are $(p+1)$-regular Cayley
graphs of $PSL(2,Z_q)$ in case $p$ is a quadratic residue modulo $q$ and of $PGL(2,Z_q)$
in case $p$ is a quadratic nonresidue. A well-known theorem of Jacobi asserts that
the number of ways of representing a positive integer $n$ as a sum of 4 squares is

$$r_4(n) = 8 \sum_{d \mid n,\; 4 \nmid d} d.$$

This easily implies that there are precisely $p+1$ vectors $a = (a_0, a_1, a_2, a_3)$, where
$a_0$ is an odd positive integer, $a_1, a_2, a_3$ are even integers and $a_0^2 + a_1^2 + a_2^2 + a_3^2 = p$.
Associate each such vector with the matrix $\gamma_a$ in $PGL(2,Z_q)$ where

$$\gamma_a = \begin{pmatrix} a_0 + i a_1 & a_2 + i a_3 \\ -a_2 + i a_3 & a_0 - i a_1 \end{pmatrix},$$


and $i$ is an integer satisfying $i^2 \equiv -1 \pmod q$. If $p$ is a quadratic residue modulo $q$,
all these matrices lie in the index two subgroup $PSL(2,Z_q)$ of $PGL(2,Z_q)$. In this
case, let $G^{p,q}$ denote the Cayley graph of $PSL(2,Z_q)$ with respect to these $p+1$
matrices. If $p$ is a quadratic non-residue modulo $q$, let $G^{p,q}$ denote the Cayley
graph of $PGL(2,Z_q)$ with respect to the above matrices. The properties of the
graphs $G^{p,q}$ are summarized in the following theorem. A detailed proof appears
in Lubotzky et al. (1988).
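The count of $p+1$ generator vectors, and Jacobi's formula behind it, can be confirmed by direct enumeration for small values. The sketch below is an illustration; the function names `generator_vectors`, `r4_brute` and `r4_jacobi` are hypothetical, and the brute-force loops are only meant for small arguments.

```python
import math


def generator_vectors(p):
    """All (a0, a1, a2, a3) with a0 odd and positive, a1, a2, a3 even, squares summing to p."""
    bound = math.isqrt(p)
    evens = [a for a in range(-bound, bound + 1) if a % 2 == 0]
    return [(a0, a1, a2, a3)
            for a0 in range(1, bound + 1, 2)
            for a1 in evens for a2 in evens for a3 in evens
            if a0 * a0 + a1 * a1 + a2 * a2 + a3 * a3 == p]


def r4_brute(n):
    """Number of integer solutions of w^2 + x^2 + y^2 + z^2 = n, by enumeration."""
    bound = math.isqrt(n)
    count = 0
    for w in range(-bound, bound + 1):
        for x in range(-bound, bound + 1):
            for y in range(-bound, bound + 1):
                rest = n - w * w - x * x - y * y
                if rest == 0:
                    count += 1               # z = 0
                elif rest > 0 and math.isqrt(rest) ** 2 == rest:
                    count += 2               # z = +sqrt(rest) or -sqrt(rest)
    return count


def r4_jacobi(n):
    """Jacobi's formula: 8 times the sum of the divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


for p in (5, 13, 17, 29):
    print(p, len(generator_vectors(p)), r4_brute(p), r4_jacobi(p))
```

For a prime $p \equiv 1 \pmod 4$ every representation has exactly one odd coordinate, so fixing its position and sign divides $r_4(p) = 8(p+1)$ by 8, which is where the count $p+1$ comes from.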


Theorem 3.4. (i) If $p$ is a quadratic non-residue modulo $q$ then $G^{p,q}$ is a bipartite $d = (p+1)$-regular
graph with $n = q(q^2 - 1)$ nodes. Its girth is at least $4\log_p q - \log_p 4$ and its
diameter is at most $2\log_p n + 2\log_p 2 + 1$. The eigenvalues of the adjacency matrix
of $G^{p,q}$, besides $(p+1)$ and $-(p+1)$, are all in absolute value at most $2\sqrt{p}$. In
particular

$$\lambda(G^{p,q}) \ge p + 1 - 2\sqrt{p} = d - 2\sqrt{d-1}.$$

(ii) If $p$ is a quadratic residue modulo $q$, then $G^{p,q}$ is a $d = (p+1)$-regular graph
on $n = q(q^2 - 1)/2$ nodes. Its girth is at least $2\log_p q$ and its diameter is at most
$2\log_p n + 2\log_p 2 + 1$. The maximum independent set of nodes of $G^{p,q}$ is of size at
most $(2\sqrt{p}/(p + 1 + 2\sqrt{p}))n$ and its chromatic number is at least $1 + (p+1)/(2\sqrt{p})$.
Each eigenvalue of the adjacency matrix of $G^{p,q}$, besides $p+1$, is, in absolute value,
at most $2\sqrt{p}$. Hence $\lambda(G^{p,q}) \ge p + 1 - 2\sqrt{p} = d - 2\sqrt{d-1}$.


Most of the properties of the graphs $G^{p,q}$ stated above are consequences of
their spectral properties, i.e., the bound on the absolute values of their eigenvalues.
These bounds are obtained by applying results of Eichler and Igusa concerning
the Ramanujan conjectures (see Ramanujan 1916). Eichler's proof makes
use of Weil's "Riemann hypothesis for curves" mentioned in the next section.
These results supply a good approximation for the number of ways a positive integer
can be represented as a sum of four squares of a certain type. Specifically,
let $r_q(n)$ denote the number of integral solutions of the quadratic equation
$x_1^2 + 4q^2 x_2^2 + 4q^2 x_3^2 + 4q^2 x_4^2 = n$. Jacobi's theorem mentioned above determines
$r_1(n)$ precisely. For general $q$ and for $n = p^k$, $k \ge 0$, there is no precise formula
but the Ramanujan conjecture (which is known in this case by Eichler and Igusa's
results) states that for every $\varepsilon > 0$ as $k$ tends to infinity,

$$r_q(p^k) = C(p^k) + O_\varepsilon\bigl(p^{k(1/2 + \varepsilon)}\bigr), \tag{3.1}$$

where $C(p^k)$, which is the main term, has an explicit known formula. In order to
establish the spectral properties of the graph $G^{p,q}$, one obtains an expression for
$r_q(p^k)$ in terms of the eigenvalues of $G^{p,q}$. Comparing this expression with (3.1),
the desired bounds for the eigenvalues follow. The details appear in Lubotzky et
al. (1988).

The Ramanujan expanders are useful in constructing efficient sorting and fault- 
tolerant networks. In particular, they supply superconcentrators of density 58. Al- 
though this is much better than all the previous constructions it is still worse than 
the best non-constructive bound, due to Bassalygo (1981) who showed, using prob- 
abilistic arguments, that there are superconcentrators of density 36. 


4. Character sums and pseudo-random graphs 


4.1. Weil theorem and character sums 


Let f(x,y) be a polynomial of total degree d over the finite field GF(q), with 
N zeros (x,y) in GF(q) x GF(q). Suppose f(x,y) is absolutely irreducible, i.e., 
irreducible over every algebraic extension of GF(q). The famous theorem of Weil 
(1948), known as the Riemann hypothesis for curves over finite fields, states that 


$$|N - q| \le 2g\sqrt{q} + c_1(d),$$

where $g \le \binom{d-1}{2}$ is the "genus" of the curve $f(x,y) = 0$, and $c_1(d)$ depends only on $d$.

This highly nontrivial theorem, which was already mentioned in the previous sec- 
tion while briefly discussing the Ramanujan conjecture, is one of the fundamental 
results in modern number theory. Weil’s original proof relied heavily on several 
ideas from algebraic geometry. Twenty years later, Stepanov found a more elemen- 
tary proof, related to methods in diophantine approximation, for several special 
cases. His method was extended by Bombieri and Schmidt who finally obtained an 
elementary (but complicated) proof for the general result. A detailed presentation 
of several results related to Weil’s theorem, using the Stepanov method, appeared 
in Schmidt (1976). Weil’s theorem implies several sharp estimates for character 
sums. For our purposes here we state one, whose proof by the Stepanov method 
can be found in Schmidt (1976, see Theorem 2.C’, p. 43). 


Theorem 4.1. Let $\chi$ be a multiplicative character of order $m > 1$ of $GF(q)$, and
suppose $f(x)$ has $d$ distinct zeros in the algebraic closure of $GF(q)$ and is not an
$m$th power. Then

$$\left| \sum_{x \in GF(q)} \chi(f(x)) \right| \le (d-1) q^{1/2}.$$


Graham and Spencer (1971) applied this theorem to establish a pseudo-random 
property of a properly defined tournament. This is described below. 
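For the quadratic character ($m = 2$) over a prime field, the estimate of Theorem 4.1 is easy to observe numerically. The sketch below is illustrative only: it restricts to prime $q$ (not general prime powers), evaluates the character by Euler's criterion, and takes $f$ to be a product of distinct linear factors, which is never a square. The names `chi` and `character_sum` are hypothetical.

```python
import math


def chi(y, q):
    """Quadratic character of the prime field GF(q), q an odd prime."""
    y %= q
    if y == 0:
        return 0
    # Euler's criterion: y is a square modulo q iff y^((q-1)/2) = 1 (mod q)
    return 1 if pow(y, (q - 1) // 2, q) == 1 else -1


def character_sum(roots, q):
    """Sum of chi(f(x)) over x in GF(q), where f(x) is the product of (x - r), r in roots."""
    total = 0
    for x in range(q):
        value = 1
        for r in roots:
            value = value * (x - r) % q
        total += chi(value, q)
    return total


roots = (0, 1, 2)          # d = 3 distinct zeros
for q in (5, 7, 11, 13, 101, 103):
    s = character_sum(roots, q)
    # columns: q, character sum, bound (d - 1) * sqrt(q)
    print(q, s, (len(roots) - 1) * math.sqrt(q))
```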


4.2. Schütte's problem


A tournament $T_n$ on $n$ nodes is an orientation of the complete graph on $n$ nodes.
For two nodes $x,y$ of $T_n$, we say that $x$ dominates $y$ if the edge between $x$ and $y$
is directed from $x$ to $y$. K. Schütte asked in 1962 whether for every $k > 0$ there
is a tournament $T = T_{n(k)}$ such that for every set $S$ of $k$ nodes of $T$ there is a
node $y$ which dominates all elements of $S$. Erdős (1963) showed, by probabilistic
arguments, that for each $k$ such a $T_{n(k)}$, with $O(k^2 \cdot 2^k)$ nodes, exists. Graham and
Spencer (1971) gave an explicit construction of such a tournament with $O(k^2 2^{2k})$
nodes. In fact, their construction was not new; these tournaments, known as the 
quadratic residue or Paley tournaments, were studied before. The novelty was the 


application of Theorem 4.1 that showed that these tournaments have the desired 
properties. 


4.3. The construction 


Let $q$ be an odd prime power congruent to 3 modulo 4. Let $T_q$ be a tournament
whose nodes are the elements of the finite field $GF(q)$, and an edge is directed
from $x$ to $y$ if and only if $x - y$ is a square in $GF(q)$. Since $-1$ is not a square, $T_q$
is a well-defined tournament.


Theorem 4.2. If $q > k^2 \cdot 2^{2k-2}$ then for every set $S$ of $k$ nodes of $T_q$ there is a node
$y$ which dominates all elements of $S$.


Proof. Let $A = \{a_1, a_2, \ldots, a_k\}$ be a set of k distinct elements of GF(q). Let $\chi$ be the quadratic character on GF(q), i.e., for $y \in GF(q)$, $\chi(y) = 1$ if y is a nonzero square in GF(q), $\chi(y) = -1$ if y is a nonsquare, and $\chi(0) = 0$. We must show that there is a $y \in GF(q) \setminus A$ such that $\chi(y - a_j) = 1$ for $1 \le j \le k$. Define

\[ g(A) = \sum_{y \in GF(q) \setminus A} \; \prod_{j=1}^{k} \bigl(1 + \chi(y - a_j)\bigr). \]

Clearly, it is enough to show that g(A) > 0, since in this case there is a $y = y_0 \notin A$ such that $\prod_{j=1}^{k} (1 + \chi(y_0 - a_j)) > 0$. To show that g(A) > 0, define h(A) by

\[ h(A) = \sum_{y \in GF(q)} \; \prod_{j=1}^{k} \bigl(1 + \chi(y - a_j)\bigr) \]

and notice that

\[ g(A) = h(A) - \sum_{i=1}^{k} \prod_{j=1}^{k} \bigl(1 + \chi(a_i - a_j)\bigr). \qquad (4.1) \]

Expanding the expression for h(A) we obtain

\[ h(A) = \sum_{y \in GF(q)} 1 + \sum_{y \in GF(q)} \sum_{j=1}^{k} \chi(y - a_j) + \cdots + \sum_{y \in GF(q)} \; \sum_{1 \le j_1 < \cdots < j_s \le k} \chi(y - a_{j_1}) \cdots \chi(y - a_{j_s}) + \cdots . \]

The first two terms here are q and 0, respectively. By Theorem 4.1, for every $2 \le s \le k$ and every $j_1 < \cdots < j_s$,

\[ \Bigl| \sum_{y \in GF(q)} \chi(y - a_{j_1}) \cdots \chi(y - a_{j_s}) \Bigr| \le (s-1)\, q^{1/2}, \]

and hence

\[ |h(A) - q| \le q^{1/2} \sum_{s=2}^{k} \binom{k}{s} (s-1) = q^{1/2} \bigl( (k-2)\, 2^{k-1} + 1 \bigr). \]

Thus

\[ h(A) \ge q - \bigl( (k-2)\, 2^{k-1} + 1 \bigr)\, q^{1/2}. \]

Using (4.1) one can easily check that $h(A) - g(A) \le 2^{k-1}$ and thus, if $q > k^2 2^{2k-2}$, g(A) > 0. This completes the proof.
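The statement of Theorem 4.2 can be checked by brute force for small parameters. The following is a minimal sketch, assuming q is a prime (so that GF(q) is just arithmetic modulo q; prime powers are not handled); the function names are illustrative and not taken from the source.

```python
from itertools import combinations

def paley_tournament(q):
    """Return a predicate dominates(x, y) for the Paley tournament on Z_q.

    Assumption: q is a prime with q % 4 == 3 (primality is not checked).
    """
    if q % 4 != 3:
        raise ValueError("q must be congruent to 3 modulo 4")
    squares = {(x * x) % q for x in range(1, q)}  # nonzero squares mod q

    def dominates(x, y):
        # The edge is directed from x to y iff x - y is a nonzero square.
        return (x - y) % q in squares

    return dominates

def has_schutte_property(q, k):
    """True iff every k-set of nodes is dominated by some node outside it."""
    dominates = paley_tournament(q)
    for S in combinations(range(q), k):
        if not any(all(dominates(y, a) for a in S)
                   for y in range(q) if y not in S):
            return False
    return True
```

For k = 2 the theorem requires q > 16, so `has_schutte_property(19, 2)` is guaranteed to hold; the search is exponential in k and is meant only as an illustration.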


4.4. The pseudo-random properties of $T_q$


An easy variation of the last proof shows that the tournament $T_q$ constructed above has the following property: For every two disjoint sets of nodes A, B of $T_q$, with |A| + |B| = k, the number of nodes y of $T_q$ that dominate all members of A and are dominated by all members of B is

\[ \frac{q-k}{2^k} + O(k\, q^{1/2}). \]

Thus, for, say, k up to a small constant multiple of $\log q$, this number is very close to $(q - k)/2^k$, which is the expected number of such nodes in a random tournament on q nodes. This easily implies that the number of labeled subtournaments of $T_q$ isomorphic to any given labeled tournament on k nodes is very close to $q^k / 2^{\binom{k}{2}}$. Thus $T_q$ resembles a random tournament on q nodes.

As observed by Bollobás and Thomason (1981), a similar construction supplies undirected pseudo-random graphs. These are called Paley graphs. Suppose q is an odd prime power, congruent to 1 modulo 4, and let $G_q$ be the graph whose nodes are the elements of GF(q), where x and y are adjacent if x − y is a nonzero square in GF(q). As before, one can show, using Theorem 4.1, that for every two disjoint sets A, B of nodes, with |A| + |B| = k and k at most a small constant multiple of $\log q$, the number of nodes y adjacent to all elements of A and nonadjacent to every element of B is very close to $q/2^k$. This implies, of course, that $G_q$ contains all graphs on k vertices as induced subgraphs.
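As a small illustration of the last claim, one can list which graphs on 3 vertices occur as induced subgraphs of a Paley graph. This is a sketch under the assumption that q is a prime congruent to 1 modulo 4; it uses the fact that a graph on 3 vertices is determined up to isomorphism by its number of edges. The function names are hypothetical.

```python
from itertools import combinations

def paley_graph(q):
    """Return a predicate adjacent(x, y) for the Paley graph on Z_q.

    Assumption: q is a prime with q % 4 == 1 (primality is not checked).
    """
    if q % 4 != 1:
        raise ValueError("q must be congruent to 1 modulo 4")
    squares = {(x * x) % q for x in range(1, q)}  # nonzero squares mod q
    return lambda x, y: (x - y) % q in squares

def induced_triple_types(q):
    """Set of edge counts (0..3) realised by induced subgraphs on 3 nodes."""
    adjacent = paley_graph(q)
    return {sum(adjacent(x, y) for x, y in combinations(S, 2))
            for S in combinations(range(q), 3)}
```

For q = 17 all four types appear, while for q = 5 (where the Paley graph is a 5-cycle) only the types with one or two edges occur.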

It seems more difficult to construct explicitly large graphs that do not contain some specified small induced subgraphs. In fact, the best known open problem concerning explicit constructions is a problem of this type. This is the problem of obtaining constructive lower bounds for the usual diagonal Ramsey numbers. Specifically, we want to describe explicitly, for every k, a graph with $c^k$ nodes that contains neither a clique of size k, nor a stable set of size k, where c > 1 is a constant independent of k. The best known result in this direction is that of Frankl and Wilson (1981), who constructed such a graph with $\exp(\Omega(\log^2 k / \log\log k))$ nodes. It may be true that for primes q the Paley graphs $G_q$ are better examples, but at present a proof of this, which would have several new number-theoretic consequences, seems hopeless.


5. Real varieties and sign patterns of polynomials 


5.1. The number of connected components 


In this section we describe several combinatorial applications of the known estimates for the number of connected components of real varieties or semi-varieties. Such estimates were obtained by several authors, and can be found, among other places, in Oleinik and Petrovskii (1949), Milnor (1964), Thom (1965) and Warren (1968). For our purposes all these existing bounds suffice. To be specific, we state two of them.


Theorem 5.1 (Milnor 1964). Let V be a real variety in $\mathbb{R}^\ell$, defined by the solution set of the real polynomial equations

\[ f_i(x_1, \ldots, x_\ell) = 0, \qquad i = 1, \ldots, m, \]

and suppose the degree of each polynomial $f_i$ is at most k. Then the number c(V) of connected components of V is at most $k (2k-1)^{\ell-1}$.


Theorem 5.2 (Warren 1968). Let $P_1, \ldots, P_m$ be real polynomials in $\ell$ variables, each of degree k or less. Let V be the set $\{x \in \mathbb{R}^\ell : P_i(x) \ne 0 \text{ for all } 1 \le i \le m\}$. Then the number c(V) of connected components of V does not exceed $2(2k)^\ell \sum_{i=0}^{\ell} 2^i \binom{m}{i}$. In particular, if $m \ge \ell \ge 2$ then

\[ c(V) \le (4ekm/\ell)^\ell . \]


We note that Theorem 5.1 can be applied to deduce upper bounds for the number of connected components of the solution set of a system of algebraic inequalities, by expressing such a set as a projection of a variety in a higher dimension.


5.2. Lower bounds for algebraic decision trees


In an elegant paper, Steele and Yao (1982) applied Milnor’s result stated above to obtain lower bounds for the height of algebraic decision trees. Their method was modified and extended by Ben-Or (1983). We outline this method below. Related interesting results appear in Björner et al. (1992).

For $W \subseteq \mathbb{R}^\ell$, the membership problem for W is the following: Given $x = (x_1, \ldots, x_\ell) \in \mathbb{R}^\ell$, determine if $x \in W$. Thus, for example, the $\ell$-element distinctness problem, which is the problem of deciding whether $\ell$ given real numbers are all distinct, is just the membership problem for

\[ W = \Bigl\{ (x_1, \ldots, x_\ell) \in \mathbb{R}^\ell : \prod_{1 \le i < j \le \ell} (x_i - x_j) \ne 0 \Bigr\}. \]

We are interested in algorithms for solving the membership problem for W that allow arithmetic operations and tests. More formally, an algebraic decision tree is a binary tree T with a rule that assigns:

(a) To any node v with one son, an operational instruction of the form

\[ f_v = f_{v_1} \circ f_{v_2} \quad \text{or} \quad f_v = c \circ f_{v_1}, \]

where $v_i$ is an ancestor of v in T, or $f_{v_i} \in \{x_1, \ldots, x_\ell\}$, $\circ \in \{+, -, \times, /\}$ and $c \in \mathbb{R}$.

(b) To any node v with two sons, a test instruction of the form $f_{v_1} > 0$ or $f_{v_1} \ge 0$ or $f_{v_1} = 0$, where $v_1$ is an ancestor of v or $f_{v_1} \in \{x_1, \ldots, x_\ell\}$.

(c) To any leaf an output Yes or No.

For any input $x \in \mathbb{R}^\ell$, the algorithm traverses a path P(x) in T from the root, where at each node, the corresponding arithmetic operation is performed or a branching is made according to the test. When a leaf is reached, the answer Yes or No to the problem is returned. We note that one can allow more algebraic operations (like square roots, etc.), but the treatment is similar.


Theorem 5.3 (Ben-Or 1983). Suppose $W \subseteq \mathbb{R}^\ell$, and let T be an algebraic decision tree that solves the membership problem for W (i.e., for each $x \in \mathbb{R}^\ell$, P(x) ends in a “Yes” leaf iff $x \in W$). If h is the height of T then

\[ 2^h\, 3^{\ell+h} \ge N, \]

where N is the number of connected components of W.


Proof. The main tool in the proof of this theorem is Theorem 5.1 stated above. One first observes that every “Yes” leaf corresponds to a subset of $\mathbb{R}^\ell$ that is a projection of a variety defined by a system of at most h quadratic equations and inequalities in $\ell + h$ variables. Using Theorem 5.1 one can show that such a subset can have at most $3^{\ell+h}$ connected components. Each such component must be contained in some connected component of W, and since the number of leaves of T is at most $2^h$, and all components of W must be covered by the “Yes” leaves, we have $2^h \cdot 3^{\ell+h} \ge N$.

As an example for applying Theorem 5.3, notice that

\[ W = \Bigl\{ (x_1, \ldots, x_\ell) \in \mathbb{R}^\ell : \prod_{1 \le i < j \le \ell} (x_i - x_j) \ne 0 \Bigr\} \]

has precisely $\ell!$ connected components, corresponding to the $\ell!$ possible order-types of $x_1, \ldots, x_\ell$. Thus, any algebraic decision tree that solves the $\ell$-element distinctness problem has height $\Omega(\ell \log \ell)$. This is clearly best possible, as the $\ell$-element distinctness problem can be solved by sorting the $\ell$ elements and then comparing all pairs of adjacent elements in the sorted order.
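The matching upper bound and the numerical content of the lower bound can be sketched as follows. This is only an illustration: `all_distinct` is the sort-and-compare procedure mentioned above, and `height_lower_bound` (a hypothetical helper name) solves the inequality of Theorem 5.3 for h with $N = \ell!$, giving $h \ge (\log_2 \ell! - \ell \log_2 3)/(1 + \log_2 3)$.

```python
import math

def all_distinct(xs):
    """Element distinctness via sorting: O(l log l) comparisons."""
    ys = sorted(xs)
    # After sorting, equal elements are adjacent.
    return all(a != b for a, b in zip(ys, ys[1:]))

def height_lower_bound(ell):
    """Lower bound on h implied by 2^h * 3^(ell + h) >= ell!.

    Taking log2 of both sides: h * (1 + log2 3) >= log2(ell!) - ell * log2 3.
    """
    log2_factorial = math.lgamma(ell + 1) / math.log(2)  # log2(ell!)
    bound = (log2_factorial - ell * math.log2(3)) / (1 + math.log2(3))
    return max(0.0, bound)
```

For large $\ell$ the bound grows like a constant times $\ell \log_2 \ell$, in agreement with the $\Omega(\ell \log \ell)$ statement.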


5.3. Sign patterns of real polynomials 


For further applications of Theorems 5.1 and 5.2, it will be convenient to derive a 
more combinatorial corollary, dealing with sign patterns of real polynomials. 

Let $P_j = P_j(x_1, \ldots, x_\ell)$, $j = 1, \ldots, m$, be m real polynomials. For a point $c \in \mathbb{R}^\ell$, the sign-pattern of the $P_j$’s at c is the m-tuple $(\varepsilon_1, \ldots, \varepsilon_m) \in \{-1, 0, 1\}^m$, where $\varepsilon_j = \mathrm{sign}\, P_j(c)$. Let $s(P_1, P_2, \ldots, P_m)$ denote the total number of sign-patterns of the polynomials $P_1, P_2, \ldots, P_m$, as c ranges over all points of $\mathbb{R}^\ell$. Similarly, let $\bar{s}(P_1, P_2, \ldots, P_m)$ denote the total number of sign-patterns consisting of vectors with $\{\pm 1\}$ coordinates. Clearly, $s(P_1, \ldots, P_m) \le 3^m$ and $\bar{s}(P_1, \ldots, P_m) \le 2^m$. However, one can apply Theorem 5.1 or Theorem 5.2 to bound these numbers by a function of $\ell$ and the degrees of the polynomials $P_1, P_2, \ldots, P_m$. Indeed, suppose the degree of each $P_i$ does not exceed k. Put $V = \{x \in \mathbb{R}^\ell : P_i(x) \ne 0 \text{ for all } 1 \le i \le m\}$. Clearly $\bar{s}(P_1, \ldots, P_m)$ is bounded above by the number c(V) of connected components of V, since no $P_i$ changes sign inside a component. This, together with Theorem 5.2, gives the following result (for $\ell \ge 2$; for $\ell = 1$ it is trivial).


Proposition 5.4 (Warren 1968). Let $P_1, \ldots, P_m$ be m real polynomials in $\ell$ real variables, and suppose the degree of each $P_i$ does not exceed k. If $m \ge \ell$ then $\bar{s}(P_1, P_2, \ldots, P_m) \le (4ekm/\ell)^\ell$.


It is not too difficult to obtain a similar bound for the total number $s(P_1, P_2, \ldots, P_m)$ of sign-patterns. Indeed, let $C \subseteq \mathbb{R}^\ell$ be a set of cardinality $|C| = s(P_1, P_2, \ldots, P_m)$ representing all sign-patterns of the polynomials $P_1, P_2, \ldots, P_m$. Define $\varepsilon > 0$ by

\[ \varepsilon = \tfrac{1}{2} \min \{ |P_j(c)| : c \in C,\ 1 \le j \le m \text{ and } P_j(c) \ne 0 \}. \]

Now put $V = \{x \in \mathbb{R}^\ell : P_i(x) - \varepsilon \ne 0 \text{ and } P_i(x) + \varepsilon \ne 0 \text{ for all } 1 \le i \le m\}$. Clearly $C \subseteq V$ and one can easily check that each two distinct points $c, c' \in C$ lie in distinct connected components of V. Hence $s(P_1, \ldots, P_m) = |C|$ does not exceed the number of connected components of V. In view of Theorem 5.2, we conclude the following.


Proposition 5.5. Let $P_1, \ldots, P_m$ be m real polynomials in $\ell$ real variables, and suppose the degree of each $P_i$ does not exceed k. If $2m \ge \ell$ then $s(P_1, \ldots, P_m) \le (8ekm/\ell)^\ell$.

A similar estimate can be obtained from Theorem 5.1.
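A toy case makes the two counting functions concrete. The sketch below treats only the simplest situation, m polynomials $P_j(x) = x - r_j$ of degree k = 1 in $\ell = 1$ variable with distinct roots, where the sign-patterns can be enumerated exactly by sampling every root and one point in every interval between them. The helper names are illustrative assumptions, not part of the source.

```python
import math
from fractions import Fraction

def sign(v):
    return (v > 0) - (v < 0)

def sign_patterns_linear(roots):
    """Count sign-patterns of P_j(x) = x - r_j over the real line.

    Returns (s, s_bar): all patterns, and patterns with no zero entry.
    Assumes `roots` is a nonempty list of rational numbers.
    """
    rs = sorted({Fraction(r) for r in roots})
    # One sample in each open interval, plus the roots themselves.
    samples = [rs[0] - 1] + rs + [(a + b) / 2 for a, b in zip(rs, rs[1:])]
    samples.append(rs[-1] + 1)
    patterns = {tuple(sign(c - Fraction(r)) for r in roots) for c in samples}
    nonzero = {p for p in patterns if 0 not in p}
    return len(patterns), len(nonzero)

def warren_bound(m, k, ell):
    """The bound (4ekm/l)^l of Proposition 5.4."""
    return (4 * math.e * k * m / ell) ** ell
```

With m distinct roots there are 2m + 1 patterns in total and m + 1 without zeros, far below the trivial bounds $3^m$ and $2^m$.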


5.4. The number of polytopes and configurations 


If $(P_0, P_1, \ldots, P_d)$ is a sequence of d + 1 points in $\mathbb{R}^d$, with $P_i = (x_{i1}, \ldots, x_{id})$ for each i, we say they have positive orientation, written $P_0 \ldots P_d > 0$, if $\det(x_{ij})_{0 \le i,j \le d} > 0$, where $x_{i0} = 1$ for each i. The conditions $P_0 \ldots P_d < 0$ and $P_0 \ldots P_d = 0$ are defined similarly. The order type of a configuration C of n labeled points $P_1, P_2, \ldots, P_n$ in $\mathbb{R}^d$ is a function w from the set of all (d + 1)-subsets of $\{1, 2, \ldots, n\}$ to $\{0, \pm 1\}$, where for $S = \{i_0, i_1, \ldots, i_d\}$ with $1 \le i_0 < i_1 < \cdots < i_d \le n$, w(S) = +1 if $P_{i_0} \ldots P_{i_d} > 0$, w(S) = −1 if $P_{i_0} \ldots P_{i_d} < 0$, and w(S) = 0 if $P_{i_0} \ldots P_{i_d} = 0$. The configuration is simple if $w(S) \ne 0$ for every such S. Notice that w(S) is just $\mathrm{sign} \det(x_{i_k, j})$, $0 \le k, j \le d$, where $P_{i_k} = (x_{i_k,1}, \ldots, x_{i_k,d})$ and $x_{i_k,0} = 1$ for $0 \le k \le d$. The order type of a configuration C of points is sometimes known as the oriented matroid structure determined by C. Let t(n,d) denote the number of distinct order types of configurations of n labeled points in $\mathbb{R}^d$, and let $t_s(n,d)$ denote the number of order types of simple configurations of n labeled points in $\mathbb{R}^d$. Goodman and Pollack (1986) applied Milnor’s Theorem 5.1 to show that $t_s(n,d) \le n^{d(d+1)n}$. As it is not too difficult to show that for every fixed $d \ge 2$, $t_s(n,d) \ge n^{(1+o(1)) d^2 n}$ as n tends to infinity, this upper bound is not far from the truth. In Alon (1986b) it is shown that $n^{(1+o(1)) d^2 n}$ is the correct order of magnitude of both $t_s(n,d)$ and t(n,d). This is, in fact, an immediate consequence of Proposition 5.5.
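The definition of the order type translates directly into a computation with determinants. The following sketch assumes integer (or otherwise exact) coordinates and indexes the points from 0 rather than from 1; `det`, `orientation` and `order_type` are illustrative names.

```python
from itertools import combinations

def det(M):
    """Exact determinant by Laplace expansion (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def orientation(points):
    """Sign of det(x_ij) with x_i0 = 1, for d + 1 points in R^d."""
    M = [[1] + list(p) for p in points]
    value = det(M)
    return (value > 0) - (value < 0)

def order_type(config):
    """Map each (d+1)-subset of indices (0-based) to its orientation."""
    d = len(config[0])
    return {S: orientation([config[i] for i in S])
            for S in combinations(range(len(config)), d + 1)}
```

For instance, the four corners of a square listed counterclockwise form a simple configuration in which every triple has positive orientation.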


Proposition 5.6. For every fixed $d \ge 2$, as n tends to infinity,

\[ t_s(n,d) \le t(n,d) \le n^{(1+o(1)) d^2 n}. \]


Proof. Obviously t(n,d) is just the number of sign patterns of $\binom{n}{d+1}$ polynomials of degree d in the dn real variables $x_{i1}, \ldots, x_{id}$, which are the coordinates of the ith point ($1 \le i \le n$). The polynomials are just all the determinants $\det(x_{i_k, j})$, $0 \le k, j \le d$, where $x_{i_k, 0} = 1$ for all k and $1 \le i_0 < i_1 < \cdots < i_d \le n$. The result now follows from Proposition 5.5. □


The same computation shows that for every n and d

\[ t_s(n,d) \le t(n,d) \le 2^{n^3 + O(n^2)}. \]


Next we consider the number of combinatorial types of convex polytopes. 

Let c(n,d) denote the number of (combinatorial types of) d-polytopes on n labeled vertices and let $c_s(n,d)$ denote the number of simplicial d-polytopes on n labeled vertices. The problem of determining or estimating these two functions (especially for 3-polytopes) was the subject of much effort and frustration of nineteenth-century geometers. Although it follows from Tarski’s theorem on the decidability of first order sentences in the real field that the problem of computing c(n,d) is solvable, it seems extremely difficult to actually determine this number even for relatively small n and d. Both Cayley and Kirkman failed to determine c(n,3) or $c_s(n,3)$ despite a lot of effort. Detailed historical surveys of these attempts were given by Brückner and Steinitz (cf. Grünbaum 1967, pp. 288-290), and the asymptotic behaviour of c(n,d) and $c_s(n,d)$ is known only for $d \le 3$ or $n \le d + 3$. It is thus pleasing to note, following Goodman and Pollack (1986) and, later, Alon (1986b), that Proposition 5.5 immediately supplies upper bounds for $c_s(n,d)$ and for c(n,d). This follows from the fact that the order type of a configuration that spans $\mathbb{R}^d$ determines which sets of its points lie on supporting hyperplanes of its convex hull. Hence, the order type of a configuration of n points in $\mathbb{R}^d$ which is the set of vertices of a convex polytope P determines its facets and its complete combinatorial type. Thus Proposition 5.6 and the paragraph following it imply the following result.


Proposition 5.7. For every fixed d,

\[ c_s(n,d) \le c(n,d) \le n^{(1+o(1)) d^2 n}. \]

Furthermore, the total number of polytopes of any dimension on n points is at most $2^{n^3 + O(n^2)}$.


Although this proposition is an immediate corollary of the known bounds for 
sign-patterns of polynomials, it improves considerably the previously best known 
bound which was n"”), We note also that one can show that for every n > 2d, 


\[ c_s(n,d) \ge \left( \frac{n-d}{d} \right)^{nd/4}. \]


5.5. Ranks of sign matrices 


The sign-pattern of an m by n matrix $A = (a_{ij})_{1 \le i \le m,\ 1 \le j \le n}$ with nonzero entries is an m by n matrix $Z(A) = (z_{ij})$ of 1, −1 entries where $z_{ij} = \mathrm{sign}\, a_{ij}$. For an m by n matrix Z of 1, −1 entries, let r(Z) be the minimum possible rank of a matrix A such that Z(A) = Z. Define r(n,m) = max{r(Z): Z is an m by n matrix over {1, −1}}. The problem of determining or estimating r(n,m), and in particular r(n,n), was raised by Paturi and Simon (1984). They observed that $r(n,n) \ge \lfloor \log_2 n \rfloor$ and raised the question if one can prove a superlogarithmic lower bound for r(n,n). As shown in their paper, this would supply lower bounds for the maximal possible unbounded-error probabilistic communication complexity of a Boolean function of 2p bits. This question is answered in Alon et al. (1985) where it is shown, in particular, that


8 eam heaih, 
and that if $m/n^2 \to \infty$ and $(\log_2 m)/n \to 0$ then

\[ r(n,m) = \bigl( \tfrac{1}{2} + o(1) \bigr)\, n. \]

(The bounds here are slightly better than those that actually appear in the above paper.)
The upper bounds are proved by combining some simple geometric, combinatorial and probabilistic arguments. The lower bounds can be deduced, by a simple counting argument, from Propositions 5.4 and 5.6.

As shown in Alon et al. (1985), these results imply that the (bounded or unbounded)-error probabilistic communication complexity of almost every Boolean function of 2p variables is between p − 4 and p.


5.6. Degrees of freedom versus dimension of containment orders 


The dimension of a partially ordered set P is the minimum number of linear extensions whose intersection is P. Alternatively, it is the smallest k so that the elements of P can be mapped to points in $\mathbb{R}^k$ so that $x \le y$ iff each coordinate of x is less than or equal to the corresponding coordinate of y.

Let $\mathcal{S}$ be a family of sets. We say that a partially ordered set P has an $\mathcal{S}$-containment representation provided there is a map $f : P \to \mathcal{S}$ such that $x \le y$ iff $f(x) \subseteq f(y)$. In this case we say that P is an $\mathcal{S}$-order.

For example, circle orders are the containment orders of disks in the plane. Similarly, angle orders are the containment orders of angles in the plane, where an angle includes its interior.

Note that circles admit three “degrees of freedom”: two center coordinates and a radius. An angle admits four degrees of freedom: the two coordinates of its vertex and the slopes of its rays. Further, it is known that not all 4-dimensional posets are circle orders nor are all 5-dimensional posets angle orders. These are confirming instances of the following intuitive notion: If the sets in $\mathcal{S}$ admit k degrees of freedom, then not all (k + 1)-dimensional posets are $\mathcal{S}$-orders.

Let us briefly show, following Alon and Scheinerman (1988), how the estimates for the number of sign patterns of real polynomials supply a precise version of this intuitive principle. We say that the sets in $\mathcal{S}$ have k degrees of freedom provided:

(1) Each set in $\mathcal{S}$ can be uniquely identified by a k-tuple of real numbers, i.e., there is an injection $f : \mathcal{S} \to \mathbb{R}^k$.

(2) There exists a finite list of polynomials $p_1, p_2, \ldots, p_t$ in 2k variables with the following property: If $S, T \in \mathcal{S}$ map to $(x_1, \ldots, x_k), (y_1, \ldots, y_k) \in \mathbb{R}^k$, respectively, then the containment $S \subseteq T$ can be determined based on the signs of the values $p_j(x_1, \ldots, x_k, y_1, \ldots, y_k)$ for $1 \le j \le t$.

For example, let us consider disks in the plane. Suppose we have two disks $C_1$ and $C_2$ with centers and radii given by $x_i, y_i, r_i$ (i = 1, 2). One checks that we have $C_1 \subseteq C_2$ iff both the following hold:

\[ (x_1 - x_2)^2 + (y_1 - y_2)^2 - (r_1 - r_2)^2 \le 0, \]
\[ r_1 - r_2 \le 0. \]

Thus the family of circles in the plane admits three degrees of freedom. Similarly, the containment of one angle in another can be expressed in terms of a finite list of polynomial inequalities.
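The disk example can be written out as a two-polynomial sign test. In this minimal sketch a disk is represented, by assumption, as a triple (x, y, r) with r ≥ 0, and `disk_contained` is an illustrative name.

```python
def disk_contained(c1, c2):
    """Decide C1 ⊆ C2 from the signs of two polynomials in the 6 parameters."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    p1 = (x1 - x2) ** 2 + (y1 - y2) ** 2 - (r1 - r2) ** 2
    p2 = r1 - r2
    # Containment holds iff both polynomials are non-positive.
    return p1 <= 0 and p2 <= 0
```

Only the signs of `p1` and `p2` are used, which is exactly what condition (2) asks for with k = 3 and t = 2.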


Theorem 5.8. Let $\mathcal{S}$ be a family of sets admitting k degrees of freedom. Then the number of $\mathcal{S}$-orders on n labeled points is at most

\[ 2^{(1+o(1))\, k n \log_2 n} \]

as n tends to infinity.


Proof. Let $\mathcal{P}_n$ denote the family of $\mathcal{S}$-orders on {1, ..., n}. For each n-tuple $(S_1, \ldots, S_n)$ of sets in $\mathcal{S}$ we have a (potentially) different poset depending on the sign pattern of $r = 2\binom{n}{2} t$ polynomials in $\ell = nk$ variables which have some maximum degree d (which is independent of n). Hence by Proposition 5.5,

\[ |\mathcal{P}_n| \le \left[ \frac{16\, e\, d \binom{n}{2} t}{nk} \right]^{nk} \le [O(1)\, n]^{nk} = 2^{(1+o(1))\, k n \log_2 n}. \qquad \Box \]


Denote by P(n,k) the number of posets of dimension at most k on n labeled points {1, 2, ..., n}. By a simple construction, one can show that for every fixed k

\[ \lim_{n \to \infty} \log P(n,k) / (k n \log n) = 1. \]

This and the previous theorem imply the following.


Corollary 5.9. Let S be a family of sets admitting k degrees of freedom. Then there 
exists a (k + 1)-dimensional poset which is not an S-containment order. 


6. The Chevalley-Warning theorem, abelian groups and regular graphs


The classical theorem of Chevalley and Warning, that deals with the number of 
solutions of a system of polynomials with many variables over a finite field, is the 
following. 

Theorem 6.1 (see, e.g., Borevich and Shafarevich 1966 or Schmidt 1976). For j = 1, 2, ..., n let $P_j(x_1, \ldots, x_m)$ be a polynomial of degree $r_j$ over a finite field F of characteristic p. If $\sum_{j=1}^{n} r_j < m$ then the number N of common zeros of $P_1, \ldots, P_n$ (in $F^m$) satisfies

\[ N \equiv 0 \pmod{p}. \]

In particular, if there is one common zero, then there is another one.


Proof. The proof is extremely simple; clearly, if F has q elements then

\[ N \equiv \sum_{x_1, \ldots, x_m \in F} \; \prod_{j=1}^{n} \bigl( 1 - P_j(x_1, \ldots, x_m)^{q-1} \bigr) \pmod{p}. \qquad (6.1) \]

By expanding the right-hand side we get a linear combination of monomials of the form

\[ \prod_{i=1}^{m} x_i^{k_i} \quad \text{with} \quad \sum_{i=1}^{m} k_i \le (q-1) \sum_{j=1}^{n} r_j < (q-1)\, m. \]

Hence, in each such monomial there is an i with $k_i < q - 1$. But then in F = GF(q), $\sum_{x_i \in F} x_i^{k_i} = 0$, implying that the contribution of each monomial to the sum (6.1) is 0 (mod p), completing the proof. □
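The congruence can be observed numerically by counting common zeros directly. The sketch below is restricted, by assumption, to prime fields GF(p), where field arithmetic is arithmetic modulo p; polynomials are passed as ordinary Python functions, and `count_common_zeros` is an illustrative name.

```python
from itertools import product

def count_common_zeros(polys, m, p):
    """Number of points of GF(p)^m at which every polynomial vanishes.

    `polys` is a list of functions of m integer arguments; p must be prime.
    """
    return sum(1 for x in product(range(p), repeat=m)
               if all(P(*x) % p == 0 for P in polys))

# One quadratic form in three variables over GF(3): degree 2 < 3 variables.
n_zeros = count_common_zeros([lambda x, y, z: x * x + y * y + z * z], 3, 3)
```

Here `n_zeros` equals 9 (the origin together with the 8 points having all coordinates nonzero), which is divisible by 3 as the theorem predicts.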


In this section we discuss some applications of this theorem to combinatorial 


problems in abelian groups, extremal graph theory and the theory of finite affine 
spaces. 


6.1. Combinatorial problems in abelian groups 


For a finite abelian group G, define s = s(G) to be the smallest positive integer such that, for any sequence $g_1, g_2, \ldots, g_s$ of (not necessarily distinct) elements of G, there is an $\emptyset \ne I \subseteq \{1, \ldots, s\}$ such that $\sum \{g_i : i \in I\} = 0$. The problem of determining s(G) was proposed by H. Davenport in 1966, and is related to the study of the maximal number of prime ideals in the decomposition of an irreducible integer in an algebraic number field whose class group is G. Olson (1969a) determined s(G) for every p-group $G = Z_{p^{e_1}} \oplus \cdots \oplus Z_{p^{e_r}}$. Clearly

\[ s(G) \ge 1 + \sum_{i=1}^{r} (p^{e_i} - 1), \]

for let $x_1, \ldots, x_r$ be a basis for G, where $x_i$ has order $p^{e_i}$, and consider the sequence of length $\sum_{i=1}^{r} (p^{e_i} - 1)$ in which each $x_i$ occurs $p^{e_i} - 1$ times. No subsequence here has sum 0. Olson gave a charming proof of the following.


Theorem 6.2. $s(Z_{p^{e_1}} \oplus \cdots \oplus Z_{p^{e_r}}) = 1 + \sum_{i=1}^{r} (p^{e_i} - 1)$.


For the case $e_1 = \cdots = e_r = 1$ this can be easily deduced from the Chevalley-Warning theorem as follows. Let $g_1, g_2, \ldots, g_s$ be a sequence of elements of $G = (Z_p)^r$, where $s > r(p - 1)$, and put $g_i = (g_{i1}, g_{i2}, \ldots, g_{ir})$. Consider the following system of r polynomials in s variables over GF(p):

\[ \sum_{i=1}^{s} g_{ij}\, x_i^{p-1} = 0, \qquad j = 1, \ldots, r. \]

Since $s > r(p - 1)$ and $x_1 = \cdots = x_s = 0$ is a trivial solution, there is a nontrivial solution $(z_1, \ldots, z_s)$. Put $I = \{i : z_i \ne 0\}$ and observe that $\sum \{g_i : i \in I\} = 0$ to complete the proof (for the case $e_1 = \cdots = e_r = 1$).
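For very small groups the value of s(G) can be found by exhaustive search, which gives a sanity check on Theorem 6.2 in the elementary case $G = (Z_p)^r$. The sketch below assumes p is prime and represents group elements as r-tuples of residues; since the order of a sequence is irrelevant, it enumerates multisets. The names `has_zero_sum_subsequence` and `davenport_s` are illustrative.

```python
from itertools import combinations, combinations_with_replacement, product

def has_zero_sum_subsequence(seq, p):
    """True iff some nonempty subsequence of seq sums to 0 in (Z_p)^r."""
    r = len(seq[0])
    for size in range(1, len(seq) + 1):
        for idx in combinations(range(len(seq)), size):
            if all(sum(seq[i][j] for i in idx) % p == 0 for j in range(r)):
                return True
    return False

def davenport_s(p, r):
    """Smallest s such that every length-s sequence has a zero-sum subsequence."""
    group = list(product(range(p), repeat=r))
    s = 1
    while True:
        # Order does not matter, so it suffices to test multisets of size s.
        if all(has_zero_sum_subsequence(seq, p)
               for seq in combinations_with_replacement(group, s)):
            return s
        s += 1
```

The search is exponential and only practical for tiny p and r, but it reproduces the formula 1 + r(p − 1) in those cases.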


The general case can be proved by generalizing the proof of the Chevalley-Warning theorem.

Olson’s original proof is different and is based on the fact that the ideal of nilpotent elements in the group-ring of a p-group over $Z_p$ is nilpotent. Since this proof is short and elegant, we present it in full.


Proof of Theorem 6.2. Let G be the finite abelian p-group with invariants $p^{e_1}, p^{e_2}, \ldots, p^{e_r}$, and let us use multiplicative notation for G. Let R be the group ring of G over $Z_p$. Suppose $k \ge 1 + \sum_{i=1}^{r} (p^{e_i} - 1)$ and let $g_1, g_2, \ldots, g_k$ be a sequence of k members of G. We claim that in R

\[ (1 - g_1)(1 - g_2) \cdots (1 - g_k) = 0. \qquad (6.2) \]

Indeed, let $x_1, x_2, \ldots, x_r$ be the standard basis for G, where the order of $x_i$ is $p^{e_i}$. Since each $g_j$ can be written as a product of the elements $x_i$, a repeated application of the identity $1 - uv = (1 - u) + u(1 - v)$ enables us to express each expression of the form $1 - g_j$ as a linear combination (with coefficients in R) of the elements $1 - x_i$. Substituting into (6.2) and applying commutativity we conclude that the left-hand side is a linear combination of elements of the form

\[ \prod_{i=1}^{r} (1 - x_i)^{k_i}, \quad \text{where} \quad \sum_{i=1}^{r} k_i = k > \sum_{i=1}^{r} (p^{e_i} - 1). \]

Hence, there is an i with $k_i \ge p^{e_i}$ and since in R, $(1 - x_i)^{p^{e_i}} = 1 - x_i^{p^{e_i}} = 0$, this implies that (6.2) holds as claimed.

By interpreting (6.2) combinatorially we conclude that there is some nontrivial subsequence of $g_1, \ldots, g_k$ that has product 1, since otherwise the coefficient of 1 in the above product would be nonzero. Hence $s(G) = 1 + \sum_{i=1}^{r} (p^{e_i} - 1)$, as needed. □


We note that if $G = C_1 \oplus \cdots \oplus C_r$ is the direct sum of cyclic groups $C_i$ of orders $|C_i| = c_i$, where $c_i \mid c_{i+1}$, then $s(G) \ge 1 + \sum_{i=1}^{r} (c_i - 1)$, and this inequality can be strict. Several interesting generalizations of Olson’s results (including an upper estimate for $s((Z_n)^m)$) appear in Baker and Schmidt (1980) and in Van Emde Boas and Kruyswijk (1969). It is, however, not known if the equality $s((Z_n)^m) = m(n - 1) + 1$ holds for all m and n.

Erdős et al. (1961) showed that for any sequence $g_1, g_2, \ldots, g_{2n-1}$ of elements of a finite abelian group of order n, there exists a set $I \subseteq \{1, \ldots, 2n - 1\}$ of n indices such that $\sum \{g_i : i \in I\} = 0$. The first (and main) step of their proof is to prove the above when $G = Z_p$ is the cyclic group of order p, where p is a prime. Although the proof in this case is an easy consequence of a special case of the Cauchy-Davenport lemma (see chapter 20), it is interesting to note that this fact can also be derived from the Chevalley-Warning theorem as follows. Consider the following system of two polynomials in 2p − 1 variables over GF(p):

\[ \sum_{i=1}^{2p-1} g_i\, x_i^{p-1} = 0, \qquad \sum_{i=1}^{2p-1} x_i^{p-1} = 0. \]

Since $2(p - 1) < 2p - 1$ and $x_1 = x_2 = \cdots = x_{2p-1} = 0$ is a solution, Theorem 6.1 implies the existence of a nontrivial solution $(z_1, \ldots, z_{2p-1})$. Since in GF(p), $y^{p-1} = 1$ if $y \ne 0$ and $0^{p-1} = 0$, the set $I = \{i : z_i \ne 0\}$ satisfies $\sum \{g_i : i \in I\} = 0$ and $|I| = p$, completing the proof. Notice, also, that the above result also follows from Theorem 6.2 by considering the 2p − 1 elements $(g_1, 1), (g_2, 1), \ldots, (g_{2p-1}, 1)$ in $Z_p \oplus Z_p$.
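The statement for cyclic groups is easy to confirm by exhaustive search for small n. The sketch below works in $Z_n$ only (an assumption; general abelian groups of order n are not covered) and enumerates multisets, since the order of the sequence plays no role. The function names are illustrative.

```python
from itertools import combinations, combinations_with_replacement

def has_zero_sum_n_subset(seq, n):
    """True iff some n of the given residues sum to 0 modulo n."""
    return any(sum(c) % n == 0 for c in combinations(seq, n))

def egz_holds(n):
    """Check the claim for every multiset of 2n - 1 elements of Z_n."""
    return all(has_zero_sum_n_subset(seq, n)
               for seq in combinations_with_replacement(range(n), 2 * n - 1))
```

The length 2n − 1 cannot be reduced: the sequence consisting of n − 1 zeros and n − 1 ones has no n terms summing to 0 modulo n.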

The following generalization of the Erdős-Ginzburg-Ziv theorem was proved by Olson (1969b).


Theorem 6.3. Let $H = G \oplus K$ be the direct sum of the abelian groups G and K of orders |G| = n and |K| = k, where $k \mid n$. If $h_1, h_2, \ldots, h_{n+k-1}$ is a sequence of n + k − 1 elements of H, then there is a set $\emptyset \ne I \subseteq \{1, 2, \ldots, n + k - 1\}$ of indices such that $\sum \{h_i : i \in I\} = 0$.


This theorem can also be deduced from Theorem 6.1, together with some of the ideas of Olson (1969b). It implies the previous statement by taking K to be the cyclic group of order n and by defining $h_i = g_i \oplus 1 \in G \oplus K$ for $1 \le i \le 2n - 1$.


6.2. Regular subgraphs of graphs 


As shown by Alon et al. (1984), one can apply the Chevalley-Warning theorem or Olson’s Theorem 6.2 to prove that certain graphs contain regular subgraphs. A graph H is q-divisible if q divides the degree of every node of H. Let f(n,q) be the maximum number of edges of a loopless graph G on n nodes that contains no nonempty q-divisible subgraph. Suppose q is a prime power, and let G = (V,E) be a loopless graph with |V| = n nodes and $|E| = m > n(q - 1)$ edges. Let $a_j^{(i)}$ be the (j,i)th entry of the (node-edge)-incidence matrix of G. The vectors $a^{(i)} = (a_1^{(i)}, \ldots, a_n^{(i)})$, $1 \le i \le m$, are elements of $(Z_q)^n$, so by Olson’s Theorem 6.2 there exists an $\emptyset \ne I \subseteq \{1, \ldots, m\}$ such that $\sum \{a_j^{(i)} : i \in I\} \equiv 0 \pmod{q}$ for $1 \le j \le n$. The subgraph H consisting of all edges whose indices lie in I is clearly q-divisible. Hence $f(n,q) \le n(q - 1)$. This estimate can be slightly improved for powers of 2, and a matching lower bound can be given for all $n \ge 3$. Therefore, the following holds.


Theorem 6.4. For every odd prime power q and every $n \ge 3$, $f(n,q) = (q - 1) n$. For every power of two q and every $n \ge 3$, $f(n,q) = (q - 1) n - \tfrac{1}{2} q$.
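The existence statement behind the upper bound can be illustrated by a naive search for a nonempty q-divisible edge set. In this sketch a loopless multigraph is given, by assumption, as a list of pairs of node indices 0..n−1 (repeated pairs model parallel edges); `find_q_divisible_subgraph` is an illustrative name and the search is exponential in the number of edges.

```python
from itertools import combinations

def find_q_divisible_subgraph(n, edges, q):
    """Return a nonempty list of edges in which q divides every degree,
    or None if no such edge set exists."""
    m = len(edges)
    for size in range(1, m + 1):
        for idx in combinations(range(m), size):
            deg = [0] * n
            for i in idx:
                u, v = edges[i]
                deg[u] += 1
                deg[v] += 1
            if all(d % q == 0 for d in deg):
                return [edges[i] for i in idx]
    return None

# K_5 with one doubled edge: 11 > 5 * (3 - 1) edges, so a nonempty
# 3-divisible subgraph must exist.
k5_plus_edge = list(combinations(range(5), 2)) + [(0, 1)]
```

A path, on the other hand, has no nonempty 2-divisible subgraph, which matches the value f(n,2) = n − 1 given by the theorem.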


Similarly, using the results of Van Emde Boas and Kruyswijk (1969), one can show that $f(n,k) \le c(k) \cdot n$ for all k and n. The truth, however, might be that $f(n,k) \le (k - 1) \cdot n$ for every $n \ge 3$ and every k.

By Theorem 6.4, if q is a prime power and G = (V,E) is a graph with maximum degree at most 2q − 1 and average degree greater than 2q − 2, then G contains a q-regular subgraph. Indeed, $|E| > (q - 1) \cdot |V|$ and hence G contains a q-divisible subgraph which must be q-regular since its maximum degree is smaller than 2q.

In particular, every loopless 4-regular graph plus an edge contains a 3-regular subgraph. This is closely related to the well-known Berge-Sauer conjecture, which asserts that every 4-regular simple graph (i.e., a graph with no loops and no parallel edges) contains a 3-regular subgraph. This conjecture has been proved by Taškinov (1982). It is, however, false for graphs with parallel edges, and hence the assumption “plus an edge” in the previous statement cannot be omitted.

Another consequence of Theorem 6.4, together with several known results in graph theory, is that for every k and r that satisfy $k \ge 4r$, every loopless k-regular graph contains an r-regular subgraph. For several sharper results see Alon et al. (1984).

Erdős and Sauer (see, e.g., Bollobás 1978, p. 399) asked for an estimation of the maximal number of edges of a simple graph on n nodes that contains no 3-regular subgraph. They conjectured that this number is $o(n^{1+\varepsilon})$ for any $\varepsilon > 0$. This conjecture has been proved by Pyber (1985) by applying Theorem 6.4. Pyber showed that any simple graph with n nodes and at least $200\, n \log n$ edges contains a subgraph with maximal degree 5 and average degree greater than 4. This subgraph contains, by the paragraph following Theorem 6.4, a 3-regular subgraph. A similar reasoning shows that there exists a constant c > 0 such that for every $r \ge 3$, every simple graph G with n nodes and at least $c \cdot r^2 \cdot n \log n$ edges contains an r-regular subgraph. On the other hand, Pyber, Rödl and Szemerédi showed, using probabilistic arguments, that there are simple graphs with n nodes and $\Omega(n \log\log n)$ edges that contain no 3-regular subgraphs. Thus the above result is not far from being best possible.


6.3. The blocking number of an affine space 


For a prime power q and k > 0, let AG(k,q) denote the k-dimensional affine space over GF(q). It is not too difficult to observe that there is always a subset of cardinality $k(q - 1) + 1$ that intersects all hyperplanes. Indeed, the union of any k independent lines through a point intersects all hyperplanes and has this cardinality.


Theorem 6.5. The minimum cardinality of a subset of AG(k,q) that intersects all hyperplanes is $k(q - 1) + 1$.


This theorem was proved, independently, by Jamison (1977), who gave a rather lengthy proof for a more general result, and by Brouwer and Schrijver (1978), who obtained an elegant and short proof. If q = p is a prime, their proof can be shortened even further by using the Chevalley-Warning theorem as follows. Suppose $A \subseteq AG(k,p)$ intersects all hyperplanes. We may assume that $0 = (0, \ldots, 0) \in A$, and define $B = A \setminus \{0\}$. Then B intersects all hyperplanes not through 0, i.e., for every $0 \ne (w_1, w_2, \ldots, w_k) \in (GF(p))^k$ there exists a $b = (b_1, \ldots, b_k) \in B$ such that $w_1 b_1 + \cdots + w_k b_k = 1$. Define $F(x_1, \ldots, x_k) = \prod_{b \in B} (1 - b_1 x_1 - \cdots - b_k x_k)$. Clearly $F(w_1, \ldots, w_k) = 0$ for all $(w_1, \ldots, w_k) \ne 0$ and $F(0, \ldots, 0) = 1$. Consider the following polynomial equation in the $k(p - 1)$ variables $x_j^{(i)}$, $1 \le j \le k$, $1 \le i \le p - 1$:

\[ \sum_{i=1}^{p-1} F\bigl( x_1^{(i)}, \ldots, x_k^{(i)} \bigr) = p - 1. \]

Obviously, the only zero of this equation is the trivial solution $x_j^{(i)} = 0$ for $1 \le j \le k$, $1 \le i \le p - 1$. By the Chevalley-Warning theorem, this implies that the degree of the above polynomial, which is |B|, is at least as big as the number of variables, which is $k(p - 1)$. Hence $|A| \ge k(p - 1) + 1$, as needed.
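For the smallest cases the value in Theorem 6.5 can be confirmed by exhaustive search. The sketch below assumes p is prime, so that AG(k,p) is simply $(Z_p)^k$ and a hyperplane is a solution set of $w \cdot x = c$ with $w \ne 0$; the function names are illustrative and the search is exponential in $p^k$.

```python
from itertools import combinations, product

def hyperplanes(k, p):
    """Points of AG(k, p) and the set of all hyperplanes {x : w.x = c}."""
    points = list(product(range(p), repeat=k))
    planes = set()
    for w in product(range(p), repeat=k):
        if any(w):  # skip the zero normal vector
            for c in range(p):
                planes.add(frozenset(
                    x for x in points
                    if sum(a * b for a, b in zip(w, x)) % p == c))
    return points, planes

def min_blocking_size(k, p):
    """Smallest size of a point set meeting every hyperplane of AG(k, p)."""
    points, planes = hyperplanes(k, p)
    for size in range(1, len(points) + 1):
        for A in combinations(points, size):
            chosen = set(A)
            if all(chosen & H for H in planes):
                return size
```

In the affine plane of order 3, for example, no 4 points meet all 12 lines, while the union of two lines through a common point (5 points) does.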

It is worth noting that neither the proof of Jamison nor the one of Brouwer 
and Schrijver imply any estimate for the analogous problem for non-Desarguesian 
planes. 


7. More polynomials 


In the last two sections real polynomials and polynomials over a finite field were 
used to derive some combinatorial results. In this section, we describe some further 
combinatorial problems, where polynomials and ideals of polynomials are applied 
for deriving certain characterization results with combinatorial consequences. 


7.1. Generators of ideals, graph polynomials and vectors balancing 


For a graph G = (V,E) on the n nodes {1, 2, ..., n}, define the associated graph polynomial $f_G = f_G(x_1, \ldots, x_n)$ by

\[ f_G = \prod \{ (x_i - x_j) : i < j,\ ij \in E \}. \]

The independence number $\alpha(G)$ (= the maximum size of a stable set of G) is at most k if and only if the polynomial $f_G$ vanishes whenever k + 1 variables are equal. For $0 \le k < n$, let I(k + 1, n) denote the ideal of the ring $Z[x_1, \ldots, x_n]$ consisting of all polynomials which vanish whenever k + 1 variables are equal. Hence, $f_G \in I(k + 1, n)$ if and only if $\alpha(G) \le k$. Li and Li (1981) proved the following “Nullstellensatz”-type result, which supplies a set of generators of I(k + 1, n). In view of the preceding remark, this theorem supplies a characterization (though, maybe, not a very convenient one) for all graphs G whose independence number is at most k.


Theorem 7.1. Put V = {1,2,...,n} and let C denote the set of all graphs H on
V that consist of k node-disjoint complete graphs whose cardinalities are as equal
as possible. Then $\{f_H : H \in C\}$ is a set of generators of I(k+1,n). In particular, a
graph G has an independence number at most k if and only if there are polynomials
$\{g_H : H \in C\}$ such that $f_G = \sum \{g_H \cdot f_H : H \in C\}$.


Kleitman and Lovász proved a similar result for graphs whose chromatic number
is at least k. They showed that a graph G has a chromatic number at least k if and
only if $f_G$ belongs to the ideal generated by the polynomials of complete graphs
of k nodes from V. Another result of this type appears in Alon and Tarsi (1992);
the chromatic number of G is at least k if and only if $f_G$ lies in the ideal generated
by the polynomials $x_i^{k-1} - 1$, $1 \le i \le n$.

It is worth noting that, as is well known, the decision problem “Given a graph 
G and an integer k, is the independence number of G at most k?” as well as 
the corresponding problem of coloring, are both coNP-complete, and hence it is
not reasonable to expect to find a completely satisfactory characterization of the
corresponding sets of graphs. 

Li and Li (1981) show how to apply Theorem 7.1 to deduce Turán's theorem,
which states that the minimum possible number of edges of a graph G on n nodes,
whose independence number is at most k, is the number of edges of a node-disjoint
union of k complete graphs of total order n whose cardinalities are as equal as
possible. Indeed, since $f_G$ belongs to the ideal I(k+1,n), which is generated by the
graph polynomials $\{f_H : H \in C\}$, the degree of $f_G$, which is precisely the number
of edges of G, is at least the minimum degree of a generator $f_H$. Here all generators
have the same degree and Turán's theorem follows.

Another combinatorial result whose proof is related to Hilbert's Nullstellensatz
deals with the problem of balancing sets of vectors. For an even integer n, let
K(n) denote the minimum k for which there exist ±1 vectors $v_1, v_2, \ldots, v_k$ of
dimension n such that for any ±1 vector w of dimension n, there is an i, $1 \le i \le k$,
such that $v_i \cdot w = 0$, i.e., $v_i$ is orthogonal to w. Motivated by a problem in data
communication, Knuth showed that $K(n) \le n$ by the following simple construction.
For $0 \le i \le n$, let $v_i$ be the vector of i entries equal to −1 followed by n − i entries
equal to 1. We claim that for any ±1 vector w of dimension n, $w \cdot v_i = 0$ for some
$1 \le i \le n$. To see this, note that $w \cdot v_0 = -w \cdot v_n$ while $w \cdot v_i = w \cdot v_{i+1} \pm 2$ for each
i < n. Since $w \cdot v_i \equiv 0 \pmod 2$ for all i, an obvious "discrete intermediate value"
theorem implies that $w \cdot v_i = 0$ for some i, $1 \le i \le n$, as claimed.
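
Knuth's construction can be checked by brute force for small even n. The following is only an illustrative sketch: the function names are ours, and the exhaustive search over all $2^n$ vectors is feasible only for very small n.

```python
from itertools import product

def knuth_vectors(n):
    """v_i for 1 <= i <= n: i entries equal to -1 followed by n - i entries equal to +1."""
    return [[-1] * i + [1] * (n - i) for i in range(1, n + 1)]

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def is_balancing(vectors, n):
    """True if every +-1 vector w of dimension n is orthogonal to some vector in `vectors`."""
    return all(any(dot(v, w) == 0 for v in vectors)
               for w in product((-1, 1), repeat=n))

for n in (2, 4, 6, 8):
    print(n, is_balancing(knuth_vectors(n), n))
```

Dropping one of the n vectors breaks the property in small cases, in line with the optimality result discussed next; for n = 4, the vector w = (1, 1, −1, −1) is orthogonal to none of $v_1, v_2, v_3$.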

As shown by Alon et al. (1988), this construction is optimal, i.e., K(n) = n for
all even n. Let us sketch the proof of the lower bound. For simplicity, we consider
only the case $n \equiv 0 \pmod 4$. Let U be the set of all ±1 vectors of dimension n. A
vector $u \in U$ is even if it has an even number of −1 entries, otherwise it is odd.
Let $V \subseteq U$ be a set of vectors such that for every $u \in U$ there is a $v \in V$ with
$v \cdot u = 0$. We must show that $|V| \ge n$. Let $V_0$ be the set of all even vectors of V
and let $V_1$ be the set of all odd vectors of V. Consider the following polynomial
in $y = (y_1,\ldots,y_n)$:


$$P(y) = \prod_{v \in V_0} (v \cdot y).$$


Since $n \equiv 0 \pmod 4$, $v_1 \cdot v_2 \equiv 0 \pmod 2$ for all $v_1, v_2 \in U$. Also, one can easily check
that for every $v_1, v_2 \in U$, $v_1 \cdot v_2 \equiv 0 \pmod 4$ if and only if both $v_1$ and $v_2$ are even
or both are odd. Otherwise $v_1 \cdot v_2 \equiv 2 \pmod 4$. Therefore, for every even $y \in U$,
P(y) = 0, whereas for every odd $y \in U$, $P(y) \ne 0$. Hence P(y) vanishes on the zero
set of the ideal generated by $y_1^2 - 1, y_2^2 - 1, \ldots, y_n^2 - 1, y_1y_2 \cdots y_n - 1$. By Hilbert's
Nullstellensatz, a power of P belongs to this ideal. It is not too difficult (but a
little tedious) to show that this implies that if the degree of P is less than n/2 then
it vanishes identically on U, contradicting the fact that $P(y) \ne 0$ for every odd $y \in U$.
Thus $\deg P = |V_0| \ge n/2$. Similarly $|V_1| \ge n/2$ and hence $|V| \ge n$, completing the
proof that K(n) = n. A more elementary proof of a somewhat more general result
appears in the above mentioned paper.


7.2. Rédei's theorems on lacunary polynomials over finite fields


Let $f(x) = \sum_{i=0}^{n} a_i x^i$ be a polynomial over a finite field GF(q). It is called lacunary
if it is 0 or a monomial, or if there are j, k satisfying $a_j \ne 0$, $a_k \ne 0$, $j + 2 \le k$ and
$a_{j+1} = \cdots = a_{k-1} = 0$. It is called fully reducible if it is a product of linear factors
over GF(q).

Rédei (1973) developed a theory which enabled him to give a complete
characterization of certain fully reducible lacunary polynomials over finite fields. The
main part of this characterization is a determination of all fully reducible
polynomials $f(x) = \sum_{i=0}^{q} a_i x^i$ over GF(q) that satisfy $f'(x) \ne 0$, $a_q \ne 0$ and $a_i = 0$ for
(q+1)/2 < i < q. Although the full statement of Rédei's theorem is somewhat
complicated, we give it here, as it seems to be important and yet little known.


Theorem 7.2. Let $f(x) = \sum_{i=0}^{q} a_i x^i$ be a fully reducible polynomial over GF(q),
where $q = p^n$ for a prime p > 7, and suppose that $a_q = 1$, $a_i = 0$ for (q+1)/2 < i < q
and $f'(x) \ne 0$. Suppose, further, that $f(x) \ne x^q - x$. Then $a_{(q+1)/2} \ne 0$ and f(x) can
be obtained as follows. Let $\sigma$ be +1 or −1 and let $p = p_0 < p_1 < \cdots < p_k = q$ be
integers satisfying $p_0 \mid p_1 \mid \cdots \mid p_k$ and $(p_0 - 1) \mid (p_1 - 1) \mid \cdots \mid (p_k - 1)$. Let $\alpha_0, \alpha_1, \ldots, \alpha_{k-1}$
be elements of GF(p) satisfying $\chi(\alpha_i) \in \{0, \sigma\}$ for $1 \le i < k$, where $\chi$ is the quadratic
character, and suppose the elements $\rho_i \in GF(p_i)$ are defined, for $0 \le i < k$, by


Po=a, pr = (ay — py) PH V/P-Y, 
po = ((a2 = Py £2 po eas cabs 


Pi = ((- + (Cai — pO W/OD — pO Ord 
Jez WD 


i -1 BH) / (pi) 
Shi nSeyls a : V1) 


and p, € GF(p,) is arbitrary. Define 


Cy = (aa (epee ey a ee bl MOD on 


and choose 7 € {0,1}. 
Then 


xt — Xx 
~ ex)? YP 46 


f(x) (cx)? 2? — at). 

Although this theorem may look too complicated to apply, Rédei gave many
highly nontrivial, fascinating applications of it to several problems in number
theory, group theory and combinatorics. Lovász and Schrijver (1981) gave a short
proof of some of these applications. Amazingly, the basic idea in this proof is just
the simple fact that every function over a finite field is a polynomial. This enables
one to derive combinatorial results by manipulating these polynomials. Using
this idea, Lovász and Schrijver give a short proof of the following result of Rédei.


Theorem 7.3. For a prime p, any set X of p points, not all on a line, in the affine 
plane AG(2, p), determines at least (p +3)/2 directions. (X determines a direction 
if there is a line in this direction containing at least two points of X.)

Blokhuis and Seidel (1985) showed that Wielandt's visibility theorem is an almost
direct consequence of this result. It also has some applications in group factoring.
Let G be a finite abelian group, written additively, and suppose $A_1, A_2, \ldots, A_m$
are subsets of G, each containing 0. We say that G has an $(A_1,\ldots,A_m)$ factoring
and write $G = (A_1, A_2, \ldots, A_m)$ if every element of G is uniquely expressible as
a sum $a_1 + a_2 + \cdots + a_m$, where $a_i \in A_i$. Using Theorem 7.3, one can show that if
$G \cong \mathbb{Z}_p \oplus \mathbb{Z}_p$, where p is a prime and G = (A, B), then either A or B is a subgroup.
Indeed, G is naturally isomorphic to AG(2,p). If |A|, |B| > 1 then |A| = |B| = p.
It is not too difficult to check (see Lovász and Schrijver 1981) that no direction is
determined by both A and B. Hence either A or B determines at most half of the
directions, i.e., fewer than (p+3)/2 directions. By Theorem 7.3 this set is a line, and
since it contains 0, it is a subgroup.

Rédei obtained a far reaching generalization of this result. Using group charac- 
ters, an appropriate factorization of polynomials and the group ring of G over the 
integers, he proved the following. 


Theorem 7.4 (Rédei 1965). If G is a finite abelian group that has an $(A_1,\ldots,A_m)$
factoring, where each $A_i$ has a prime order, then at least one $A_i$ is a subgroup.


This theorem generalizes Hajós' theorem (cf. Fuchs 1967), which is probably the
most dramatic work in factoring groups, and which solved a tiling problem raised
by Minkowski in 1907.

Theorem 7.3 is a special case of a more general result of Rédei (1973, pp. 225,
226), which asserts that the number of directions m determined by a set of q points,
not all on a line, in the affine plane AG(2,q), where $q = p^n$ is a prime power,
satisfies

$$m \ge \frac{q-1}{p^{\lfloor n/2 \rfloor} + 1} + 1. \qquad (7.1)$$

Blokhuis and Brouwer (1986) found a nice way to combine this result with the
Jamison-Brouwer-Schrijver theorem (see Theorem 6.5), and derive a bound for
the size of nontrivial blocking sets in Desarguesian projective planes. Let PG(2,q)
denote the projective plane over GF(q). A blocking set in PG(2,q) is a set that
intersects every line. It is nontrivial if it contains no line. Bruen showed that
any nontrivial blocking set S in PG(2,q) contains at least $q + \sqrt{q} + 1$ points, and
equality holds iff q is a square and S is the set of points of a Baer subplane. He
also noticed, together with Thas, that there is a connection between Rédei's results
and blocking sets. This connection was later applied by Blokhuis and Brouwer to
prove that if $q = p^n > 7$ is a non-square, odd prime power, $q \ne 27$, then any
nontrivial blocking set in PG(2,q) contains more than $q + \sqrt{2q}$ points. The proof is
very short; let S be a minimal, nontrivial blocking set, |S| = q + k. If there exists a
line containing k of these points, then make it the line at infinity. The remaining q
points now block all the lines of the affine plane except in k directions, and hence
they determine at most these k directions. By Rédei's result stated in (7.1),


$$k \ge \frac{q-1}{p^{\lfloor n/2 \rfloor} + 1} + 1 > \sqrt{2q},$$


as needed. Thus, we may assume that no line contains more than k − 1 points of
S. Let v be an arbitrary point of S. By the minimality of S, there exists a tangent
through v (i.e., a line containing only this point from S). Make this line the line
at infinity and observe that the remaining points cover all lines of the affine plane
except the other tangents at this point. By Theorem 6.5, there must be at least
q − k such other tangents. Thus, there are at least q − k + 1 tangents through any
point of S. Since there are no k points of S on a line, there are at most q − 1
tangents through any point not in S. Hence, by counting the incident pairs of the
form (tangent, point not in S) we conclude that


$$(q + k)(q - k + 1)q \le (q^2 - k + 1)(q - 1),$$


which gives $k > \sqrt{2q}$, completing the proof.
For the (odd) prime case q = p, the above estimate has recently been improved
considerably in an elegant paper of Blokhuis (1994) to 3(p+1)/2, which is optimal.


7.3. Hilbert’s basis theorem and Ehrenfeucht’s conjecture in language theory 


For a (finite) alphabet A, let A* denote, as usual, the set of all finite words over
A. For two alphabets A, B, a function f : A* → B* is a morphism if for every
x, y ∈ A*, f(xy) = f(x)f(y), where xy and f(x)f(y) denote here the concatenation
of x, y and that of f(x), f(y), respectively.

Let A be a finite alphabet and let $\mathcal{L} \subseteq A^*$ be an arbitrary language over A.
Ehrenfeucht conjectured that there is always a finite set $F \subseteq \mathcal{L}$ such that for any
alphabet B and for any two morphisms g, h : A* → B*, g(x) = h(x) for all $x \in \mathcal{L}$
if and only if g(x) = h(x) for all x ∈ F. We call such an F a test set for $\mathcal{L}$.

This conjecture has been proved, independently, by Albert and Lawrence, by
McNaughton and by V.S. Guba (cf. Salomaa 1985). All proofs reduce the conjecture
to Hilbert's basis theorem, which is the following.


Theorem 7.5. Every ideal in the polynomial ring $\mathbb{Z}[x_1,\ldots,x_n]$ is finitely generated.
Hence any infinite system S of polynomial equations over $\mathbb{Z}$ is equivalent to some
finite subsystem S' of it (i.e., S and S' have the same solutions in the complex field).


Hilbert’s basis theorem can be proved by a rather simple induction on n (see, 
e.g., Van der Waerden 1931). A special case of it plays an important role in integer 
programming, see chapter 30. 

Ehrenfeucht’s conjecture is reduced to Hilbert’s theorem in two steps, as outlined 
below. 

Step 1. A system of equations $W^{(i)} = {W'}^{(i)}$ (i ∈ I), where $W^{(i)}$ and ${W'}^{(i)}$ are words
in C*, has a solution f if there exists an alphabet D and a morphism f : C* → D*
such that $f(W^{(i)}) = f({W'}^{(i)})$ for all i ∈ I. Two systems of word equations are
equivalent if they have the same solutions. It is not too difficult to reduce Ehrenfeucht's
conjecture to the following statement about word equations.


Statement 7.6. Every system of word equations is equivalent to a finite subsystem 
of it.

Step 2. Statement 7.6 is reduced to Theorem 7.5 by constructing, for any system
E of word equations over an alphabet C, a system S of polynomials such that
every solution of E corresponds to a solution of a certain type of S. (The system
S might have some other solutions, as well.)

Since Step 2 is the crucial part of the proof, let us briefly describe it. The basic idea
is the following. If the alphabet D has n letters, then any word in D* corresponds,
naturally, to the number it represents in base n. If f : C* → D* is a morphism,
then for every word W ∈ C*, f(W), considered as the number it describes, can
be expressed as a polynomial in the numbers f(c) for c ∈ C and the numbers
$n^{\mathrm{length}(f(c))}$, where length(f(c)) is the number of letters in the word f(c). Therefore,
by introducing variables for the 2|C| numbers f(c) and $n^{\mathrm{length}(f(c))}$ for c ∈ C, we can
replace each word equation by two polynomial equations. Being more precise
now, let us introduce, for each letter c ∈ C, two variables $c_1$ and $c_2$. (We will later
substitute f(c) for $c_1$ and $n^{\mathrm{length}(f(c))}$ for $c_2$.) For any word $W = c^1c^2 \cdots c^k \in C^+$
define


$$P_1(W) = c^1_1 c^2_2 c^3_2 \cdots c^k_2 + c^2_1 c^3_2 \cdots c^k_2 + \cdots + c^{k-1}_1 c^k_2 + c^k_1$$
and
$$P_2(W) = c^1_2 c^2_2 \cdots c^k_2. \qquad (7.2)$$


Also, for the empty word Λ, $P_1(\Lambda) = 0$ and $P_2(\Lambda) = 1$. Given the system E of
word equations $W^{(i)} = {W'}^{(i)}$ (i ∈ I), let S be the system of polynomial equations
$P_1(W^{(i)}) = P_1({W'}^{(i)})$ and $P_2(W^{(i)}) = P_2({W'}^{(i)})$ (i ∈ I). By construction, for every
alphabet D of n letters and every morphism f : C* → D*, f is a solution of E if
and only if $c_1 = f(c)$ and $c_2 = n^{\mathrm{length}(f(c))}$ (c ∈ C) is a solution of S. Therefore, the
existence of a finite subsystem of S equivalent to it, which follows from Theorem
7.5, supplies the existence of a finite subsystem of E equivalent to E. For more
details, including the (simple) proof of the equivalence between Ehrenfeucht's
conjecture and Statement 7.6, see Salomaa (1985).
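
The encoding of Step 2 can be illustrated on a toy example. The sketch below is ours, not taken from the cited proofs: the helper names, the alphabets "ab" and "xy", and the two sample morphisms are all assumptions chosen for illustration. Letters of D are read as the digits 0, ..., n−1, and $P_1$, $P_2$ are evaluated by the recurrence $P_1(Wc) = P_1(W)c_2 + c_1$, $P_2(Wc) = P_2(W)c_2$, which unfolds to (7.2).

```python
def word_value(word, D):
    """Return (number represented by `word` in base n, n**len(word)), with n = len(D)."""
    n = len(D)
    value = 0
    for letter in word:
        value = value * n + D.index(letter)
    return value, n ** len(word)

def P1_P2(W, c1, c2):
    """Evaluate (P_1(W), P_2(W)) at the point given by the dicts c1, c2."""
    p1, p2 = 0, 1                      # values for the empty word
    for c in W:                        # appending the letter c
        p1 = p1 * c2[c] + c1[c]
        p2 = p2 * c2[c]
    return p1, p2

def apply_morphism(f, W):
    return "".join(f[c] for c in W)

def substitution(f, D):
    """The point c_1 = f(c), c_2 = n^length(f(c)) associated with the morphism f."""
    c1 = {c: word_value(image, D)[0] for c, image in f.items()}
    c2 = {c: word_value(image, D)[1] for c, image in f.items()}
    return c1, c2

D = "xy"
f = {"a": "xy", "b": "xyxy"}           # f(ab) = f(ba) = "xyxyxy"
g = {"a": "x", "b": "y"}               # g(ab) = "xy" differs from g(ba) = "yx"
for name, h in (("f", f), ("g", g)):
    c1, c2 = substitution(h, D)
    print(name,
          apply_morphism(h, "ab") == apply_morphism(h, "ba"),
          P1_P2("ab", c1, c2) == P1_P2("ba", c1, c2))
```

The second polynomial matters because the numeric value alone does not determine a word: with x read as the digit 0, the words "x" and "xx" both have value 0 and are separated only by $n^{\mathrm{length}}$.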

We note that the decision problem: "Given a (recursively enumerable) language
$\mathcal{L} \subseteq A^*$ and two morphisms g, h : A* → B*, is g(x) = h(x) for all $x \in \mathcal{L}$?" is
undecidable, and thus there is no "constructive" proof of Ehrenfeucht's conjecture
(i.e., a proof that actually produces a finite test set for $\mathcal{L}$ from its description).


8. Hyperbolic geometry and triangulations of polytopes and polygons 


Let P be a 3-dimensional simplicial polytope. Let T(P) denote the minimum
number of tetrahedra, each being the convex hull of four vertices of P, whose
union covers P. For n ≥ 4, let T(n) be max T(P), where the maximum is taken
over all simplicial polytopes P with n vertices.

It is easy to check that for every n > 12, T(n) ≤ 2n − 10. Indeed, a simplicial
3-polytope P on n vertices has 2n − 4 faces and 3n − 6 edges. If n > 12, there is a
vertex v of P incident with at least 6 faces. For each other face f of P, let $S_f$ be
a tetrahedron whose vertices are v and the three vertices of f. These tetrahedra
cover P, and their number is at most 2n − 10.

Sleator et al. (1986) proved that T(n) ≥ 2n − 10 (and hence equals 2n − 10) for
infinitely many values of n. Their interesting proof uses hyperbolic geometry. Here
is an outline of the idea. If one can construct a polytope P and show, somehow,
that the volume of each tetrahedron on 4 of its vertices is at most a fraction
1/t of the volume of P, then the inequality T(P) ≥ t follows. Unfortunately, the
largest t for which the previous statement holds is a constant, independent of the
number of vertices of P. Thus, instead of using the usual Euclidean space $\mathbb{R}^3$, we
embed P in the 3-dimensional hyperbolic space (and observe that any cover of
P by tetrahedra in $\mathbb{R}^3$ corresponds to a cover of the same size of P here). In
this new space, the volume of each tetrahedron is bounded by a constant $C_0$, and
thus we need only construct a polytope P whose volume is at least $t \cdot C_0$. For
$t = 2n - O(\sqrt{n})$ a construction of such a polytope on n vertices is not too difficult. The
reader is referred to Coxeter (1956) for the fundamentals of hyperbolic geometry.
The 3-dimensional hyperbolic space can be viewed as an upper half space whose
boundary is the complex plane, plus a point denoted ∞. A geodesic here is a
semicircle perpendicular to the complex plane, or a line perpendicular to this
plane, that goes to ∞. Any tetrahedron whose base forms an equilateral triangle
on the complex plane and whose fourth vertex is ∞ is a tetrahedron of maximum
volume. Consider a tessellation of the complex plane by equilateral triangles, and
let S be a set of $6k^2$ such triangles whose union is hexagonal, with k edges on each
side. This hexagon has $3k^2 + 3k + 1$ vertices. Let P be the polytope whose vertices
are ∞ and these vertices. Since P is the union of $6k^2$ tetrahedra of maximal
volume, its volume is $6k^2 \cdot C_0$. This shows that $T(3k^2 + 3k + 2) \ge 6k^2$, and hence
that $T(n) \ge 2n - O(\sqrt{n})$. For the sharper estimate T(n) ≥ 2n − 10, see Sleator et
al. (1986).

The problem of covering a polytope by tetrahedra is related to another
interesting combinatorial problem. Let G be a labeled convex polygon with n vertices in
the plane, and consider a planar triangulation of G with no interior vertices. We
call the n sides of G edges, and the chords that divide it into triangles are called
diagonals.

A diagonal flip is an operation that transforms one triangulation of G into
another by removing a diagonal, thus creating a face with four sides, and by inserting
the opposite diagonal of this resulting quadrilateral. The distance $d(\tau_1, \tau_2)$ between
two triangulations $\tau_1$ and $\tau_2$ of G is the minimum number of diagonal flips needed
to transform one into the other. Motivated by a data-structure problem on dynamic
trees, Sleator et al. (1986) considered the problem of determining or estimating
$d(n) = \max d(\tau_1, \tau_2)$, where $\tau_1$ and $\tau_2$ range over all triangulations of a labeled
n-gon. It is easy to see that d(n) ≤ 2n − 10 for all n > 12. Somewhat surprisingly, a
lower bound for d(n), showing that d(n) = 2n − 10 for infinitely many values of n,
can be extracted from the corresponding result for T(n), the maximum value of
the minimum number of tetrahedra needed to cover a convex n-polytope. Here is
an outline of the idea.


Let P be a convex simplicial n-polytope whose graph is Hamiltonian such that
T(P) is as large as possible. (By the Sleator-Tarjan-Thurston result, there is such P
with T(P) = 2n − 10 for infinitely many values of n.) Cut P along the edges of the
Hamilton cycle to obtain two triangulated parts. Denote these two triangulations by
$\tau_1$ and $\tau_2$. We claim that $d(\tau_1, \tau_2) \ge T(P) = 2n - 10$ (and hence $d(\tau_1, \tau_2) = 2n - 10$).
To see this we show that P can be covered by $d(\tau_1, \tau_2)$ tetrahedra. Consider a
sequence of $d(\tau_1, \tau_2)$ diagonal flips that transform $\tau_1$ into $\tau_2$. Imagine a planar
base with triangulation $\tau_1$ drawn on it. Suppose the first diagonal flip replaces the
diagonal (a,c) with the diagonal (b,d). Create a flat quadrilateral with the same
shape as (a,b,c,d). On its back draw the diagonal (a,c) and on its front draw
the diagonal (b,d). Now place this quadrilateral onto the base in the appropriate
place, with the diagonal (a,c) down and (b,d) up. Looking from the top we see
a picture of the triangulation obtained from $\tau_1$ by making the first diagonal flip.
For each successive move we create an additional quadrilateral and place it onto
the base. After placing $d(\tau_1, \tau_2)$ such quadrilaterals we will see $\tau_2$ when we view
the base from the top. We can now inflate each quadrilateral slightly, to make it
into a tetrahedron. The resulting stack of quadrilaterals forms a covering of P by
$d(\tau_1, \tau_2)$ tetrahedra, as needed. For more details and several other related results
the reader is referred to Sleator et al. (1986).
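
For very small polygons the flip distance can be computed by exhaustive search. The following is a minimal sketch under our own conventions (vertices labeled 0, ..., n−1, a triangulation stored as a frozenset of diagonals, hypothetical function names); it runs a breadth-first search from every triangulation, so it is usable only for small n and says nothing about the regime n > 12 where the bound 2n − 10 applies.

```python
from collections import deque

def edge(i, j):
    return (i, j) if i < j else (j, i)

def flips(tri, n):
    """All triangulations obtained from `tri` (a frozenset of diagonals) by one flip."""
    present = tri | {edge(i, (i + 1) % n) for i in range(n)}   # diagonals and sides
    result = []
    for (a, c) in tri:
        # The two triangles on the diagonal (a, c) have apexes b and d.
        b, d = [v for v in range(n) if v not in (a, c)
                and edge(a, v) in present and edge(v, c) in present]
        result.append((tri - {(a, c)}) | {edge(b, d)})
    return result

def distances_from(start, n):
    """Breadth-first search in the flip graph; maps each triangulation to its distance."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        t = queue.popleft()
        for s in flips(t, n):
            if s not in dist:
                dist[s] = dist[t] + 1
                queue.append(s)
    return dist

def fan(n):
    """The triangulation using all diagonals from vertex 0."""
    return frozenset(edge(0, i) for i in range(2, n - 1))

def flip_diameter(n):
    """d(n): the maximum flip distance between two triangulations of the n-gon."""
    everything = distances_from(fan(n), n)
    return max(max(distances_from(t, n).values()) for t in everything)

for n in range(4, 9):
    print(n, flip_diameter(n))
```

The search relies on the fact that in a triangulation of a convex polygon every triangle of the underlying graph is a face, so each diagonal has exactly two apexes.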


9. The Erdős-Moser conjecture and the hard Lefschetz theorem


For a finite subset S of $\mathbb{R}$, and for $k \in \mathbb{R}$, let f(S,k) denote the number of subsets
of S whose elements sum to k. Erdős and Moser conjectured, in 1965, that for
every set S of 2n + 1 distinct real numbers, and any k,


$$f(S,k) \le f(\{-n, -n+1, \ldots, n\}, 0). \qquad (9.1)$$


Similarly, it was conjectured that for every set T of n distinct positive numbers
and any k


$$f(T,k) \le f(\{1, 2, \ldots, n\}, \lfloor n(n+1)/4 \rfloor). \qquad (9.2)$$


Both (9.1) and (9.2) follow from the results of Stanley (1980) (see also Stanley
1983). Surprisingly, Stanley's results depend on some deep results from algebraic
geometry and in particular on the hard Lefschetz theorem, stated in chapter 34.
A somewhat more elementary, similar proof was given later by Proctor (1982),
whose proof involved representations of the Lie algebra $\mathfrak{sl}(2, \mathbb{C})$. However, there
is no known purely combinatorial proof.
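
Both inequalities are easy to test numerically on small integer sets. The sketch below is only a sanity check, not a proof: the function names, the random test sets and the fixed seed are our own choices. It counts subset sums by dynamic programming and compares the largest count with the right-hand sides of (9.1) and (9.2).

```python
import random

def subset_sum_counts(S):
    """Map each value k to f(S, k), the number of subsets of S whose elements sum to k."""
    counts = {0: 1}                              # the empty subset
    for s in S:
        updated = dict(counts)
        for total, c in counts.items():
            updated[total + s] = updated.get(total + s, 0) + c
        counts = updated
    return counts

def f(S, k):
    return subset_sum_counts(S).get(k, 0)

def check_9_1(S):
    """S: 2n+1 distinct integers. Checks max_k f(S,k) <= f({-n,...,n}, 0)."""
    n = (len(S) - 1) // 2
    return max(subset_sum_counts(S).values()) <= f(range(-n, n + 1), 0)

def check_9_2(T):
    """T: n distinct positive integers. Checks max_k f(T,k) <= f({1,...,n}, floor(n(n+1)/4))."""
    n = len(T)
    return max(subset_sum_counts(T).values()) <= f(range(1, n + 1), n * (n + 1) // 4)

rng = random.Random(0)                           # fixed seed, for reproducibility only
trials = [(rng.sample(range(-40, 40), 9), rng.sample(range(1, 60), 8))
          for _ in range(200)]
print(all(check_9_1(S) and check_9_2(T) for S, T in trials))
```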

To prove (9.2) it is useful to define the following partially ordered set M(n).
The elements of M(n) are all ordered sets of integers $(a_1, a_2, \ldots, a_k)$ where
$n \ge a_1 > a_2 > \cdots > a_k \ge 1$, and $(a_1,\ldots,a_k) \ge (b_1,\ldots,b_j)$ if $k \ge j$ and $a_1 \ge b_1, \ldots, a_j \ge b_j$.
Put $M(n)_r = \{(a_1,\ldots,a_k) \in M(n) : \sum a_i = r\}$ and notice that $|M(n)_r| =
f(\{1,2,\ldots,n\}, r)$. Define also $N = \binom{n+1}{2}$. An easy lemma, first observed by
Lindström, states that if $M(n)_{\lfloor N/2 \rfloor}$ is the biggest antichain of M(n), then (9.2) holds.
Stanley proved that $M(n)_{\lfloor N/2 \rfloor}$ is the biggest antichain of M(n) by showing that for
every $0 \le i < \lfloor N/2 \rfloor$ there exist $|M(n)_i|$ pairwise disjoint chains $x_i < x_{i+1} < \cdots < x_{N-i}$
in M(n), where $x_j \in M(n)_j$. The proof uses the linear algebra method, whose many
applications in combinatorics are described in chapter 31. However, the
construction of the necessary linear mappings is highly nontrivial. We construct linear
transformations $g_i : V_i \to V_{i+1}$ for $0 \le i < N$, where $V_i$ is the complex vector space with
basis $M(n)_i$, such that for $0 \le i < \lfloor N/2 \rfloor$, $g_{N-i-1} \circ g_{N-i-2} \circ \cdots \circ g_i : V_i \to V_{N-i}$ is
invertible and for $x \in M(n)_i$ and $g_i(x) = \sum \{c_y \cdot y : y \in M(n)_{i+1}\}$, $c_y \ne 0$ implies
y > x. This, in turn, supplies the existence of the desired pairwise disjoint chains
in M(n).

The existence of these mappings is established using the hard Lefschetz theorem,
stated in chapter 34. For more details and more general results see Stanley (1980).
Several other fascinating combinatorial applications of the hard Lefschetz theorem
appear in Stanley (1983) and some of its references.


1. Elementary methods


We examine a methodology for proving the existence of combinatorial
configurations having certain desired properties. In its basic form a probability space is
created whose elements are configurations. It is shown that the probability that a
configuration does not have the desired property is less than unity, generally by
showing that it is extremely small. With positive probability the configuration is
good. Hence, some good configuration exists. The methodology is best described
by example. We often give suboptimal results in order to better illustrate the
methodology, only making reference to the best-known results.

My joint books with Paul Erdős (Erdős and Spencer 1974) and Noga Alon
(Alon and Spencer 1992) and my previous survey paper (Spencer 1978) and
monograph (Spencer 1994) deal also with these topics in some detail.

We begin with a simple probability fact which we shall call the counting sieve: 


If $\sum \Pr[A_i] < 1$, then $\bigwedge \overline{A_i} \ne \emptyset$.


Here the summation and conjunction range over the same index set. It is 
surprising what results may be obtained from this simple principle. 

The Ramsey function R(k,t) is the minimal n such that if the edges of $K_n$ are
colored Red and Blue then there exists either a Red $K_k$ or a Blue $K_t$ (see chapter
25). The lower bound R(k,t) > n thus means that there exists a two-coloring of
$K_n$ such that neither type of monochromatic subgraph exists. Erdős (1947)
inaugurated the modern use of the probabilistic method with the following lower
bound on the diagonal function R(k,k).


Theorem 1.1. If $\binom{n}{k} 2^{1-\binom{k}{2}} < 1$, then R(k,k) > n.


Proof. Define a probability space on the set of two-colorings of $K_n$ by assuming
that each edge is colored Red with probability 1/2 (otherwise Blue) and that these
probabilities are mutually independent. (We may consider this a Gedanken
experiment in which a fair coin is thrown to determine the color of each edge.)
Call a coloring Bad if some k-set is monochromatic, otherwise Good. For each set
S of k vertices let $A_S$ be the event that S is monochromatic. As S has $\binom{k}{2}$ edges
and the corresponding coin flips must all be the same, $\Pr[A_S] = 2^{1-\binom{k}{2}}$. There are
$\binom{n}{k}$ such S, so our hypothesis and the counting sieve imply that the event $\bigwedge \overline{A_S}$ is
nonempty. The event $\bigwedge \overline{A_S}$ means precisely that the coloring is Good. Hence
there is a Good coloring. Hence R(k,k) > n. □


Nearly all applications of the probabilistic method deal with the asymptotic
behavior of combinatorial functions. It is essential to have a good feel for
asymptotic calculations (see chapter 22). For example, consider finding that n for
which $\binom{n}{k} 2^{1-\binom{k}{2}}$ is roughly unity. A "seat of the pants" calculation would be to
estimate $\binom{n}{k}$ by $n^k$ and $\binom{k}{2} - 1$ by $k^2/2$, so that the critical n is roughly when
$n^k 2^{-k^2/2} = 1$, $n = 2^{k/2}$. This is the first term for the lower bound for R(k,k). The
upper bound (which is not probabilistic) for R(k,k) is roughly $4^k$. A major, and
thus far intractable, problem is to find that c for which R(k,k) is roughly $c^k$.
A more precise calculation using Stirling's Formula gives


$$R(k,k) > \frac{k}{e\sqrt{2}}\, 2^{k/2} (1 + o(1)).$$


A calculation with k = 100 shows $R(100,100) > 3 \times 10^{16}$. A strong threshold
behavior is observed if we decrease n slightly, say $n = 2.5 \times 10^{16}$. A coloring is
then Bad with probability less than $\binom{n}{100} 2^{1-\binom{100}{2}} < 10^{-7}$. A coloring created
randomly on $2.5 \times 10^{16}$ vertices will almost certainly be Good.
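
The numbers quoted for k = 100 can be reproduced with exact integer arithmetic. The following sketch is ours (the function names are hypothetical); it tests the hypothesis of Theorem 1.1 exactly, evaluates the logarithm of the bound on the probability of a Bad coloring, and locates the largest n satisfying the hypothesis by bisection, using the fact that $\binom{n}{k}$ is increasing in n.

```python
from math import comb, log10

def hypothesis_holds(n, k):
    """Exact test of C(n,k) * 2^(1 - C(k,2)) < 1."""
    return 2 * comb(n, k) < 2 ** comb(k, 2)

def log10_bad_bound(n, k):
    """log10 of C(n,k) * 2^(1 - C(k,2)), an upper bound on Pr[coloring is Bad]."""
    return log10(comb(n, k)) + (1 - comb(k, 2)) * log10(2)

def largest_n(k):
    """Largest n satisfying the hypothesis (for k >= 3), by doubling and bisection."""
    lo, hi = k, 2 * k
    while hypothesis_holds(hi, k):
        lo, hi = hi, 2 * hi
    while hi - lo > 1:                 # invariant: holds at lo, fails at hi
        mid = (lo + hi) // 2
        if hypothesis_holds(mid, k):
            lo = mid
        else:
            hi = mid
    return lo

print(hypothesis_holds(3 * 10**16, 100))
print(log10_bad_bound(25 * 10**15, 100))
print(largest_n(100))
```

For tiny k the bound is weak: the hypothesis gives only R(4,4) > 6, whereas the strength of the method shows in the exponential growth for large k.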

The properties of random configurations make for fascinating study in their
own right, involving a challenging mix of probabilistic and combinatorial
techniques. These are described in chapter 6. While our methodologies overlap to a
considerable degree, in this survey attention is restricted to use of probabilistic
methods in order to prove the existence of certain configurations.


Warning. The counting sieve does not work in reverse. Generally other methods 
must be used to prove the nonexistence of a configuration. In particular, even the 
existence of an upper bound on R(k,k)—i.e., a proof of Ramsey’s Theorem — 
does not seem to follow from probabilistic considerations. 


Theorem 1.2. If there exists p, $0 \le p \le 1$, such that


$$\binom{n}{k} p^{\binom{k}{2}} + \binom{n}{t} (1-p)^{\binom{t}{2}} < 1,$$


then R(k,t) > n.


Proof. We adjust the probability space used in the proof of Theorem 1.1. The
space is again the set of two-colorings of $K_n$ but now each edge is colored Red
with probability p. For each k-set S let $A_S$ be the event that all edges of S are
colored Red, and for each t-set T let $B_T$ be the event that all edges of T are
colored Blue. Our assumption and the counting sieve imply the existence of a
two-coloring for which no $A_S$ and no $B_T$ hold. Hence R(k,t) > n. □

As an example, when k = 4 we may take $p = n^{-2/3}$ and $n = t^{3/2+o(1)}$. The
hypothesis of Theorem 1.2 is then satisfied so that $R(4,t) > t^{3/2+o(1)}$. Asymptotic
bounds on R(k,t) for fixed k have been examined by the present author (Spencer
1977). We will return to the special case k = 3 in section 2.

A tournament on n players consists of the results of $\binom{n}{2}$ matches, one between
every pair of players, in which there are no draws. A tournament is said to have
property $S_k$ if for every set of k players there is some other player who beats them
all. Do there exist for arbitrarily large k tournaments with property $S_k$? An
affirmative answer was given by Erdős (1963a). Consider a random tournament
on n players in which the outcome of each match is decided by the toss of a fair
coin. For each set P of k players let $A_P$ be the event that no player outside of P
beats all players in P. Any $y \notin P$ has probability $2^{-k}$ of beating all players in P.
These events are mutually independent over y as they involve different games.
Hence $\Pr[A_P] = (1 - 2^{-k})^{n-k}$. By the counting sieve: if $\binom{n}{k}(1 - 2^{-k})^{n-k} < 1$, then
$\bigwedge \overline{A_P} \ne \emptyset$. But this is precisely the event that the tournament has property $S_k$. For
k fixed, $\binom{n}{k}(1 - 2^{-k})^{n-k}$ approaches zero in n so that for n sufficiently large there
exist tournaments with property $S_k$.
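
To see how large "sufficiently large" is, one can search for the first n at which the sieve condition holds. This is a small sketch with names of our own choosing; it uses exact rational arithmetic, and the value it returns is only the point where this particular argument starts to work, not the smallest tournament with property $S_k$.

```python
from fractions import Fraction
from math import comb

def sieve_value(n, k):
    """C(n,k) * (1 - 2^-k)^(n-k), an upper bound on Pr[the tournament lacks S_k]."""
    return comb(n, k) * (1 - Fraction(1, 2 ** k)) ** (n - k)

def first_n(k):
    """The first n > k at which the counting sieve guarantees a tournament with S_k."""
    n = k + 1
    while sieve_value(n, k) >= 1:
        n += 1
    return n

for k in range(1, 5):
    print(k, first_n(k))
```

For k = 1 the search stops at n = 3, matching the cyclic tournament on three players, in which every player is beaten by someone.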

A family $\mathcal{F} = \{S_1, \ldots, S_r\}$ of sets is 2-colorable (see chapter 7) if there exists a
two-coloring of the underlying points so that no set is monochromatic. Assume
each set $S_i$ has precisely n elements. Erdős (1963b) proved that if the number of
sets $r < 2^{n-1}$, then $\mathcal{F}$ is 2-colorable. To show this, consider a random coloring of
the underlying points. Let $A_i$ be the event that $S_i$ is monochromatic so that
$\Pr[A_i] = 2^{1-n}$. With $r < 2^{n-1}$ the counting sieve gives $\bigwedge \overline{A_i} \ne \emptyset$. This is precisely
the event that no set is monochromatic.

Erdős (1964) defined m(n) as the minimal size of a family of n-sets which is not
2-colorable. [In chapter 7 this is denoted by #,(7).] It is somewhat surprising, and 
unusual, that an upper bound to this combinatorial function can also be given by
probabilistic means. Essentially, the sets now become random and each coloring
gives an event. Let $S_1, \ldots, S_r$ be randomly selected n-sets from {1,...,v}.
(That is, each $S_i$ is selected independently and uniformly from among the $\binom{v}{n}$
n-sets.) Let $\chi$ be a two-coloring of {1,...,v}. Assume first that $\chi$ has v/2 Red
and v/2 Blue points. For such a given $\chi$ any particular $S_i$ will be monochromatic
with probability $2\binom{v/2}{n}/\binom{v}{n}$, as it is monochromatic if and only if it is either a
subset of the Red points or a subset of the Blue points. Since the $S_i$ are selected
independently, the probability that no $S_i$ is monochromatic is $[1 - 2\binom{v/2}{n}/\binom{v}{n}]^r$.
Let $A_\chi$ be the event that under coloring $\chi$ no $S_i$ is monochromatic. For any $\chi$,


$$\Pr[A_\chi] \le \Big[1 - 2\binom{v/2}{n} \Big/ \binom{v}{n}\Big]^r$$


as it is easy to see that $\Pr[A_\chi]$ is maximized when $\chi$ is balanced. There are $2^v$
colorings. The event $\bigwedge \overline{A_\chi}$, conjunction over the $2^v$ colorings, is precisely that
$\{S_1, \ldots, S_r\}$ is not 2-colorable. Thus $\bigwedge \overline{A_\chi} \ne \emptyset$ if and only if there is a family $\mathcal{F}$
of r n-sets on v points which is not 2-colorable. Applying the counting sieve: if,
for any v, $2^v [1 - 2\binom{v/2}{n}/\binom{v}{n}]^r < 1$, then $m(n) \le r$.

We now adjust v so as to give the best upper bound r on m(n). This problem is
fairly typical of those encountered when analyzing the asymptotics of the
probabilistic method. While our language is informal, the argument may be made
rigorous. For small ε we may estimate 1 − ε by $e^{-\varepsilon}$. With $\binom{v/2}{n}/\binom{v}{n}$ small we may
take


$$r \sim (v \ln 2) \Big/ \Big[ 2\binom{v/2}{n} \Big/ \binom{v}{n} \Big].$$


When v is large, $2\binom{v/2}{n}/\binom{v}{n}$ is roughly $2^{1-n}$. (Given v/2 Red and v/2 Blue balls,
$2\binom{v/2}{n}/\binom{v}{n}$ is the probability that a random n-set is monochromatic, whereas $2^{1-n}$
is the probability that n points independently selected would have the same color.
In probability language, the distinction is between sampling without replacement
(the n-set) and sampling with replacement (n points), and for large v the
distinction is negligible.) This approximation would yield $r \sim v 2^{n-1} (\ln 2)$. As v
gets smaller, however, the approximation becomes less accurate and, as we wish
to minimize r, the tradeoff becomes essential. We use a second-order
approximation


$$\binom{v/2}{n} \Big/ \binom{v}{n} = 2^{-n} \prod_{i=0}^{n-1} \frac{v - 2i}{v - i} \sim 2^{-n} e^{-n^2/2v}$$


(estimating $(v - 2i)/(v - i) \sim 1 - i/v \sim e^{-i/v}$) so that $r \sim v 2^{n-1} (\ln 2) e^{n^2/2v}$.
Elementary calculus gives $v = n^2/2$ for the optimal value. We combine both bounds with
the expression


$$2^{n-1} \le m(n) < (1 + o(1)) \frac{e \ln 2}{4}\, n^2 2^n.$$


Written in the above form the asymptotic behavior of m(n) appears essentially 
solved. This is unfortunate for, as we shall soon see, when viewed from a
probabilistic standpoint the gap is indeed wide. 
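
The choice of $v$ can also be explored numerically. The following is a minimal sketch, not part of the original argument: the function names `sieve_bound` and `best_ground_set_size` are hypothetical, and the search range for $v$ is an arbitrary assumption. It evaluates $r = v \ln 2 / [2\binom{v/2}{n}/\binom{v}{n}]$ exactly for even $v$ and reports the minimizing $v$, which for moderate $n$ should be of the same order as the asymptotic optimum $n^2/2$.

```python
from math import comb, log, e

def sieve_bound(n, v):
    """r = v ln 2 / p with p = 2*C(v/2, n)/C(v, n); v is assumed even and >= 2n."""
    p = 2 * comb(v // 2, n) / comb(v, n)
    return v * log(2) / p

def best_ground_set_size(n):
    """Even v in [2n, 4n^2] minimizing sieve_bound (the range is an assumption)."""
    return min(range(2 * n, 4 * n * n + 1, 2), key=lambda v: sieve_bound(n, v))

if __name__ == "__main__":
    n = 20
    v = best_ground_set_size(n)
    asymptotic = (e * log(2) / 4) * n * n * 2 ** n
    print("best v:", v, " n^2/2:", n * n // 2)
    print("r / asymptotic value:", sieve_bound(n, v) / asymptotic)
```

For small $n$ the minimizing $v$ differs somewhat from $n^2/2$ because of lower-order terms that the asymptotic estimate ignores.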

Our second basic weapon is called linearity of expectation. It is, quite simply,
that if $X_1,\ldots,X_n$ are random variables, then

\[ E[X_1 + \cdots + X_n] = E[X_1] + \cdots + E[X_n] . \]

The strength of this property lies in the fact that it remains valid even if the $X_i$ are
dependent on each other. Consider the hat-check girl problem: if $n$ hats are
distributed at random to $n$ men, how many get their own hat? Let $X_i$ be the
indicator random variable for the event that the $i$th man gets his own hat back
(i.e., $X_i = 1$ if he does, 0 if he does not) and set $X = X_1 + \cdots + X_n$, so that $X$ is
the number of men getting their own hat back. The expectation of an indicator
random variable is simply the probability of the associated event, so $E[X_i] = 1/n$.
By linearity of expectation, $E[X] = 1$. On the average, one man will get his own
hat back. Of course, this argument tells us nothing about the distribution of $X$
(which in this case is roughly Poisson with mean 1) but often the expectation
alone is sufficient for our needs.

Given a round-robin tournament (as defined earlier) on $n$ players, a Hamiltonian
path is defined as a permutation $P(1),\ldots,P(n)$ of the players so that for
$1 \le i \le n-1$, $P(i)$ beats $P(i+1)$. Szele (1943), in arguably the first use of
probabilistic methods, asked for the maximal number of Hamiltonian paths a
tournament could have. Let $X$ be the number of Hamiltonian paths in a random
tournament. Let $X_P$ be the indicator random variable of the event that permutation
$P$ generates a Hamiltonian path. Then $X = \sum X_P$, the summation over all $n!$
permutations $P$. For any given $P$, $E[X_P] = 2^{-(n-1)}$, since that is the probability
that the $n-1$ games $P(i)$ versus $P(i+1)$ all have the desired outcome. By
linearity of expectation, $E[X] = n!\,2^{-(n-1)}$. A random variable cannot be always
less than its expectation. There must be some point in the probability space with
$X \ge n!\,2^{-(n-1)}$. That is, there is a tournament with at least $n!\,2^{-(n-1)}$ Hamiltonian
paths.
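
For very small $n$ the averaging argument can be checked by brute force. The sketch below is an illustration only (the function name `hamiltonian_path_counts` is hypothetical): it enumerates all $2^{\binom{n}{2}}$ tournaments on $n$ players, counts Hamiltonian paths in each, and compares the average with $n!\,2^{-(n-1)}$.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
from math import factorial

def hamiltonian_path_counts(n):
    """Number of Hamiltonian paths in every tournament on n labelled players."""
    pairs = list(combinations(range(n), 2))
    counts = []
    for outcome in product((0, 1), repeat=len(pairs)):
        # beats contains (a, b) exactly when player a beats player b
        beats = {(i, j) if bit else (j, i) for (i, j), bit in zip(pairs, outcome)}
        counts.append(sum(
            all((P[i], P[i + 1]) in beats for i in range(n - 1))
            for P in permutations(range(n))
        ))
    return counts

if __name__ == "__main__":
    n = 4
    counts = hamiltonian_path_counts(n)
    print("average:", Fraction(sum(counts), len(counts)))
    print("n!/2^(n-1):", Fraction(factorial(n), 2 ** (n - 1)))
    print("maximum:", max(counts))
```

Since the maximum is at least the average, some tournament attains at least $n!\,2^{-(n-1)}$ paths; the enumeration is feasible only for a handful of players.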

Linearity of expectation is often used for indicator random variables of events
of equal probability. Let $n$ events each of probability $p$ be given and let $X$ be the
number of these events that occur. Then $E[X] = np$. Thus there is some point in
the probability space for which at most (and, similarly, at least) $np$ of the events
occur. When $np < 1$ we may consider the counting sieve as a corollary. We
emphasize again that the $n$ events need not be independent for these deductions.

Let us return to 2-colorability. Fix a family $\mathcal{F} = \{S_1,\ldots,S_r\}$ of sets, each of
cardinality $n$, and color the underlying points randomly. Let $X_i$ be the indicator
random variable for $S_i$ being monochromatic. Let $X = X_1 + \cdots + X_r$, i.e., the
number of monochromatic sets. By linearity of expectation $E[X] = r 2^{1-n}$. When
$r < 2^{n-1}$, reformulating Erdős' argument, $E[X] < 1$ and hence $X = 0$ is not a null
event. The inverse does not hold. When $r = 2^{n-1}$, $E[X] = 1$, but this, by itself, still
allows the possibility that $X = 0$ sometimes; if, say, $E[X] = 100$ and $X$ had "a lot
of distribution" one would naturally expect $X = 0$ to occur with positive probability.
We may rephrase determination of $m(n)$ as finding the maximal $w$ so that, in
this particular instance, $E[X] < w$ implies $\Pr[X = 0] > 0$. Then $1 \le w \le cn^2$, the
gap is indeed wide. In section 2 we give a more advanced technique due to Beck
(1978) which will show $w > cn^{1/3}$.

The following result plays a central role in extremal set theory (see chapter 24). 


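Theorem 1.3 (Yamamoto's inequality). Suppose $\mathcal{F} \subset 2^{[n]}$ is an antichain. Then

\[ \sum_{F \in \mathcal{F}} 1 \Big/ \binom{n}{|F|} \le 1 . \]


Proof. Select a permutation $\sigma$ uniformly from $S_n$ and let $\mathcal{C}_\sigma$ be the chain it
generates. That is,

\[ \mathcal{C}_\sigma = \{ \{\sigma(j) : 1 \le j \le i\} : 0 \le i \le n \} . \]

Let $X_F$ be the indicator random variable for the event $F \in \mathcal{C}_\sigma$ and set

\[ X = \sum_{F \in \mathcal{F}} X_F = |\mathcal{C}_\sigma \cap \mathcal{F}| . \]

The chain $\mathcal{C}_\sigma$ contains exactly one set of size $|F|$, uniformly distributed among all
such sets. Hence

\[ E[X_F] = \Pr[F \in \mathcal{C}_\sigma] = 1 \Big/ \binom{n}{|F|} \]

and by linearity of expectation

\[ E[X] = \sum_{F \in \mathcal{F}} 1 \Big/ \binom{n}{|F|} . \]

If Yamamoto's inequality fails, then $E[X] > 1$ so that there exists a specific $\sigma$ for
which $X \ge 2$, i.e., $|\mathcal{C}_\sigma \cap \mathcal{F}| \ge 2$. But this would imply that $\mathcal{F}$ is not an antichain. □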


Certainly Yamamoto's inequality may be proven without reference to probabilistic
ideas. Indeed, probabilistic results may always be phrased as counting
arguments. Oftentimes (including, arguably, the above example) the probabilistic
framework provides key insight into the core of the result.

Our third basic weapon is the bounding of large deviations. We require results 
of a purely probabilistic nature which, for convenience, are gathered in the 
Appendix, which is self-contained. 

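Let $\mathcal{F}$ be a family of sets with underlying point set $\Omega$. A two-coloring will refer
to a map $\chi : \Omega \to \{-1, +1\}$. For any $S \in \mathcal{F}$ we set

\[ \chi(S) = \sum_{i \in S} \chi(i) , \]

so that, for example, $\chi(S) = 0$ means $S$ is colored equally $(+1)$ and $(-1)$. The
discrepancy of $\chi$ is defined by

\[ \mathrm{disc}(\chi, \mathcal{F}) = \max_{S \in \mathcal{F}} |\chi(S)| \]

and the discrepancy of $\mathcal{F}$, denoted by $\Delta(\mathcal{F})$, is given by

\[ \Delta(\mathcal{F}) = \min \mathrm{disc}(\chi, \mathcal{F}) , \]

the minimum over all two-colorings $\chi$. Thus $\Delta(\mathcal{F}) \le K$ means that it is possible to
two-color $\Omega$ so that all $S \in \mathcal{F}$ are "balanced" within $K$. See chapter 27 for a
general discussion of discrepancy.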

Now suppose $\mathcal{F}$ consists of $n$ sets on an underlying set $\Omega$ of $n$ points. We use
large deviations to bound $\Delta(\mathcal{F})$. Consider a random two-coloring $\chi$ and for each
$S \in \mathcal{F}$ let $A_S$ be the event that $|\chi(S)| > \alpha$. When $S$ has $m$ elements, $\chi(S)$ has
distribution $S_m$, so, by Corollary A.2,

\[ \Pr[A_S] < 2e^{-\alpha^2/2m} \le 2e^{-\alpha^2/2n} . \]

When $2n e^{-\alpha^2/2n} \le 1$, the counting sieve implies $\bigwedge \overline{A_S} \ne \emptyset$: there is a $\chi$ so that all
$|\chi(S)| \le \alpha$ and so $\Delta(\mathcal{F}) \le \alpha$. Solving for $\alpha$,

\[ \Delta(\mathcal{F}) \le \sqrt{2n \ln(2n)} . \]

This result may be "reversed", analogously to the $m(n)$ bounds, to show the
existence of an $\mathcal{F}$ with $n$ sets on $n$ elements with $\Delta(\mathcal{F}) > c\sqrt{n}$, $c$ an absolute
constant. An improvement on the upper bound is given in section 2.
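
The bound is an existence statement, but a few random colorings usually already meet it. The sketch below illustrates this under stated assumptions: the family is itself generated at random (each point placed in each set with probability 1/2), and the helper names `discrepancy_of_coloring` and `best_random_coloring` are hypothetical.

```python
import math
import random

def discrepancy_of_coloring(family, coloring):
    """max over S in family of |chi(S)|, where chi(S) sums coloring over S."""
    return max(abs(sum(coloring[x] for x in S)) for S in family)

def best_random_coloring(family, n, trials, rng):
    """Try several uniformly random +-1 colorings and keep the most balanced one."""
    best = None
    for _ in range(trials):
        chi = [rng.choice((-1, 1)) for _ in range(n)]
        d = discrepancy_of_coloring(family, chi)
        if best is None or d < best[0]:
            best = (d, chi)
    return best

if __name__ == "__main__":
    n = 64
    rng = random.Random(0)
    # Assumed test family: n random subsets of {0, ..., n-1}.
    family = [[x for x in range(n) if rng.random() < 0.5] for _ in range(n)]
    d, _ = best_random_coloring(family, n, trials=20, rng=rng)
    print("discrepancy found:", d, " bound:", math.sqrt(2 * n * math.log(2 * n)))
```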

For the next example, we return again to tournaments. Given a (round-robin)
tournament $T$ on $n$ players and an ordering $P(1),\ldots,P(n)$ of the players, the fit
$f(T, P)$ is the number of pairs of players $i, j$, with $i < j$, for which $P(i)$ beats $P(j)$. That is,
the fit is the number of games for which the ordering is "correct". Set $m = \binom{n}{2}$ for
convenience. If $T$ is transitive (has a natural order), then $f(T, P) = m$ for that
ordering. Let $g(T)$ equal the maximum of $f(T, P)$ over all $P$, i.e., the best fit. For
any $T$ we may try an arbitrary $P$ and its reverse. One of them will agree with $T$ in
at least half the games. Hence $g(T) \ge m/2$. Erdős and Moon (1965) showed that
this is nearly best possible. Consider a random tournament $T$. For each $P$, $f(T, P)$
has binomial distribution $B(m, \frac{1}{2})$ since each game has independent probability $\frac{1}{2}$
to agree with the ordering $P$. Thus $2f(T, P) - m$ has distribution $S_m$. Let $A_P$ be
the event that $f(T, P) \ge m/2 + \alpha$. Then, by Corollary A.2,

\[ \Pr[A_P] = \Pr[S_m \ge 2\alpha] < e^{-4\alpha^2/2m} . \]

We desire $\bigwedge \overline{A_P} \ne \emptyset$, but we have an "enormous" number, $n!$, of $P$. Fortunately,
the tail of the distribution $S_m$ decreases very quickly. We set $\alpha = n^{3/2} (\ln n)^{1/2}/2$ so
that $\Pr[A_P] < n^{-n} < (n!)^{-1}$. (This use of what normally would be considered the
extreme tail of the distribution is quite common.) There exists $T$ on $n$ players with

\[ g(T) < \frac{1}{2} \binom{n}{2} + n^{3/2} (\ln n)^{1/2}/2 . \]

For the mathematician, even combinatorialist, not familiar with probabilistic
methods a much weaker result can be quite impressive: there exist tournaments
which cannot be ranked so that more than 51% of the games are in order.
Construction of specific tournaments with this property appears to be quite
difficult and quite possibly, in this author's opinion, impossible. (This leads,
however, into a logical thicket around the terms "construction" and "specific" in
which we shall not become embroiled!) Let us set $h(n)$ equal to the minimum of
$g(T)$ over all tournaments $T$ on $n$ players so that we have shown

\[ \frac{1}{2} \binom{n}{2} \le h(n) \le \frac{1}{2} \binom{n}{2} + \frac{1}{2} n^{3/2} (\ln n)^{1/2} . \]

From the probabilistic standpoint the $\frac{1}{2}\binom{n}{2}$ term may be regarded as the "zero
term" and so this inequality could certainly stand improvement, as indeed it will
get in section 4.


2. Advanced methods 


In section 1 we created probability spaces and showed the existence of a good
point (configuration, coloring, tournament) by showing that the bad points have
measure less than one. By moving slightly away from the threshold value we find
that the bad points are negligible in measure so that a randomly selected point
will almost surely be good. We consider a method "advanced" if it enables us to
find "rare" points, i.e., if it works even when the set of good points is very small
in measure. An advanced method allows us to find (or at least to prove the
existence of) a needle in a haystack, an elementary method shows us the hay. As
with customary mathematical usage, advanced methods can be quite simple and
elementary methods most ingenious. For the latter consider de la Vega's
improvement, given in section 4, of the ranking of tournaments function $h(n)$. For
the former ... read on.

The deletion method takes a random configuration which is not good but is bad
in only a few places. An alteration, generally a deletion, is made at each bad
place until the object, now generally smaller, has the desired property. Consider a
random two-coloring of the edges of $K_n$ and let $X$ denote the number of
monochromatic $k$-sets. Then $E[X] = \binom{n}{k} 2^{1-\binom{k}{2}}$ by linearity of expectation. To
bound the Ramsey function $R(k, k)$ by elementary methods we took $n$ such that
$E[X] < 1$. Now take $n$ larger so that $E[X] = w > 1$. There is a coloring with at
most $w$ monochromatic $k$-sets. Select one vertex arbitrarily from each such $k$-set
and delete it from the vertex set. What remains is at least $n - w$ vertices with no
monochromatic $k$-set and hence $R(k, k) > n - w$. The value $n \sim (1/e\sqrt{2}) k 2^{k/2}$
gave $w \sim 1$. We may increase $n$ by a factor of nearly $\sqrt{2}$ and still have $w \ll n$. The
monochromatic $k$-sets are then small in number and their deletion still leaves
roughly $n$ vertices. Thus

\[ R(k, k) > \frac{1}{e} k 2^{k/2} (1 + o(1)) . \]


Let $G$ be a graph with $n$ vertices and $nt/2$ edges, $t \ge 1$. Let $\alpha(G)$ denote the
maximal size of an independent set. Turán's Theorem (see chapter 22) gives
$\alpha(G) \ge n/(t+1)$. With the deletion method we shall get a set half that size. Fix
the graph $G$. (Note that $G$ itself is not random.) Let $p$ be, for the moment,
arbitrary and define a random set $S$ of vertices by placing each vertex of $G$ in $S$
with independent probability $p$. Let $X$ be the size of $S$ and let $Y$ be the number of
edges of $G$ with both vertices in $S$. Clearly $E[X] = np$. Since each of the $nt/2$
edges of $G$ has probability $p^2$ of having both vertices in $S$, linearity of expectation
gives $E[Y] = (nt/2)p^2$ and hence

\[ E[X - Y] = np - (nt/2)p^2 . \]

Now set $p = 1/t$ to maximize this quantity: $E[X - Y] = n/2t$. There is a "point" in
the probability space (a specific set $S$) for which $X - Y \ge n/2t$. Delete one
vertex arbitrarily from each edge of $S$. This leaves a set $S^*$ which is independent
and has at least $X - Y$ vertices. Thus $\alpha(G) \ge n/2t$.
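
The argument translates directly into a randomized procedure. The following is a minimal sketch (the names `independent_set_by_deletion` and `is_independent` are illustrative, not from the text): sample $S$ with probability $p$, then remove one endpoint of every edge that still lies inside $S$. Any single run returns an independent set; the bound $n/2t$ is guaranteed only for some outcome, so the example keeps the best of several runs.

```python
import random

def independent_set_by_deletion(n, edges, p, rng):
    """Sample each vertex with probability p, then delete one endpoint per surviving edge."""
    S = {v for v in range(n) if rng.random() < p}
    for u, v in edges:
        if u in S and v in S:
            S.discard(u)  # arbitrary choice of endpoint
    return S

def is_independent(S, edges):
    """True when no edge has both endpoints in S."""
    return not any(u in S and v in S for u, v in edges)

if __name__ == "__main__":
    n, t = 40, 2
    edges = [(i, (i + 1) % n) for i in range(n)]  # cycle: n vertices, nt/2 = n edges
    rng = random.Random(1)
    runs = [independent_set_by_deletion(n, edges, 1 / t, rng) for _ in range(200)]
    print("best size:", max(len(S) for S in runs), " target n/2t:", n / (2 * t))
```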

Let $X$ be a random variable with mean $\mu$ and variance $\sigma^2$. The basic inequality
of Chebyshev states that for any positive $\lambda$

\[ \Pr[|X - \mu| > \lambda\sigma] \le 1/\lambda^2 . \]

Taking $\lambda = \epsilon\mu/\sigma$,

\[ \Pr[|X - \mu| > \epsilon\mu] \le \sigma^2/\epsilon^2\mu^2 . \]

In many examples $X$ depends on a parameter approaching infinity. If we know
that $\mathrm{Var}[X] = o(E[X]^2)$, then for any fixed $\epsilon$ the above probability is $o(1)$. This is
the second-moment method: if $\mathrm{Var}[X] = o(E[X]^2)$, then $X$ is almost always almost
equal to its expectation. (This method is particularly valuable in studying random
graphs, see chapter 6.)

As an example, let $G$ be a regular graph of degree $t$ on $n$ vertices with $1 < t < n$.
Set $p = 1/t$ and, as before, define a random set $S$ of vertices by placing each
vertex of $G$ in $S$ with independent probability $p$. Again, let $X$ be the size of $S$ and
$Y$ be the number of edges of $G$ with both vertices in $S$. $X$ has mean $np$ and
variance $np(1-p)$ so by the second-moment method $X \sim n/t$ almost always.
(However, in this case the bounds of the Appendix, or standard bounds on the
binomial distribution, give much stronger results.) Let $1 \le i \le m = nt/2$ index the
edges of $G$ and let $Y_i$ be the indicator random variable for the $i$th edge having
both vertices in $S$. Then $E[Y_i] = p^2$ and $Y = Y_1 + \cdots + Y_m$. We employ the general
formula

\[ \mathrm{Var}[Y] = \sum_{i,j=1}^{m} \mathrm{Cov}[Y_i, Y_j] , \]

where $\mathrm{Cov}[Y_i, Y_j] = E[Y_i Y_j] - E[Y_i]E[Y_j]$ is the covariance. Critically, when two
random variables are independent their covariance is zero. This is the case when
the $i$th and $j$th edges have no vertex in common. Each edge has a vertex in
common with $2t-1$ edges (including itself). In those cases we use the weak
bound

\[ \mathrm{Cov}[Y_i, Y_j] \le E[Y_i Y_j] \le E[Y_i] = p^2 . \]

Thus $\mathrm{Var}[Y] \le m(2t-1)p^2$. But $E[Y]^2 = m^2p^4 \gg m(2t-1)p^2$ (with this weak bound
the comparison requires $t = o(n^{1/2})$, since $mp^2/(2t-1) = n/2t(2t-1)$) so the
second-moment method applies and $Y \sim E[Y]$ almost always. Combining these results, $S$
almost always has asymptotically $n/t$ vertices and $n/2t$ edges. Therefore, there
will exist specific $S$ with asymptotically $n/t$ vertices and $n/2t$ edges.

Let $t < k < n$. Let $\mathcal{F}$ be a family of $k$-sets with underlying point set $[n]$. $\mathcal{F}$ is
called a $t$-$(n, k, 1)$ design if every $t$-set $T$ is contained in a unique $K \in \mathcal{F}$. The
existence of designs is a major combinatorial question (see chapter 14) to which
probabilistic methods have not yet been able to contribute. Let $M(n, k, t)$ denote
the minimal cardinality of a family $\mathcal{F}$ that covers all $t$-sets, i.e., that every $t$-set is
contained in at least one $K \in \mathcal{F}$. A simple counting argument shows $M(n, k, t) \ge
\binom{n}{t}/\binom{k}{t}$ with equality if and only if a $t$-$(n, k, 1)$ design exists. Let $t, k$ be fixed
and $n$ approach infinity. We find asymptotic upper bounds on $M(n, k, t)$.

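Let $\mathcal{F}$ be a random collection where each $k$-set $K$ is placed in $\mathcal{F}$ with
independent probability $p$. It is convenient to parametrize
$p = [x\binom{n}{t}/\binom{k}{t}]/\binom{n}{k} = x/\binom{n-t}{k-t}$ so that $E[|\mathcal{F}|] = p\binom{n}{k} = x\binom{n}{t}/\binom{k}{t}$. For each $t$-set $T$ let $A_T$ denote the event
that $\mathcal{F}$ does not cover $T$. As $T$ is contained in $\binom{n-t}{k-t}$ $k$-sets,

\[ \Pr[A_T] = (1 - p)^{\binom{n-t}{k-t}} \sim e^{-x} . \]

To use elementary methods we set $x = (1 + \epsilon) \ln[\binom{n}{t}]$, so that $\Pr[A_T] \ll 1/\binom{n}{t}$.
Then $\bigwedge \overline{A_T}$ almost always, i.e., $\mathcal{F}$ almost always covers all $t$-sets. By the
second-moment method $|\mathcal{F}| \sim \binom{n}{k}p$ almost always. Therefore, there will be a
specific $\mathcal{F}$ with $|\mathcal{F}| \sim \binom{n}{k}p$ which covers all $t$-sets. As this holds for any $\epsilon > 0$,

\[ M(n, k, t) \le (1 + o(1)) \left[ \binom{n}{t} \Big/ \binom{k}{t} \right] \ln\left[ \binom{n}{t} \right] . \]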


The deletion method (though here it involves enlarging a configuration) allows
a substantial improvement. With random $\mathcal{F}$ as previously defined, let $Z$ denote
the number of $t$-sets $T$ not covered. As $\Pr[A_T] \sim e^{-x}$, linearity of expectation
gives $E[Z] \sim \binom{n}{t}e^{-x}$. For any positive random variable $Z$ and any positive $\delta$

\[ \Pr[Z \ge (1 + \delta)E[Z]] \le 1/(1 + \delta) . \]

This general principle can be quite useful. Roughly, $Z$ cannot almost always be
asymptotically greater than its expectation. We know $|\mathcal{F}| \sim \binom{n}{k}p = x\binom{n}{t}/\binom{k}{t}$
almost always. Thus with positive probability $|\mathcal{F}| \sim x\binom{n}{t}/\binom{k}{t}$ and fewer than
$(1 + \delta)\binom{n}{t}e^{-x}$ sets are not covered by $\mathcal{F}$. Fix such an $\mathcal{F}$. For each $t$-set $T$ not
covered add to $\mathcal{F}$ an arbitrary $k$-set $K$ containing $T$. The new family $\mathcal{F}^*$ now
covers all the $t$-sets (we have "corrected the errors") and has size at most roughly
$\binom{n}{t}[x/\binom{k}{t} + e^{-x}]$. Elementary calculus gives the value $x = \ln[\binom{k}{t}]$ to optimize this
quantity. Thus

\[ M(n, k, t) \le \left[ \binom{n}{t} \Big/ \binom{k}{t} \right] \left[ 1 + \ln\binom{k}{t} \right] (1 + o(1)) . \]
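
The "elementary calculus" step can be written out as follows; this is a routine worked computation, with $f$ introduced here only as a name for the bracketed quantity being minimized.

```latex
\[
  f(x) = \frac{x}{\binom{k}{t}} + e^{-x}, \qquad
  f'(x) = \frac{1}{\binom{k}{t}} - e^{-x} = 0
  \iff x = \ln\binom{k}{t},
\]
\[
  f''(x) = e^{-x} > 0, \qquad
  f\Bigl(\ln\tbinom{k}{t}\Bigr)
    = \frac{\ln\binom{k}{t}}{\binom{k}{t}} + \frac{1}{\binom{k}{t}}
    = \frac{1 + \ln\binom{k}{t}}{\binom{k}{t}} .
\]
```

Multiplying the minimum value by $\binom{n}{t}$ gives the stated bound.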


The tale is not over. Erdős and Hanani (1963) conjectured that for $k, t$ fixed

\[ \lim_{n \to \infty} M(n, k, t) \binom{k}{t} \Big/ \binom{n}{t} = 1 . \]

This would mean that asymptotically one could find families $\mathcal{F}$ which were
"nearly" designs. It was twenty years before Rödl (1985) found a proof of this
conjecture. The proof technique, called the Rödl nibble, is most ingenious. We
describe it in a more general context. Let $k$ be fixed and let $G$ be a $k$-graph on $n$
vertices, each vertex having degree asymptotically $D$. Here, $D = D(n)$ approaches
infinity. Suppose further that every two vertices have only $o(D)$ edges in
common. Then Frankl and Rödl (1985) show that there is a set of $\sim n/k$ edges
covering all the vertices.

Let $\epsilon > 0$ be fixed. Choose edges independently with probability $p = \epsilon/D$ so
that $\sim n\epsilon/k$ edges are selected. Each vertex lies in $\sim D$ edges and so has
probability $\sim (1 - p)^D \sim e^{-\epsilon}$ of not being covered. A second-moment argument,
using that two vertices have $o(D)$ edges in common, gives that $\sim n e^{-\epsilon}$ vertices are
not covered almost always. Let $G'$ be the restriction of $G$ to the uncovered
vertices. One can also show (and this is the hardest part) that $G'$ has appropriate
asymptotic regularity. This allows the nibble to be iterated and the iteration is
continued until fewer than $\epsilon n$ vertices are uncovered. These are covered one by
one as with the deletion method.

The Rödl nibble uses $\sim n\epsilon/k$ edges to cover $\sim n(1 - e^{-\epsilon})$ vertices. For $\epsilon$ very
small $1 - e^{-\epsilon} \sim \epsilon$, so the covering is very efficient. The final one-by-one covering
is highly inefficient (we want each edge to cover $k$ new vertices and here it covers
only one) but involves only $\epsilon n$ edges. The total number of edges used may be
written in the form $(n/k)f(\epsilon)$. While $f(\epsilon) > 1$ for any fixed $\epsilon$, $\lim_{\epsilon \to 0} f(\epsilon) = 1$ and
hence $\sim n/k$ edges suffice to cover all the vertices.

The recoloration method is similar to the deletion method. One begins with a
random coloring which yields a few bad spots. At these spots a random
recoloration is made. The recoloration must be strong enough to destroy all bad
spots yet not so strong as to create new bad spots. We give an improvement due to
Beck (1978) of the lower bound $m(n) > 2^{n-1}$ discussed in section 1.

Let $\mathcal{F}$ be a collection of $n$-sets of cardinality $2^{n-1}t$. A random coloring yields an
expected number $t$ of monochromatic sets. We recolor by taking every point $x$
lying in at least one monochromatic set and switching its color with probability $p$.
These switches are independent over the $x$. Note that it is immaterial whether $x$
lies in one or fifty monochromatic sets. Given that $S$ was, say, Red it remains Red
with probability $(1 - p)^n \sim e^{-pn}$. We set $p = (1 + \epsilon)(\ln t)/n$ so that the expected
number of such $S$ is roughly $(2^{n-1}t)(2^{1-n})e^{-pn} = t^{-\epsilon} \ll 1$. Almost always the
recoloration fixes all bad spots. Does it create any new ones? One case is simple:
almost always no set is monochromatic in the first coloring and is switched to the
other color in the recoloring.

What about $S$ that were not monochromatic but became, say, Red in the
recoloring? For $S, T \in \mathcal{F}$ with $S \cap T \ne \emptyset$ and $V \subset S - T$, $V \ne S - T$, let $A_{STV}$ be the
event that $T$ was Blue in the first coloring, $S$ was Red except for precisely
$(S \cap T) \cup V$, and $S$ became completely Red in the recoloring. If $S$ became Red
then some $A_{STV}$ must have occurred. We restrict our attention to the case
$|S \cap T| = 1$, the other cases are similar and give lower probabilities. Let $V$ have
cardinality $v$.

For event $A_{STV}$ to occur, the original coloring of $S \cup T$ must be precisely
correct and a given $v + 1$ points must have their colors switched in the recoloring.
Thus $\Pr[A_{STV}] \le 2^{-2n+1}p^{v+1}$. (This may be a gross overestimate as in addition all
points of $V$ must be switchable, that is, must lie in sets monochromatic in the first
coloring. It appears difficult to utilize this condition.) There are $\binom{n-1}{v} < n^v/v!$
choices of $V$ with $v$ elements. As $v$ increases the larger number of choices for $V$ is
balanced by the smaller probability that all points of $V$ are switched in the
recoloring. Let $A_{ST}$ be the disjunction of $A_{STV}$ over all $V$. Then

\[ \Pr[A_{ST}] \le 2^{-2n+1} \sum_{v=0}^{n-2} (n^v/v!) p^{v+1}
   < 2^{-2n+1} p \sum_{v=0}^{\infty} (np)^v/v!
   = 2^{-2n+1} p e^{np} = 2^{-2n+1} p t^{1+\epsilon} . \]

(When $|S \cap T| > 1$ an even smaller bound is found.) There are at most $(2^{n-1}t)^2$
pairs $S, T$ so

\[ \Pr\left[ \bigvee A_{ST} \right] \le 2^{2n-2} t^2 \, 2^{-2n+1} p t^{1+\epsilon} = p t^{3+\epsilon}/2 . \]

As long as $t < n^{1/3-o(1)}$ this quantity is $o(1)$ and so the recoloring is almost always
free of monochromatic sets. Hence $m(n) > n^{1/3-o(1)} 2^n$. With more attention to
detail $m(n) > cn^{1/3} 2^n$ has been shown.

The counting sieve is in one sense best possible. When events $A_i$ are pairwise
disjoint the condition $\sum \Pr[A_i] < 1$ cannot be weakened. At the opposite
extreme, if the events $A_i$ are mutually independent, then we only need that all
$\Pr[A_i] < 1$ to assure $\bigwedge \overline{A_i} \ne \emptyset$. The Lovász sieve is employed when there is "much
independence" among the $A_i$. Let $A_1,\ldots,A_n$ be events. A graph $G$ on the
index set $\{1,\ldots,n\}$ is said to be a dependency graph if, for all $i$, $A_i$ is mutually
independent of $\{A_j : \{i, j\} \notin E(G)\}$. We emphasize that $A_i$ must not only be
independent of each such $A_j$ individually but also must be independent of any
Boolean combination of the $A_j$. The dependency graph is not uniquely
determined by the events. In application, however, there will usually be a natural
choice of a dependency graph.


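Lovász Local Lemma 2.1 (Erdős and Lovász 1975). Let $G$ be a dependency graph
on events $A_1,\ldots,A_n$. Assume each point of $G$ has degree at most $d$. Assume
$\Pr[A_i] \le p$ for each $i$. Assume $4dp \le 1$. Then $\bigwedge \overline{A_i} \ne \emptyset$.


Proof. We prove by induction on $m$ that for any $m$ events (calling them
$A_1,\ldots,A_m$ only for convenience)

\[ \Pr[A_1 \mid \overline{A_2} \cdots \overline{A_m}] \le 1/2d . \]

For $m = 1$ this is obvious. Let $\{2,\ldots,q\}$ be the points of $\{2,\ldots,m\}$ adjacent to
1 in the dependency graph $G$. Then

\[ \Pr[A_1 \mid \overline{A_2} \cdots \overline{A_m}] =
   \frac{\Pr[A_1 \overline{A_2} \cdots \overline{A_q} \mid \overline{A_{q+1}} \cdots \overline{A_m}]}
        {\Pr[\overline{A_2} \cdots \overline{A_q} \mid \overline{A_{q+1}} \cdots \overline{A_m}]} . \]

We bound the numerator

\[ \Pr[A_1 \overline{A_2} \cdots \overline{A_q} \mid \overline{A_{q+1}} \cdots \overline{A_m}]
   \le \Pr[A_1 \mid \overline{A_{q+1}} \cdots \overline{A_m}] = \Pr[A_1] \le 1/4d , \]

since $A_1$ is mutually independent of $\{A_{q+1},\ldots,A_m\}$. The denominator is
bounded by

\[ \Pr[\overline{A_2} \cdots \overline{A_q} \mid \overline{A_{q+1}} \cdots \overline{A_m}]
   \ge 1 - \sum_{i=2}^{q} \Pr[A_i \mid \overline{A_{q+1}} \cdots \overline{A_m}]
   \ge 1 - (q-1)(1/2d) \quad \text{(induction)} , \]

which is at least $\frac{1}{2}$ since $q - 1 \le d$. Thus

\[ \Pr[A_1 \mid \overline{A_2} \cdots \overline{A_m}] \le (1/4d)/(1/2) = 1/2d , \]

completing the induction. Finally,

\[ \Pr[\overline{A_1} \cdots \overline{A_n}] = \prod_{i=1}^{n} \Pr[\overline{A_i} \mid \overline{A_1} \cdots \overline{A_{i-1}}] \ge (1 - 1/2d)^n > 0 . \quad \Box \]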

A striking feature of the Lovász sieve is the lack of condition on the total
number $n$ of events. When $n$ is large, $\Pr[\bigwedge \overline{A_i}]$ can be very small and the Lovász
Local Lemma sieves out a needle from the haystack.

Consider the lower bound to the Ramsey function $R(k, k)$. We two-color $K_n$
randomly. For each set $S$ of $k$ vertices let $A_S$ be the event that $S$ is monochromatic.
We define the dependency graph $G$ by making $S, T$ adjacent if and
only if $S$ and $T$ have an edge in common, i.e., $|S \cap T| \ge 2$. The joint veracity of
the events $A_T$ with $T$ not adjacent to $S$ cannot affect $\Pr[A_S]$ since $S$ has different
edges and all edges are colored independently. We bound $d$, the number of $T$ with
$|S \cap T| \ge 2$, by $d \le \binom{k}{2}\binom{n}{k-2}$. The Lovász sieve gives that if $4\binom{k}{2}\binom{n}{k-2} 2^{1-\binom{k}{2}} < 1$, then
$R(k, k) > n$. Some calculation shows

\[ R(k, k) > \frac{\sqrt{2}}{e} k 2^{k/2} (1 + o(1)) . \]

This improves elementary methods by a factor of two. It is the best lower bound
for $R(k, k)$ currently known. With the upper bound roughly $4^k$, the improvement,
sadly, does not really help in finding the true order of this basic combinatorial
function.

The Lovász sieve is most striking when the degree $d$ is fixed and the number $n$
of events can be arbitrarily large.


Theorem 2.2. Let $k, m$ be positive integers with $4km(m-1)(1 - 1/k)^m < 1$. Let $S$
be any subset of $\mathbb{R}$, the real numbers, with $m$ elements. Then there exists a
$k$-coloring $\chi : \mathbb{R} \to \{1,\ldots,k\}$ such that for any $t \in \mathbb{R}$ the translate $S + t$ is colored
with all $k$ colors.


Gallai's Theorem, a generalization of van der Waerden's Theorem, gives that
for any finite $S \subset \mathbb{R}$ and any $k$-coloring $\chi : \mathbb{R} \to \{1,\ldots,k\}$ there exists $S'$
homothetic to $S$ (i.e., $S' = aS + t$) which is monochromatic. When homothety is
replaced by translation the above theorem gives a powerful result in the opposite
direction (see chapter 25).
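
To get a feel for the hypothesis of Theorem 2.2, one can tabulate how large $m$ must be for a given number of colors $k$. The helper names below are hypothetical. The search starts at $m = 2$, an assumption made here because for $m = 1$ the factor $m(m-1)$ vanishes and the inequality holds only in a degenerate way.

```python
def translate_coloring_condition(k, m):
    """Left-hand side 4*k*m*(m-1)*(1 - 1/k)**m of the hypothesis of Theorem 2.2."""
    return 4 * k * m * (m - 1) * (1 - 1 / k) ** m

def smallest_m(k, m_max=10_000):
    """Smallest m >= 2 with condition < 1, or None if none is found up to m_max."""
    for m in range(2, m_max + 1):
        if translate_coloring_condition(k, m) < 1:
            return m
    return None

if __name__ == "__main__":
    for k in (2, 3, 4):
        print("k =", k, " smallest m =", smallest_m(k))
```

For $k = 2$ the left-hand side equals $8m(m-1)2^{-m}$, which first drops below 1 at $m = 10$.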


Proof. We first show that any finite $D \subset \mathbb{R}$ may be $k$-colored so that all translates
$S + t$ entirely contained in $D$ have all $k$ colors. Color $D$ randomly, i.e., for each
$x \in D$ select $\chi(x)$ by flipping a fair $k$-faced die. For each $t$ such that $S + t \subset D$ let
$A_t$ denote the event that $S + t$ does not have all $k$ colors. Clearly, $\Pr[A_t] \le k(1 - 1/k)^m$.
We define a dependency graph $G$, letting $t, t'$ be adjacent if and only if
$(S + t) \cap (S + t') \ne \emptyset$, i.e., $t' = t + s' - s''$ for some $s', s''$ in $S$. $G$ has degree at most
$d = m(m-1)$. The Lovász Local Lemma gives $\bigwedge \overline{A_t} \ne \emptyset$, so there is a coloring $\chi$
of $D$ as desired.

Extending colorings of arbitrary finite $D$ to a coloring of all of $\mathbb{R}$ requires a new
concept: the compactness principle. Quite generally, let $\Omega$ be an infinite set and
suppose $\mathcal{U}$ is a family of pairs $(B, \chi)$, $B \subset \Omega$, $B$ finite, $\chi$ a $k$-coloring of $B$, i.e.,
$\chi : B \to \{1,\ldots,k\}$. Suppose $\mathcal{U}$ is closed under restriction: if $(B, \chi) \in \mathcal{U}$ and
$C \subset B$, then $(C, \chi|_C) \in \mathcal{U}$. Suppose also that for all finite $B \subset \Omega$ there is a $\chi$ with
$(B, \chi) \in \mathcal{U}$. The compactness principle is that there then exists a
$\chi : \Omega \to \{1,\ldots,k\}$ such that for all finite $B \subset \Omega$, $(B, \chi|_B) \in \mathcal{U}$ (see chapter 42). In our
case let $\mathcal{U}$ be the family of $(D, \chi)$ with all $S + t \subset D$ having all $k$ colors. The
compactness principle gives a $k$-coloring $\chi$ of $\mathbb{R}$. For any $t$, $(S + t, \chi|_{S+t}) \in \mathcal{U}$, so
that $S + t$ has all $k$ colors, completing the proof. □


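When events $A_i$ are not symmetric a more general form of the Lovász sieve is
appropriate.


Lovász Local Lemma 2.3 (general form). Let $A_1,\ldots,A_n$ be events with a
dependency graph $G$. Suppose there exist $x_1,\ldots,x_n$, $0 < x_i < 1$, so that for all $i$

\[ \Pr[A_i] < x_i \prod (1 - x_j) \quad [\text{product over } \{i, j\} \in E(G)] . \]

Then $\bigwedge \overline{A_i} \ne \emptyset$.


Proof. The induction hypothesis of the earlier proof is replaced by
$\Pr[A_1 \mid \overline{A_2} \cdots \overline{A_m}] \le x_1$ and the denominator
$\Pr[\overline{A_2} \cdots \overline{A_q} \mid \overline{A_{q+1}} \cdots \overline{A_m}]$ is set equal to
$\prod_{i=2}^{q} \Pr[\overline{A_i} \mid \overline{A_{i+1}} \cdots \overline{A_m}]$, which is bounded by the induction hypothesis. We omit
the full proof. □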


Setting $y_i = x_i/\Pr[A_i]$ the condition of Lemma 2.3 becomes

\[ \ln y_i > - \sum_{\{i,j\} \in E(G)} \ln(1 - y_j \Pr[A_j]) . \]

When all $y_j \Pr[A_j] \ll 1$ this may be simplified, approximately, to

\[ \ln y_i > \sum_{\{i,j\} \in E(G)} y_j \Pr[A_j] . \]


Consider the lower bound on the Ramsey function $R(3, k)$. Color $K_n$ by letting
each edge be Red (the first color) with probability $p$, otherwise Blue. For each
3-set $S$ let $A_S$ be the event that $S$ is Red, for each $k$-set $T$ let $B_T$ be the event that
$T$ is Blue. A dependency graph $G$ is given by joining two indices if they intersect
in at least two vertices. There are $\binom{n}{k} < (ne/k)^k$ $k$-sets $T$. Each 3-set $S$ is adjacent
to less than $3n$ other 3-sets [a critical improvement over $\binom{n}{3}$] and each $k$-set $T$ is
adjacent to less than $\frac{1}{2}k^2 n$ 3-sets $S$. Associate to each $A_S$ the same $y_S = y$ and to
each $B_T$ the same $y_T = z$. If, approximately, there exist $p, y, z$ with

\[ p < 1, \quad y \Pr[A_S] \ll 1, \quad z \Pr[B_T] \ll 1 , \]
\[ \ln y > y \Pr[A_S](3n) + z \Pr[B_T](ne/k)^k , \]
\[ \ln z > y \Pr[A_S] k^2 n/2 + z \Pr[B_T](ne/k)^k , \]

then $R(3, k) > n$. Erdős (1961), in one of the early masterpieces of the probabilistic
method, proved $R(3, k) > ck^2/(\ln k)^2$ by a delicate use of the deletion method.
With the Lovász sieve this same result falls out with some prosaic calculation: set
$p = c_1 n^{-1/2}$, $k = c_2 n^{1/2} \ln n$, $z = \exp[c_3 n^{1/2} (\ln n)^2]$ and $y = 1 + c_4$ for appropriate
values of the constants.

Our final method involves the use of entropy. Let a random variable $Y$ assume
$m$ possible values with probabilities $p_1,\ldots,p_m$. The entropy of $Y$, denoted by
$H[Y]$, is given by

\[ H[Y] = \sum_{i=1}^{m} - p_i \log_2 p_i . \]

Entropy measures the information content of $Y$. If $m = 2^t$ and all $p_i = 2^{-t}$, then
$H[Y] = t$. Convexity arguments give that if the $p_i$ are smaller the entropy must be
larger. In contrapositive form we express this as a concentration property: if
$H[Y] \le t$, then there is some $b$ for which $\Pr[Y = b] \ge 2^{-t}$. Let $Y_1,\ldots,Y_n$ be
random variables, not necessarily independent, on a common space. Entropy
satisfies the subadditivity property:

\[ H[(Y_1,\ldots,Y_n)] \le H[Y_1] + \cdots + H[Y_n] . \]

Let $\mathcal{F} = \{S_1,\ldots,S_n\}$ be a family of $n$ sets on underlying point set $\Omega =
\{1,\ldots,n\}$. We reconsider the problem of bounding the discrepancy $\Delta(\mathcal{F})$. Let
$\chi : \Omega \to \{-1, +1\}$ be random. For $1 \le j \le n$ let $Y_j$ be the nearest integer to
$\chi(S_j)/20n^{1/2}$. $Y_j = 0$ when $|\chi(S_j)| < 10n^{1/2}$, so, by Corollary A.2,

\[ \Pr[Y_j = 0] > 1 - 2e^{-50} . \]

Let $s > 0$. $Y_j = s$ requires $\chi(S_j) > (s - \frac{1}{2})(20n^{1/2})$, so by Theorem A.1,

\[ \Pr[Y_j = s] < e^{-50(2s-1)^2} ; \]

$\Pr[Y_j = -s]$ has the same bound. We bound the entropy by the infinite sum

\[ H[Y_j] \le -(1 - 2e^{-50}) \log_2(1 - 2e^{-50}) - 2(e^{-50} \log_2 e^{-50} + \cdots) = c , \]

where calculation gives $c < 3 \times 10^{-20}$. The subadditivity property gives

\[ H[(Y_1,\ldots,Y_n)] \le cn \]

and now the concentration property gives

\[ \Pr[(Y_1,\ldots,Y_n) = (b_1,\ldots,b_n)] \ge 2^{-cn} . \]

Fix these $b$'s and let $\mathcal{C}$ be the set of $\chi$ giving those $b$'s. As the probability
distribution for $\chi$ was uniform over the $2^n$ possible colorings, $|\mathcal{C}| \ge 2^{n(1-c)}$.
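
The constant $c$ can be evaluated numerically. The sketch below is an illustration, with `entropy_bound` a hypothetical name; it sums the leading term and the first few tail terms $2e^{-a}\log_2 e^{a}$ with $a = 50(2s-1)^2$, using `log1p` because $1 - 2e^{-50}$ is indistinguishable from 1 in double precision.

```python
from math import exp, log, log1p

def entropy_bound(terms=5):
    """Numerical value of the bound c on H[Y_j] from the infinite sum (truncated)."""
    q = 2 * exp(-50)  # upper bound on Pr[Y_j != 0]
    # -(1 - q) * log2(1 - q); log1p avoids the rounding of 1 - q to 1.0
    total = -(1 - q) * log1p(-q) / log(2)
    for s in range(1, terms + 1):
        a = 50 * (2 * s - 1) ** 2           # Pr[Y_j = s] < e^{-a}
        total += 2 * exp(-a) * a / log(2)   # equals -2 e^{-a} log2(e^{-a})
    return total

if __name__ == "__main__":
    print("c is approximately", entropy_bound())
```

The terms with $s \ge 2$ are smaller than $10^{-190}$, so the truncation does not affect the leading digits.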

We can associate colorings $\chi : \Omega \to \{-1, +1\}$ with points $(\chi(1),\ldots,\chi(n))$ on
the Hamming $n$-cube $C^n = \{-1, +1\}^n$. The Hamming distance (see chapter 16)
between $\chi_1, \chi_2 \in C^n$ is then the number of $x$ for which $\chi_1(x) \ne \chi_2(x)$. A celebrated
result of Kleitman (1966) says that if $\mathcal{C} \subset C^n$ has fixed size, then the diameter of
$\mathcal{C}$ is minimized when $\mathcal{C}$ is a ball. Some calculation gives that $\mathcal{C}$ must therefore
have diameter at least $n(1 - 10^{-9})$. Pick $\chi_1, \chi_2 \in \mathcal{C}$ at this maximal distance.

Now set $\chi = (\chi_1 - \chi_2)/2$. The possible values for $\chi(x)$ are $+1$, $-1$, and 0. We
think of $\chi$ as a partial coloring: when $\chi(x) = 0$, $x$ is uncolored. This occurs exactly
when $\chi_1(x) = \chi_2(x)$, which holds for at most $10^{-9}n$ vertices $x$. For each $j$ the
values $\chi_1(S_j)$ and $\chi_2(S_j)$ lie on a common interval of length $20n^{1/2}$ since, critically,
their values $Y_j$ are the same. Thus

\[ |\chi(S_j)| = |\chi_1(S_j) - \chi_2(S_j)|/2 \le 10\sqrt{n} . \]

That is, there is a partial coloring of all but at most $10^{-9}n$ vertices with
discrepancy at most $10n^{1/2}$. By iterating this process, and tightening the calculation,
I have shown the following result.


Theorem 2.4 (Spencer 1985). Let $\mathcal{F}$ be a family of $n$ sets on $n$ points. Then
$\Delta(\mathcal{F}) \le 6\sqrt{n}$.


Further results are discussed in chapter 26. 


3. Two applications in computer science 


We give here two ingenious probabilistic arguments and discuss, briefly, their 
application to computer science. In both cases our chronology is backwards: it 
was the computer science problem that motivated the probabilistic approach. 
Karp (1976) discusses probabilistic analysis of algorithms. 


Theorem 3.1. Let $|V| = n$ and let $\mathcal{F}$ be an arbitrary nonempty family of subsets of
$V$. Let $w : V \to \{1,\ldots,2n\}$ be a random function, each $w(v)$ independently chosen
uniformly over the range. For $E \in \mathcal{F}$ set $w(E) = \sum_{v \in E} w(v)$. When $\min_{E \in \mathcal{F}} w(E)$ is
achieved by a unique $E \in \mathcal{F}$ we say $w$ has a unique minimum. Then

\[ \Pr[w \text{ has a unique minimum}] \ge \tfrac{1}{2} . \]


Remark. When $\mathcal{F}$ is exponential the pigeonhole principle ensures that there will
be many $E$ with equal $w(E)$.
Proof. For $x \in V$ set

\[ \alpha(x) = \min_{x \notin E,\, E \in \mathcal{F}} w(E) \; - \min_{x \in E,\, E \in \mathcal{F}} w(E - x) . \]

Observe that evaluation of $\alpha(x)$ does not require knowledge of $w(x)$. As $w(x)$ is
selected uniformly,

\[ \Pr[\alpha(x) = w(x)] \le 1/2n , \]

so that

\[ \Pr[\alpha(x) = w(x) \text{ for some } x \in V] \le n/2n = \tfrac{1}{2} . \]

But if $w$ had two minimal sets $E_1, E_2 \in \mathcal{F}$ and $x \in E_2 - E_1$, then

\[ \min_{x \notin E} w(E) = w(E_1) , \qquad \min_{x \in E} w(E - x) = w(E_2) - w(x) , \]

so that $\alpha(x) = w(x)$. □
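
The theorem can be observed empirically on a small family. The sketch below is an illustration under stated assumptions: the family (all 3-subsets of a 6-point set), the number of trials, and the function names are choices made here, not taken from the text. It draws random weights from $\{1,\ldots,2n\}$ and estimates how often the minimum-weight set is unique.

```python
import random
from itertools import combinations

def has_unique_minimum(family, weights):
    """True when exactly one set of the family attains the minimum total weight."""
    totals = sorted(sum(weights[v] for v in E) for E in family)
    return len(totals) == 1 or totals[0] < totals[1]

def estimate_unique_min_probability(n, family, trials, seed):
    """Fraction of random weightings w: V -> {1, ..., 2n} with a unique minimum."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        w = [rng.randint(1, 2 * n) for _ in range(n)]
        hits += has_unique_minimum(family, w)
    return hits / trials

if __name__ == "__main__":
    n = 6
    family = list(combinations(range(n), 3))  # assumed example family
    print(estimate_unique_min_probability(n, family, trials=2000, seed=0))
```

On such small examples the observed frequency is typically well above the guaranteed $\frac{1}{2}$, which is a worst-case bound over all families.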


Mulmuley et al. (1987) prove this result and use it to give an RNC$^2$ algorithm
to construct a perfect matching in a graph. (See chapter 29 for further discussion.)
Let us consider the simpler case when $G$ is a bipartite graph given by an $n \times n$
incidence matrix $A = [a_{ij}]$. Set $V = \{(i, j): a_{ij} = 1\}$, the edges of the graph. Let
$w: V \to \{1, \ldots, 2|V|\}$ be random and define an $n \times n$ matrix $R = [r_{ij}]$ by
$r_{ij} = 2^{w(i,j)}$ if $(i, j) \in V$; 0, otherwise. Apply known NC$^2$ algorithms to calculate $\det(R)$
and the cofactor matrix $C = [c_{ij}]$. Let $W$ be maximal so that $2^W \mid \det(R)$. Let $\mathcal{F}$
denote the set of all perfect matchings $E_\sigma = \{(i, \sigma i): 1 \le i \le n\}$, where $\sigma$ is a
permutation on $[n]$ with all $(i, \sigma i) \in V$. Then


$$\det(R) = \sum_{E_\sigma \in \mathcal{F}} \operatorname{sgn}(\sigma)\, 2^{w(E_\sigma)}.$$


Let $M$ be the set of $(i, j) \in V$ so that $2^W$ is the maximal power of 2 dividing $r_{ij} c_{ij}$.
With probability at least $\tfrac12$, $w(E_\sigma)$ has a unique minimum $w(E)$. In that case


$$\det(R) \equiv \pm 2^{w(E)} \pmod{2^{w(E)+1}},$$


so $W = w(E)$. Also $r_{ij} c_{ij}$ has a $\pm 2^W$ addend if and only if $(i, j) \in E$, so that $M = E$.
Check if $M$ is a perfect matching and output it if so.
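
The bipartite procedure just described can be sketched sequentially as follows. This is an illustration under stated assumptions, not the parallel algorithm itself: the determinant is computed by ordinary fraction-free elimination rather than by an NC$^2$ routine, and the names `det`, `two_adic` and `try_isolated_matching` are hypothetical.

```python
import random

def det(m):
    """Exact integer determinant via fraction-free (Bareiss) elimination."""
    m = [row[:] for row in m]
    n = len(m)
    if n == 0:
        return 1
    sign, prev = 1, 1
    for k in range(n - 1):
        if m[k][k] == 0:  # look for a row below with a nonzero pivot
            swap = next((r for r in range(k + 1, n) if m[r][k] != 0), None)
            if swap is None:
                return 0
            m[k], m[swap] = m[swap], m[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                m[i][j] = (m[i][j] * m[k][k] - m[i][k] * m[k][j]) // prev
        prev = m[k][k]
    return sign * m[-1][-1]

def two_adic(x):
    """Largest W such that 2**W divides the nonzero integer x."""
    x = abs(x)
    return (x & -x).bit_length() - 1

def try_isolated_matching(a, rng):
    """One random trial; returns a perfect matching as a list of (i, j), or None."""
    n = len(a)
    edges = [(i, j) for i in range(n) for j in range(n) if a[i][j]]
    w = {e: rng.randint(1, 2 * len(edges)) for e in edges}
    r = [[(1 << w[(i, j)]) if a[i][j] else 0 for j in range(n)] for i in range(n)]
    d = det(r)
    if d == 0:
        return None
    W = two_adic(d)
    m = []
    for (i, j) in edges:
        minor = det([row[:j] + row[j + 1:] for k, row in enumerate(r) if k != i])
        # r_ij * c_ij equals 2**w(i,j) times the minor up to sign,
        # and the sign does not affect divisibility by powers of 2.
        if minor != 0 and w[(i, j)] + two_adic(minor) == W:
            m.append((i, j))
    rows, cols = {i for i, _ in m}, {j for _, j in m}
    return m if len(m) == n and len(rows) == n and len(cols) == n else None
```

A trial can fail (returning `None`) when the minimum weight matching is not unique, so in practice one repeats the trial with fresh random weights.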

We now turn to a result of Yao (1985) on the complexity of the parity function.
We adapt the approach of Håstad (1988) which greatly simplifies the proof and
makes explicit the probabilistic underpinnings.

Consider Boolean functions $G$ on $n$ variables $x_1, \ldots, x_n$, where $0, 1, \wedge, \vee$, and
$\bar{A}$ have their usual meanings of FALSE, TRUE, AND, OR, and NOT A,
respectively. The functions $x_i$, $\bar{x}_i$ are called atoms. A restriction is a map
$\rho: \{1, \ldots, n\} \to \{0, 1, *\}$; the restriction of $G$ by $\rho$, denoted $G|_\rho$, is the Boolean
function given by setting $x_i$ equal to 0 or 1 or leaving it a variable depending on
$\rho(i)$. For example, with $\rho(1)=0$, $\rho(2)=1$, $\rho(3)=*$, and G= (x4, Ax,) Vv (QQ, A
X,), the restriction of G by p is 


G|,=(Orx,)v (la x,)=1. 


The notation $\rho|_X$ has the usual meaning of the function $\rho$ restricted to domain $X$.
A minterm is a minimal assignment of variables to 0, 1 that forces $G = 1$; its size is
the number of variables so set. In the above example there are three minterms:
x, =0, x, =1; x, =1, x, =0; and x, =0, x, =1. 


Definition 3.2. Given $p$ we will consider the random restriction $\rho$ defined by:
$$\Pr[\rho(i) = 0] = \Pr[\rho(i) = 1] = (1-p)/2, \qquad \Pr[\rho(i) = *] = p,$$


so that $G|_\rho$ becomes a random function. In application $p$ will be small. $\rho$ will
serve to homogenize $G$ and turn it into a simple function with high probability.
The main result is purely probabilistic in statement.


Theorem 3.3. Let $G = G_1 \wedge G_2 \wedge \cdots \wedge G_w$, where each


$$G_i = y_{i1} \vee \cdots \vee y_{i a_i},$$


each $y_{ij}$ is an atom, and all $a_i \le t$. Let $\rho$ be the random restriction defined in 3.2.
Let $E_s$ be the event that $G|_\rho$ has a minterm of size at least $s$. Then


$$\Pr[E_s] \le (5pt)^s.$$


Remark. The number of conjuncts $w$ does not affect the bound, reminiscent of
the Lovász Local Lemma.
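
For very small formulas the event $E_s$ can be examined by brute force. The sketch below assumes a hypothetical encoding (a formula is a list of clauses, a clause is a list of `(variable index, negated)` pairs) and draws the coordinates of $\rho$ independently; it is exponential in the number of variables and is meant only to make the definitions concrete.

```python
import itertools
import random

def random_restriction(n, p, rng):
    """rho(i) is '*' with probability p, else 0 or 1 with probability (1-p)/2 each."""
    rho = []
    for _ in range(n):
        u = rng.random()
        rho.append('*' if u < p else (0 if u < (1 + p) / 2 else 1))
    return rho

def evaluate(clauses, x):
    """G is the AND of its clauses; each clause is an OR of atoms (index, negated)."""
    return all(any(x[i] != neg for i, neg in clause) for clause in clauses)

def forces_one(clauses, partial):
    """True if every completion of the partial assignment makes G equal to 1."""
    free = [i for i, v in enumerate(partial) if v == '*']
    for bits in itertools.product((0, 1), repeat=len(free)):
        x = list(partial)
        for i, b in zip(free, bits):
            x[i] = b
        if not evaluate(clauses, x):
            return False
    return True

def max_minterm_size(clauses, rho):
    """Largest minterm size of G restricted by rho; -1 if the restriction is never 1."""
    free = [i for i, v in enumerate(rho) if v == '*']
    best = -1
    for choice in itertools.product((0, 1, '*'), repeat=len(free)):
        partial = list(rho)
        for i, c in zip(free, choice):
            partial[i] = c
        if not forces_one(clauses, partial):
            continue
        assigned = [i for i, c in zip(free, choice) if c != '*']
        # Minimal means: un-setting any single assigned variable stops forcing G = 1.
        minimal = True
        for i in assigned:
            weaker = list(partial)
            weaker[i] = '*'
            if forces_one(clauses, weaker):
                minimal = False
                break
        if minimal:
            best = max(best, len(assigned))
    return best

def estimate(clauses, n, p, s, trials, rng):
    """Monte Carlo estimate of Pr[E_s]."""
    hits = sum(max_minterm_size(clauses, random_restriction(n, p, rng)) >= s
               for _ in range(trials))
    return hits / trials
```

For instance, with the hypothetical formula $(x_0 \vee x_1) \wedge (x_2 \vee x_3)$, so $t = 2$, and $p = 0.05$, $s = 2$, the estimate can be compared with the bound $(5pt)^s = 0.25$.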


Proof. We actually show a much stronger result. Let $F$ be any Boolean function.
Then for the conditional probability we have


$$\Pr[E_s \mid F|_\rho \equiv 1] \le (5pt)^s. \qquad (3.4)$$


(We use the convention that if the condition is unsatisfiable then the conditional
probability is considered zero.) With $F \equiv 1$, (3.4) reduces to Theorem 3.3. We
prove (3.4) by induction on $w$. When $w = 0$, $G \equiv 1$ and the result is trivial.
Assume (3.4) to hold for all values less than $w$. We write $G = G_1 \wedge G^*$, where
$G^* = G_2 \wedge \cdots \wedge G_w$, and let $E^*$ be the event that $G^*|_\rho$ has a minterm of size at
least $s$. Interchanging the roles of $x_i$ and $\bar{x}_i$ where necessary we may, for
convenience, write $G_1 = \bigvee_{i \in T} x_i$, where $|T| \le t$. Let us assume $|T| = t$, the other
cases being easier. Now either $G_1|_\rho \equiv 1$ or $G_1|_\rho \not\equiv 1$. Assuming the former, $E_s$
holds only if $E^*$ holds, so


$$\Pr[E_s \mid F|_\rho \equiv 1,\ G_1|_\rho \equiv 1] = \Pr[E^* \mid (F \wedge G_1)|_\rho \equiv 1] \le (5pt)^s \qquad (3.5)$$


by induction.
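Now assume $G_1|_\rho \not\equiv 1$. For nonempty $Y \subseteq T$ and $\sigma: Y \to \{0, 1\}$, $\sigma \not\equiv 0$, let $E^{Y,\sigma}$
be the event that $G|_\rho$ has a minterm of size at least $s$ with assignments $x_i \leftarrow \sigma(i)$,
$i \in Y$, and no other assignments with $i \in T$. Any minterm of $G|_\rho$ must force $G_1$ to
1 and so must include such an assignment. Thus


$$\Pr[E_s \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1] = \Pr\Big[\bigvee_{Y,\sigma} E^{Y,\sigma} \,\Big|\, F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1\Big]
\le \sum_{Y,\sigma} \Pr[E^{Y,\sigma} \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1]. \qquad (3.6)$$
There are $\binom{t}{y}$ choices of $Y$ with $y$ elements and $2^y - 1$ choices of $\sigma$.
Remark. $y = 1$ supplies the main term.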


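The conditions $F|_\rho \equiv 1$, $G_1|_\rho \not\equiv 1$ give a Bayesian distribution on $\rho$. $G_1|_\rho \not\equiv 1$
means precisely that $\rho(i) \in \{0, *\}$ for all $i \in T$, so that


$$\Pr[\rho(i) = * \mid G_1|_\rho \not\equiv 1] = \frac{p}{p + (1-p)/2} = \frac{2p}{1+p}$$


for all $i \in T$ and these events are mutually independent. In particular, with
$|Y| = y$,


$$\Pr[\rho(Y) = * \mid G_1|_\rho \not\equiv 1] = \left[\frac{2p}{1+p}\right]^y.$$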
The further condition $F|_\rho \equiv 1$ can only serve to decrease this probability. [One
proof is via the FKG inequality. We refer to the survey of Graham (1983) for a
discussion of this useful result.] Letting $\rho': \{1, \ldots, n\} - Y \to \{0, 1, *\}$ be arbitrary,
we claim


$$\Pr[\rho(Y) = * \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1,\ \rho|_{\{1,\ldots,n\}-Y} = \rho'] \le \left[\frac{2p}{1+p}\right]^y.$$
Given $\rho'$, there is a unique extension $\rho$ with $\rho(Y) = *$. If that $\rho$ does not satisfy
the above conditions, then the conditional probability is zero. If that $\rho$ does
satisfy the conditions, then so do all extensions $\rho$ with $\rho(i) \in \{0, *\}$ for all $i \in Y$,
and so there is equality above. As this holds for all $\rho'$,


$$\Pr[\rho(Y) = * \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1] \le \left[\frac{2p}{1+p}\right]^y \le (2p)^y.$$


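Remark. Given $F|_\rho \equiv 1$, $G_1|_\rho \not\equiv 1$, the probability that there exists an $i \in T$,
$\rho(i) = *$, is at most $2pt$. If, to the contrary, $\rho(i) \ne *$ for all $i \in T$, then all $\rho(i) = 0$
(as $G_1|_\rho \not\equiv 1$), so $G_1|_\rho \equiv 0$, hence $G|_\rho \equiv 0$ and $E_s$ cannot occur. With $p \ll 1/t$ this
greatly limits $\Pr[E_s \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1]$.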


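Let $\rho': T \to \{0, *\}$, with $\rho'(Y) = *$, be arbitrary and condition on $\rho|_T = \rho'$. Then
$\rho$ may be considered a random restriction on $\{1, \ldots, n\} - T$. The event $F|_\rho \equiv 1$
reduces to $\tilde{F}|_\rho \equiv 1$, $\tilde{F}$ a function on $x_i$, $i \notin T$. $E^{Y,\sigma}$ occurs if and only if $\tilde{G}|_\rho$ has a
minterm of size at least $s - y$, where $\tilde{G}$ is the reduction of $G^*$ to $x_i$, $i \notin T$, by $\rho'$
and $\sigma$. Calling this event $\tilde{E}_{s-y}$,


$$\Pr[E^{Y,\sigma} \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1,\ \rho|_T = \rho'] = \Pr[\tilde{E}_{s-y} \mid \tilde{F}|_\rho \equiv 1] \le (5pt)^{s-y}$$
by induction. Any $\rho$ with $F|_\rho \equiv 1$, $G_1|_\rho \not\equiv 1$, $\rho(Y) = *$, must have $\rho|_T = \rho'$ for
some $\rho'$ of this form. Thus
$$\Pr[E^{Y,\sigma} \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1,\ \rho(Y) = *] \le (5pt)^{s-y}$$
and so
$$\Pr[E^{Y,\sigma} \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1]
= \Pr[\rho(Y) = * \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1]\ \Pr[E^{Y,\sigma} \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1,\ \rho(Y) = *]
\le (2p)^y (5pt)^{s-y}.$$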
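Plugging in to (3.6),


$$\Pr[E_s \mid F|_\rho \equiv 1,\ G_1|_\rho \not\equiv 1] \le \sum_{y=1}^{t} \binom{t}{y} (2^y - 1)(2p)^y (5pt)^{s-y}$$


$$= (5pt)^s \sum_{y=1}^{t} \binom{t}{y} \left[\left(\frac{4}{5t}\right)^y - \left(\frac{2}{5t}\right)^y\right] = (5pt)^s \left[\left(1 + \frac{4}{5t}\right)^t - \left(1 + \frac{2}{5t}\right)^t\right]
\le (5pt)^s.$$
Combining this with (3.5),
$$\Pr[E_s \mid F|_\rho \equiv 1] \le (5pt)^s,$$


completing the inductive proof of (3.4) and thus Theorem 3.3. □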
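
The last step of the proof uses that the bracketed factor $(1 + 4/5t)^t - (1 + 2/5t)^t$ never exceeds 1. As a numerical sanity check only (not a proof), one can tabulate it; the function name and the tested range are arbitrary choices.

```python
def last_factor(t):
    """Value of (1 + 4/(5t))**t - (1 + 2/(5t))**t for a positive integer t."""
    return (1 + 4 / (5 * t)) ** t - (1 + 2 / (5 * t)) ** t

worst = max(last_factor(t) for t in range(1, 10001))
print(worst)  # numerically below 1 on the tested range
```

Since $(1 + c/t)^t$ tends to $e^c$, the factor approaches $e^{4/5} - e^{2/5}$, which is about 0.73.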


We give only a very rough outline of the application of Theorem 3.3 to the
complexity of the parity function, using the language of circuit complexity. A
sequence of Boolean functions $P^n$ on $n$ variables is called Borel if for some
constant $k$ and polynomial $p(n)$, $P^n$ is realizable by a circuit of depth $k$, with gates
alternating between $\wedge$ and $\vee$, and maximum fan-in (i.e., number of inputs) $p(n)$.
Let $P_n$ denote the parity function, $P_n(x_1, \ldots, x_n) = [x_1 + \cdots + x_n] \bmod 2$.


Theorem 3.7 (Yao 1985). $P_n$ is not Borel.


Proof. From our prejudiced vantage point Yao proves a Ramsey theorem.
Identify Boolean functions $F$ with two-colorings of the Hamming cube $C^n =
\{0, 1\}^n$. A $t$-face of $C^n$ is a set of $2^t$ points, all of whose coordinates but $t$ are
fixed. For all $k$, $t$, $p(n)$ it is shown that if $n$ is sufficiently large and $C^n$ is
two-colored by a $(k, p(n))$ circuit, then there is a monochromatic $t$-face. As the
parity function has no monochromatic 1-face it cannot be Borel.
Fix a $(k, p(n))$ circuit $P$. For inductive purposes hold the bottom-level fan-in to
size n®") and allow the second-level (counting from the bottom) fan-in to be
arbitrary. Assume the gates at the bottom level are $\vee$, otherwise use the dual
form. Let $G$ be the Boolean function achieved at the second level so it has the
form of Theorem 3.3. Take a random restriction p with p=n °. The probability 
that G has a minterm of size n°” is exponentially small. As there are only 
polynomially many $G$, there is a restriction $\rho'$ so that all $G$ have all minterms
small and can be rewritten as an $\vee$ of $\wedge$. Now the second and third gates are both
$\vee$ and so may be combined, reducing the depth by one. Continuing, we reduce $P$
to a level-two circuit, say $\wedge$ of $\vee$. Fixing all of the variables on one bottom fan-in
we can force it, and hence $P$, to be false, and this gives a monochromatic face in
$P$. □


Remark. The Borel sequences $G_n$ are a natural analog to classical Borel sets,
“countable” having been replaced by “polynomial”. If we allow an arbitrary
polynomial-time Turing machine for “computing” $G_n$, then might we not call such
sequences “measurable”? And if probabilistic methods have been successful in
proving “non-Borelness”, then might we not hope that they will provide the key 
to showing certain sequences “nonmeasurable’”? A fantasy to be sure, and as 
“P= NP?” has moved to center-stage of our mathematical consciousness a fantasy 
that is particularly compelling to practitioners of the probabilistic method. 


4. Gems 


Does the heart of mathematics lie in the building of structure or in the solving of 
individual problems? Not an either—or question, to be sure, but one that is 
particularly effective in splitting the ranks of combinatorialists. Use of an 
algebraic structure to explain discrete phenomena will be central to some, to 
others grotesque. A clever argument is beautiful to the problem-solver, a curiosity 
to the structuralist. The very term ‘‘combinatorial methods” has, to this author, 
an oxymoronic character. It is the brilliant proofs, those that expand and/or 
transcend the known methodologies, which express the soul of the subject. We 
consider three examples. Our first deals with the independence number $\alpha(G)$ of a
graph.


Theorem 4.1. Let $G$ have vertex set $\{1, \ldots, n\}$ and let $d_i$ denote the degree of
vertex $i$. Then


$$\alpha(G) \ge \sum_{i=1}^{n} 1/(d_i + 1).$$
Proof. Let $<$ be a random ordering of the vertex set. For $1 \le i \le n$ let $A_i$ be the
event that all neighbors of $i$ are greater than $i$ in the ordering and let $C = C_<$ be
the set of $i$ for which $A_i$ holds. As $i$ and its neighbors are randomly ordered,
$\Pr[A_i] = 1/(d_i + 1)$ and so, by linearity of expectation,
$$E[|C_<|] = \sum_{i=1}^{n} 1/(d_i + 1).$$


Thus for some specific ordering $|C_<| \ge \sum 1/(d_i + 1)$. Let $\{i, j\}$ be any edge of $G$.
Either $i < j$ or $j < i$. In the first case $j \notin C_<$ and in the second case $i \notin C_<$. That is,
$C_<$ is an independent set. □
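
The proof is constructive enough to run. The sketch below enumerates every ordering of a small graph, builds the set $C_<$ for each, and compares the exact average of $|C_<|$ with $\sum 1/(d_i + 1)$; the helper names and the example (a path on four vertices) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

def low_point_set(adj, order):
    """Vertices all of whose neighbours come later in `order` (the set C in the proof)."""
    rank = {v: r for r, v in enumerate(order)}
    return {v for v in adj if all(rank[u] > rank[v] for u in adj[v])}

def is_independent(adj, vertices):
    """True if no two of the given vertices are adjacent."""
    return all(u not in adj[v] for v in vertices for u in vertices)

# Hypothetical example: the path 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
orders = list(permutations(path))
average = Fraction(sum(len(low_point_set(path, o)) for o in orders), len(orders))
bound = sum(Fraction(1, len(path[v]) + 1) for v in path)
print(average, bound)
```

Because the average over all orderings equals the bound exactly, at least one ordering yields an independent set of at least that size.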


Turán's Theorem can be shown directly from this result since, fixing the total
number of edges, $\sum 1/(d_i + 1)$ is minimized when the $d_i$'s are as nearly equal as
possible.

The second gem involves the tournament ranking function $h(n)$ given at the end
of section 1. Recall the Erdős-Moon bound


$$h(n) \le \frac12 \binom{n}{2} + c\,n^{3/2} (\ln n)^{1/2},$$


given by an elementary use of probabilistic methods. In Spencer (1972), as part of
the present author's dissertation, the lower bound


$$h(n) \ge \frac12 \binom{n}{2} + c\,n^{3/2}$$


was shown, with a probabilistic proof that we shall not discuss here. The gap of
$(\ln n)^{1/2}$ was resolved by this author in 1977. Using a very complicated proof,


$$h(n) \le \frac12 \binom{n}{2} + c\,n^{3/2}$$


(with a different constant) was shown. We present this result but with a far more
ingenious argument due to de la Vega (1983). We actually show that if $T$ is a
random tournament, then almost always


$$f(T, P) \le \frac12 \binom{n}{2} + c\,n^{3/2}$$


for all permutations $P$.

Recall that because there were $n!$ permutations in section 1 it was required that
the tail distribution be less than $1/n! \sim e^{-n \ln n}$, which required being $(n \ln n)^{1/2}$
standard deviations off the mean. We are guided by the notion that had there
been only $K^n$ permutations the $(\ln n)^{1/2}$ term would not have appeared.

It will be convenient to translate to a zero mean. Define the plusfit of a
permutation $P$ over a set of games to be the number of games in which the
lower-ranked player wins minus the number of games in which the higher-ranked
player wins. For any $P$ let $X_1(P)$ be the plusfit over the set of games between $P(i)$
and $P(j)$ where $1 \le P(i) \le n/2$, $n/2 < P(j) \le n$. Then $X_1(P)$ has distribution $S_m$,
where $m = n^2/4$ is the number of games played. Recall our large-deviation bound
$$\Pr[X_1(P) > \alpha m^{1/2}] < e^{-\alpha^2/2}.$$


Here, however, we do not really have $n!$ permutations $P$ to contend with. The
value $X_1(P)$ is determined by the partition of the players $\{1, \ldots, n\} = U_1 \cup U_2$
into top and bottom halves, $U_1 = \{i: P(i) \le n/2\}$, $U_2 = \{i: P(i) > n/2\}$. There are
$\binom{n}{n/2} < 2^n$ such partitions. Set $\alpha = (2 \ln 2)^{1/2} n^{1/2}$ so that $e^{-\alpha^2/2} = 2^{-n}$. Then almost
always


$$X_1(P) < \alpha m^{1/2} = n^{3/2} \left[(2 \ln 2)^{1/2} / 4^{1/2}\right]$$


for all permutations $P$.

Let $X_2(P)$ be the plusfit over the set of games between $P(i)$ and $P(j)$, where
either $1 \le P(i) \le n/4$ and $n/4 < P(j) \le n/2$ or $n/2 < P(i) \le 3n/4$ and $3n/4 <
P(j) \le n$. Then $X_2(P)$ has distribution $S_m$, where $m = n^2/8$ is the number of
games played. The value $X_2(P)$ is determined by the partition $\{1, \ldots, n\} =
U_1 \cup U_2 \cup U_3 \cup U_4$ of the players into four quarters and there are less than $4^n$ such
partitions. Set $\alpha = (2 \ln 4)^{1/2} n^{1/2}$ so that $e^{-\alpha^2/2} = 4^{-n}$. Then almost always
$X_2(P) < \alpha m^{1/2} = n^{3/2} \left[(2 \ln 4)^{1/2} / 8^{1/2}\right]$ for all permutations $P$.

Let $1 \le k \le \log_2 n$. A permutation $P$ gives a partition of the players into $2^{k-1}$
pairs of groups of size $n2^{-k}$. Let $X_k(P)$ be the plusfit over all games “across the
pairs”, i.e., all games between $P(i)$ and $P(j)$ where


$$2s(n2^{-k}) < P(i) \le (2s + 1)(n2^{-k}) < P(j) \le (2s + 2)(n2^{-k})$$


for some integer $s$, $0 \le s < 2^{k-1}$. The number of games $m = n^2/2^{k+1}$ goes down
sharply with $k$. The value $X_k(P)$ is determined by the partition of the players into
$2^k$ equal groups. There are far fewer than $(2^k)^n$ such partitions, a bound
increasing sharply with $k$. The number of partitions winds up in a square-root log
term (corresponding to the rapid decay of the tail of the normal distribution) and
is dominated by the number of games, represented by a square-root term. More
precisely, set $\alpha = (2 \ln 2^k)^{1/2} n^{1/2}$ so that $e^{-\alpha^2/2} = 2^{-kn}$. Then almost always


$$X_k(P) < \alpha m^{1/2} = n^{3/2} \left[(2 \ln 2^k)^{1/2} / (2^{k+1})^{1/2}\right]$$
for all permutations $P$. Moreover, with a little care we can show that these events
almost always occur simultaneously for all $k$.
Let $X(P)$ be the plusfit over all games. Almost always


$$X(P) = \sum_k X_k(P) < n^{3/2} \sum_{k=1}^{\infty} (2 \ln 2^k)^{1/2} (2^{-k-1})^{1/2} < 3.5\,n^{3/2}$$


by some calculation, the convergence reflecting the dominance discussed above.
Finally, the fit $f(T, P) = [X(P) + \binom{n}{2}]/2$, so that almost all $T$ have


$$f(T, P) < \frac12 \binom{n}{2} + 1.75\,n^{3/2}$$
for all $n!$ permutations $P$.
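
The constant 3.5 comes from summing the per-level bounds $(2 \ln 2^k)^{1/2} (2^{-k-1})^{1/2}$ over $k \ge 1$. A short numerical check follows; the function name and the truncation at 200 terms are arbitrary choices, the neglected tail being negligible.

```python
import math

def dyadic_sum(terms=200):
    """Partial sum of sum_{k>=1} (2 ln 2**k)**(1/2) * (2**-(k+1))**(1/2)."""
    return sum(math.sqrt(2 * k * math.log(2)) * math.sqrt(2.0 ** -(k + 1))
               for k in range(1, terms + 1))

print(dyadic_sum())
```

The terms decay geometrically (up to the slowly growing factor $\sqrt{k}$), which is the dominance of the number of games over the number of partitions described in the text.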

The final gem is due to Paul Erdős (1959). Over the past six decades Erdős has
played a leading role in the rapid development of combinatorial analysis. He
originated the probabilistic method with his 1947 paper on the Ramsey function
$R(k, k)$ and has steadily overseen its growth. Combining a peripatetic life-style
with a constant admonition to “prove and conjecture”, his ideas have spread
throughout the mathematical community. We have discussed the work of many
other mathematicians but all of us build on the theorems and the spirit of “Uncle
Paul”.

Let $g(G)$, the girth of a graph, denote the length of the minimal cycle of $G$ and
let $\alpha(G)$ and $\chi(G)$ denote, as usual, the independence number and chromatic
number of $G$, respectively.


Theorem 4.2. For all $k, g$ there exists a finite graph $G$ with
$$\chi(G) \ge k, \qquad g(G) \ge g.$$


This is a highly unintuitive result. If the girth is large one can construe no 
reason why the graph could not be colored with a few colors. Locally it is easy to 
three-color such a graph. We force the chromatic number up by global considera- 
tions. The most aesthetically pleasing aspect of this theorem is that probabilistic 
concepts do not enter at all into the statement, only the proof. 


Proof. Let $n$ be very large and let $G$ be a random graph on $\{1, \ldots, n\}$ where
each $\{i, j\}$ is placed in $G$ with independent probability $p$. A $t$-set $S$ is independent
with probability $(1 - p)^{\binom{t}{2}} \sim e^{-pt^2/2}$. There are $\binom{n}{t} < n^t$ such sets. The probability
that $\alpha(G) \ge t$ is at most $n^t e^{-pt^2/2} = (n e^{-pt/2})^t$. When $pt/2 > (1 + o(1))(\ln n)$ this
quantity is $o(1)$ so that $\alpha(G) < t$ almost always. Critically, $\chi(G) \ge n/\alpha(G)$, since
in any coloring each color class must be an independent set. Thus $\chi(G) > n/t$
almost always.

An elementary approach would be to select $t = n/k$ and $p = (2 + o(1))k(\ln n)/n$
so that $\chi(G) > k$ almost always. This does not work by itself since $G$ would then
have $\binom{n}{3} p^3 > c(\ln n)^3$ triangles. An alteration is needed, two inches off the waist.
Set $p = (4 + o(1))k(\ln n)/n$ so that $\alpha(G) < n/2k$ almost always. Let $X$ be the
number of cycles of length at most $g$. For $3 \le i \le g$ there are less than $n^i$ potential
$i$-cycles, each of which is in $G$ with probability $p^i$. By linearity of expectation


$$E(X) \le \sum_{i=3}^{g} (np)^i < c(\ln n)^g,$$


where $c$ depends on $g$. With $g$ fixed, $E(X) = o(n)$ so that $X < n/2$ almost always.
Almost always $\alpha(G) < n/2k$ and $G$ has fewer than $n/2$ cycles of size at most $g$. Fix
a $G$ with these properties. Delete from $G$ one vertex from each such cycle. The
remaining $G'$ has no small cycles, at least $n/2$ vertices, and $\alpha(G') < n/2k$. Then
$\chi(G') \ge |G'|/\alpha(G') > k$. □
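
The alteration step (delete a vertex from each short cycle) can be sketched on a small random graph. Everything here is illustrative: the names are hypothetical, the parameters are far too small for the asymptotic statement, and vertices are removed one at a time rather than all at once as in the proof.

```python
import random
from collections import deque

def random_graph(n, p, rng):
    """Each pair {u, v} is made an edge independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def short_cycle_bound(adj, root):
    """BFS from root; min of dist[u] + dist[v] + 1 over non-tree edges {u, v}.
    The minimum of this value over all roots is the girth."""
    dist, parent = {root: 0}, {root: None}
    queue, best = deque([root]), float('inf')
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], parent[v] = dist[u] + 1, u
                queue.append(v)
            elif parent[u] != v:
                best = min(best, dist[u] + dist[v] + 1)
    return best

def girth(adj):
    return min((short_cycle_bound(adj, v) for v in adj), default=float('inf'))

def delete_short_cycles(adj, g):
    """Remove vertices until no cycle of length at most g remains."""
    adj = {v: set(nb) for v, nb in adj.items()}
    while adj:
        # A root attaining the girth lies on a shortest cycle, so removing it destroys one.
        v = min(adj, key=lambda r: short_cycle_bound(adj, r))
        if short_cycle_bound(adj, v) > g:
            break
        for u in adj.pop(v):
            adj[u].discard(v)
    return adj
```

For example, `delete_short_cycles(random_graph(40, 0.1, random.Random(0)), 4)` returns an induced subgraph with no cycle of length at most 4.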
Appendix: Bounding of large deviations


We give here some basic bounds on large deviations that are useful when 
employing the probabilistic method. Our treatment is self-contained. Most of the 
results may be found in, or immediately derived from, the seminal paper of 
Chernoff (1952). While we are guided by asymptotic considerations the 
inequalities are proven for all values of the parameters in the specified region. 
The first result, while specialized, contains basic ideas found throughout this
appendix.


Theorem A.1. Let $X_i$, $1 \le i \le n$, be mutually independent random variables with


$$\Pr[X_i = +1] = \Pr[X_i = -1] = \tfrac12$$


and set, following the usual convention,
$$S_n = X_1 + \cdots + X_n.$$

Let $a > 0$. Then
$$\Pr[S_n > a] < e^{-a^2/2n}.$$


Remark. For large $n$ the Central Limit Theorem implies that $S_n$ is approximately
normal with zero mean and standard deviation $\sqrt{n}$. In particular, for any fixed $u$,


$$\lim_{n \to \infty} \Pr[S_n > u\sqrt{n}] = \int_u^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt,$$
which one can show directly is less than $e^{-u^2/2}$. Our proof, we emphasize once
again, is valid for all $n$ and all $a > 0$.
We require Markov's inequality which states: if $Y$ is an arbitrary nonnegative
random variable and $\alpha > 0$, then


$$\Pr[Y > \alpha E[Y]] < 1/\alpha.$$


Proof of Theorem A.1. Fix $n$, $a$ and let, for the moment, $\lambda > 0$ be arbitrary. For
$1 \le i \le n$


$$E[e^{\lambda X_i}] = (e^{\lambda} + e^{-\lambda})/2 = \cosh(\lambda).$$
We require the inequality
$$\cosh(\lambda) \le e^{\lambda^2/2},$$
valid for all $\lambda > 0$, the special case $\alpha = 0$ of Lemma A.5 below. (The inequality
may be more easily shown by comparing the Taylor series of the two functions
termwise.)
From the definition of $S_n$ we obtain
$$e^{\lambda S_n} = \prod_{i=1}^{n} e^{\lambda X_i}.$$


Since the $X_i$ are mutually independent so are the $e^{\lambda X_i}$. Therefore, expectations
multiply and


$$E[e^{\lambda S_n}] = \prod_{i=1}^{n} E[e^{\lambda X_i}] = [\cosh(\lambda)]^n \le e^{\lambda^2 n/2}.$$
We note that $S_n > a$ if and only if $e^{\lambda S_n} > e^{\lambda a}$ and apply Markov's inequality so that
$$\Pr[S_n > a] = \Pr[e^{\lambda S_n} > e^{\lambda a}] < E[e^{\lambda S_n}]/e^{\lambda a} \le e^{\lambda^2 n/2 - \lambda a}.$$
We set $\lambda = a/n$ to optimize the inequality: $\Pr[S_n > a] < e^{-a^2/2n}$ as claimed. □
By symmetry we immediately have:
Corollary A.2. Under the assumptions of Theorem A.1,
$$\Pr[|S_n| > a] < 2e^{-a^2/2n}.$$
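
Theorem A.1 can be compared with the exact tail for moderate $n$. The sketch below computes $\Pr[S_n > a]$ from binomial coefficients; the function name and the chosen values of $n$ and $a$ are illustrative assumptions.

```python
import math

def tail(n, a):
    """Exact Pr[S_n > a] for S_n a sum of n independent uniform +1/-1 variables."""
    # S_n = 2H - n with H distributed as Binomial(n, 1/2).
    favourable = sum(math.comb(n, h) for h in range(n + 1) if 2 * h - n > a)
    return favourable / 2 ** n

n = 30
for a in (2, 6, 10, 14):
    print(a, tail(n, a), math.exp(-a * a / (2 * n)))
```

In each printed row the exact tail (second column) lies below the bound $e^{-a^2/2n}$ (third column), as the theorem asserts for all $n$ and all $a > 0$.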


Our remaining results will deal with distributions X of the following prescribed 
type. 


Assumptions A.3.


$$p_1, \ldots, p_n \in [0, 1], \qquad p = (p_1 + \cdots + p_n)/n,$$
$X_1, \ldots, X_n$ mutually independent with


$$\Pr[X_i = 1 - p_i] = p_i, \qquad \Pr[X_i = -p_i] = 1 - p_i,$$
$$X = X_1 + \cdots + X_n.$$
Remark. Clearly $E[X] = E[X_i] = 0$. When all $p_i = \tfrac12$, $X$ has distribution $S_n/2$.
When all $p_i = p$, $X$ has distribution $B(n, p) - np$ where $B(n, p)$ is the usual
binomial distribution.


Theorem A.4. Under Assumptions A.3 and with $a > 0$
$$\Pr[X > a] < e^{-2a^2/n}.$$

Lemma A.5. For all reals $\alpha, \beta$ with $|\alpha| \le 1$,
$$\cosh(\beta) + \alpha \sinh(\beta) \le e^{\beta^2/2 + \alpha\beta}.$$


Proof. This is immediate if $\alpha = +1$, $\alpha = -1$, or $|\beta| \ge 100$. If the lemma were false
the function


$$f(\alpha, \beta) = \cosh(\beta) + \alpha \sinh(\beta) - e^{\beta^2/2 + \alpha\beta}$$


would assume a positive global maximum in the interior of the rectangle
$$R = \{(\alpha, \beta): |\alpha| \le 1,\ |\beta| \le 100\}.$$

Setting partial derivatives equal to zero we find
$$\sinh(\beta) + \alpha \cosh(\beta) = (\alpha + \beta)\, e^{\beta^2/2 + \alpha\beta},$$
$$\sinh(\beta) = \beta\, e^{\beta^2/2 + \alpha\beta},$$


and thus $\tanh(\beta) = \beta$, which implies $\beta = 0$. But $f(\alpha, 0) = 0$ for all $\alpha$, giving a
contradiction. □


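Lemma A.6. For all $\theta \in [0, 1]$ and all $\lambda$
$$\theta e^{\lambda(1-\theta)} + (1 - \theta) e^{-\lambda\theta} \le e^{\lambda^2/8}.$$
Proof. Setting $\theta = (1 + \alpha)/2$ and $\lambda = 2\beta$, Lemma A.6 reduces to Lemma A.5. □
Proof of Theorem A.4. Let, for the moment, $\lambda > 0$ be arbitrary. Then
$$E[e^{\lambda X_i}] = p_i e^{\lambda(1-p_i)} + (1 - p_i) e^{-\lambda p_i} \le e^{\lambda^2/8}$$


by Lemma A.6 and
$$E[e^{\lambda X}] = \prod_{i=1}^{n} E[e^{\lambda X_i}] \le e^{\lambda^2 n/8}.$$
Applying Markov's inequality,
$$\Pr[X > a] = \Pr[e^{\lambda X} > e^{\lambda a}] < E[e^{\lambda X}]/e^{\lambda a} \le e^{\lambda^2 n/8 - \lambda a}.$$


We set $\lambda = 4a/n$ to optimize the inequality: $\Pr[X > a] < e^{-2a^2/n}$ as claimed. □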
Again by symmetry we immediately have:
Corollary A.7. Under Assumptions A.3 and with $a > 0$
$$\Pr[|X| > a] < 2e^{-2a^2/n}.$$
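Under Assumptions A.3, with $\lambda$ arbitrary,


$$E[e^{\lambda X}] = \prod_{i=1}^{n} E[e^{\lambda X_i}] = \prod_{i=1}^{n} \left[p_i e^{\lambda(1-p_i)} + (1 - p_i) e^{-\lambda p_i}\right]$$


$$= e^{-\lambda pn} \prod_{i=1}^{n} \left[p_i e^{\lambda} + (1 - p_i)\right].$$


With $\lambda$ fixed the function
$$f(x) = \ln[x e^{\lambda} + 1 - x] = \ln[Bx + 1], \quad \text{with } B = e^{\lambda} - 1,$$
is concave and hence (Jensen's inequality)


$$\sum_{i=1}^{n} f(p_i) \le n f(p).$$


Exponentiating both sides gives


$$\prod_{i=1}^{n} \left[p_i e^{\lambda} + (1 - p_i)\right] \le \left[p e^{\lambda} + (1 - p)\right]^n,$$
so that

$$E[e^{\lambda X}] \le e^{-\lambda pn} \left[p e^{\lambda} + (1 - p)\right]^n. \qquad (A.8)$$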
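Theorem A.9. Under Assumptions A.3 and with $a > 0$

$$\Pr[X > a] < e^{-\lambda pn} \left[p e^{\lambda} + (1 - p)\right]^n e^{-\lambda a}$$
for all $\lambda > 0$.


Proof. $\Pr[X > a] = \Pr[e^{\lambda X} > e^{\lambda a}] < E[e^{\lambda X}]/e^{\lambda a}$. Now apply (A.8). □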


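Remark. For given $p$, $n$ and $a$, an optimal assignment of $\lambda$ in Theorem A.9 is
found by elementary calculus to be


$$\lambda = \ln\left[\left(\frac{1 - p}{p}\right) \left(\frac{a + np}{n - (a + np)}\right)\right].$$


This value is oftentimes too cumbersome to be useful. We employ suboptimal $\lambda$ to
achieve more convenient results.
Setting $\lambda = \ln[1 + a/pn]$ and using the fact that $(1 + a/n)^n \le e^a$, Theorem A.9 reduces to


$$\Pr[X > a] < \exp[a - pn \ln(1 + a/pn) - a \ln(1 + a/pn)]. \qquad (A.10)$$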
Theorem A.11.
$$\Pr[X > a] < e^{-a^2/2pn + a^3/2(pn)^2}.$$
Proof. With $u = a/pn$ apply the inequality

$$\ln(1 + u) \ge u - u^2/2,$$
valid for all $u \ge 0$, to the right-hand side of (A.10). □
Remarks. (1) When all $p_i = p$, $X$ has variance $np(1 - p)$. With $p = o(1)$ and
$a = o(pn)$ this bound reflects the approximation of $X$ by a normal distribution
with variance $\sim np$.


(2) The bound of Theorem A.11 hits a minimum at $a = 2pn/3$. For $a > 2pn/3$
we have the simple bound


$$\Pr[X > a] \le \Pr[X > 2pn/3] < e^{-2pn/27}.$$
Theorem A.12. For $\beta \ge 1$


$$\Pr[X > (\beta - 1)pn] < \left[e^{\beta - 1} \beta^{-\beta}\right]^{pn}.$$
Proof. Direct “plug in” to (A.10). □
Remark. $X + pn$ is the number of successes in $n$ independent trials when the
probability of success in the $i$th trial is $p_i$.


Theorem A.13. Under Assumptions A.3 and with $a > 0$


$$\Pr[X < -a] < e^{-a^2/2pn}.$$


Remark. One cannot simply employ “symmetry” as then the roles of $p$ and $1 - p$
are interchanged.


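Proof. Let $\lambda > 0$ be, for the moment, arbitrary. Then
$$E[e^{-\lambda X}] = \prod_{i=1}^{n} E[e^{-\lambda X_i}] = \prod_{i=1}^{n} \left[p_i e^{-\lambda(1-p_i)} + (1 - p_i) e^{\lambda p_i}\right]$$


$$= e^{\lambda pn} \prod_{i=1}^{n} \left[p_i e^{-\lambda} + (1 - p_i)\right].$$

With $\lambda$ fixed, the function

$$f(x) = \ln[x e^{-\lambda} + (1 - x)] = \ln[Bx + 1], \quad \text{with } B = e^{-\lambda} - 1,$$
is concave. (That $B$ is here negative is immaterial.) Thus

$$\sum_{i=1}^{n} f(p_i) \le n f(p).$$
Exponentiating both sides gives

$$E[e^{-\lambda X}] \le e^{\lambda pn} \left[p e^{-\lambda} + (1 - p)\right]^n,$$
analogous to (A.8). Then

$$\Pr[X < -a] = \Pr[e^{-\lambda X} > e^{\lambda a}] < e^{\lambda pn} \left[p e^{-\lambda} + (1 - p)\right]^n e^{-\lambda a},$$
analogous to Theorem A.9. We employ the inequality

$$1 + u \le e^u,$$
valid for all $u$, so that

$$p e^{-\lambda} + (1 - p) = 1 + (e^{-\lambda} - 1)p \le \exp[p(e^{-\lambda} - 1)]$$
and


$$\Pr[X < -a] < \exp[\lambda pn + np(e^{-\lambda} - 1) - \lambda a] = \exp[np(e^{-\lambda} - 1 + \lambda) - \lambda a].$$
We employ the inequality
$$e^{-\lambda} \le 1 - \lambda + \lambda^2/2,$$


valid for all $\lambda > 0$. (Note: the analogous inequality $e^{\lambda} \le 1 + \lambda + \lambda^2/2$ is not valid
for $\lambda > 0$ and so this method, when applied to $\Pr[X > a]$, requires an “error” term
such as is found in Theorem A.11.) Now


$$\Pr[X < -a] < e^{np\lambda^2/2 - \lambda a}.$$


We set $\lambda = a/np$ to optimize the inequality: $\Pr[X < -a] < e^{-a^2/2pn}$ as claimed. □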
1. Introduction


In this chapter we discuss some of the ways in which topology has been used 
in combinatorics. The emphasis is on methods for solving genuine combinatorial
problems that initially do not involve any topology (rather than on more
theoretical aspects of the combinatorics-topology connection), and the selection of
material reflects this aim.

The chapter is divided into two parts. In part I several examples are presented 
which illustrate different uses of topology in combinatorics. In part II we have 
gathered a number of tools which have proven useful for dealing with the topo- 
logical structure found in combinatorial situations. Also, a brief review of relevant 
parts of combinatorial topology is given. Part IT, which begins with section 9, is 
intended mainly for reference purposes. 

Among the examples in part I one can discern at least four ways in which 
topology enters the combinatorial sphere. Of course, it is in the nature of such 
comments that no rigid demarcation lines could or should be drawn. Also other 
connections exist between topology and combinatorics that follow different paths. 

(i) In the first three examples (sections 2—4) topology enters in the following 
way. First a relevant simplicial complex is identified in the combinatorial context. 
Then it is shown that this complex has sufficiently favorable properties to allow 
application of some theorem of algebraic topology, which implies the combinatorial
conclusion.

(ii) A different approach is seen in section 5 and in Bárány's proof in section 4.
There a combinatorial configuration is represented in concrete fashion in $\mathbb{R}^d$ or on
the $d$-sphere, and a topological result (Borsuk's Theorem) has the desired effect
on the configuration.

(iii) The case of oriented matroids (section 7) is unique. For these combina- 
torial objects there is a topological representation theorem, saying that oriented 
matroids are the same thing as arrangements of certain codimension one sub- 
spheres in a sphere. Of course, in this situation the topological perspective is 
always at hand as an alternative way of looking at these objects. Some
nontrivial properties of oriented matroids find particularly simple proofs in this
way.

(iv) The need for homotopy results in combinatorics sometimes arises as follows.
Say we want to define some property $\mathcal{P}$ at all vertices of a connected graph
$G = (V, E)$. We start by defining $\mathcal{P}$ at some root node $r$, and then give a rule
for how to define $\mathcal{P}$ at $v$'s neighbors, having already defined it at $v \in V$. The
problem of consistency arises: Can different paths from $r$ to $v$ lead to different
definitions of $\mathcal{P}$ at $v$? One strategy for dealing with this is to define “elementary
homotopies”, meaning certain pairs of paths which can be exchanged without
affecting the result (usually such pairs form small circuits such as triangles and 
squares). Then we need a “homotopy theorem” saying that any path from r to v 
can be deformed into any other such path using elementary homotopies. Tutte’s 
and Maurer’s homotopy theorems (section 6) are of this kind. From a topological 
point of view, the “elementary homotopies” mean that certain 2-cells are attached 
to the graph, and the homotopy theorem then says that the resulting 2-complex is
simply connected. 

Being topologically k-connected has a direct combinatorial meaning for k = 0 
(connected), and, as we have seen, also for k = 1 (simply connected). The way 
that higher connectivity influences combinatorics is more subtle; see the examples 
in sections 4 and 6. 

In section 8 a glimpse is given of the Hard Lefschetz Theorem and its appli- 
cations to combinatorics found by R. Stanley. The question here is of finding a 
complex projective variety whose topology (in the form of its cohomology ring) 
is relevant to the combinatorics at hand. This rarefied method has found a few 
striking applications. Since it deals more with algebraic-geometric matters (the 
topology is somewhat subordinate), section 8 is rather loosely connected with the 
rest of the chapter. 

Topological reasoning plays an important role in connection with several other
topics in discrete mathematics not treated here. Among these, let us mention:
embeddings of graphs in surfaces (see chapter 5 by Thomassen), convex polytopes
(see chapter 18 by Klee and Kleinschmidt and also Bayer and Lee 1993),
arrangements of subspaces (see Orlik and Terao 1992 and Björner 1994a),
group-related incidence geometries (diagram geometries, chamber systems, posets of
subgroups) (see Buekenhout 1995, Ronan 1989 and Webb 1987), computational
geometry and realization spaces (see Bokowski and Sturmfels 1989), lower bounds
for decision and computation trees (see chapter 32 by Alon and also Björner
1994a).

Notation and terminology is explained in part II. We treat simplicial complexes 
and posets almost interchangeably. The order complex of a poset and the poset 
of faces of a complex - these two constructions take posets to complexes and vice 
versa, and no ambiguity can arise from the topological point of view. 

This chapter was written in 1988, and was revised and updated in 1989 and 1993. 


PART I. EXAMPLES 
2. Evasive graph properties 


By a graph property we shall understand a property of graphs which is
isomorphism-invariant: if $G_1 \cong G_2$ then $G_1$ has the property if and only if $G_2$
does. The following discussion will concern simple graphs having some fixed vertex
set $V$. These graphs can be identified with the various subsets of $\binom{V}{2}$. Also,
it is convenient to identify a graph property with the subset of the power set $2^{\binom{V}{2}}$
which consists of all graphs having the property. A graph property $\mathcal{P} \subseteq 2^{\binom{V}{2}}$ is
called monotone if it is preserved under deletion of edges. It is called trivial if
either $\mathcal{P} = \emptyset$ or $\mathcal{P} = 2^{\binom{V}{2}}$.

In section 4.5 of chapter 23 by Bollobás the concept of complexity (sometimes
called “argument complexity”) of graph properties is discussed. Also, evasive graph
properties are defined as those of maximal complexity. The following result (stated
as Theorem 4.5.5 in chapter 23) confirms for prime-power number of vertices $n$ a
well-known conjecture.


Theorem 2.1 (Kahn, Saks and Sturtevant 1984). Let $n = p^k$ where $p$ is a prime.
Then every non-trivial monotone property of graphs with $n$ vertices is evasive.


We will sketch the proof of Kahn et al. to show the way in which topology is 
used. 

Suppose that card $V = p^k$, $p$ prime, and that $\mathcal{P} \ne \emptyset$ is a monotone nonevasive
graph property. $\mathcal{P}$ is a family of subsets of $\binom{V}{2}$ closed under the formation of
subsets, i.e., a simplicial complex. The conclusion we want to draw is that $\mathcal{P}$ is
trivial, which, since $\mathcal{P} \ne \emptyset$, must mean that $\binom{V}{2} \in \mathcal{P}$, i.e., topologically $\mathcal{P}$ is the
full simplex.

These two facts are crucial: 


2.2. The geometric realization $\|\mathcal{P}\|$ is contractible.


2.3. There exists a group $\Gamma$ of simplicial automorphisms of $\mathcal{P}$ which acts transitively
on $\binom{V}{2}$ and which has a normal $p$-subgroup $\Gamma_0$ such that $\Gamma/\Gamma_0$ is cyclic.


For (2.2) one argues that the monotone property $\mathcal{P}$ is not evasive in the
algorithmic sense defined above if and only if as a simplicial complex $\mathcal{P}$ is nonevasive in
the recursive sense of (11.1). By (11.1) all nonevasive complexes are contractible.

The group $\Gamma$ needed in (2.3) is constructed as follows. Identify $V$ with the
finite field GF$(p^k)$. Let $\Gamma = \{x \mapsto ax + b \mid a, b \in \mathrm{GF}(p^k),\ a \ne 0\}$ and
$\Gamma_0 = \{x \mapsto x + b \mid b \in \mathrm{GF}(p^k)\}$. The assumption that $\mathcal{P}$ is an isomorphism-invariant property
of graphs on $V$ means that if $\gamma$ is any permutation of $V$ (in particular, if $\gamma \in \Gamma$)
then $A \in \mathcal{P}$ if and only if $\gamma(A) \in \mathcal{P}$. Hence, $\Gamma$ is a group of simplicial
automorphisms of $\mathcal{P}$. One checks that $\Gamma$ is doubly transitive on $V = \mathrm{GF}(p^k)$ and that the
subgroup $\Gamma_0$ has the required properties.
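
The transitivity claim is easy to verify by machine in the simplest case. The sketch below assumes $k = 1$, so that GF$(p)$ is just the integers modulo a prime $p$ (general prime powers would need genuine finite-field arithmetic, which is not attempted here); the function name is hypothetical.

```python
from itertools import combinations

def pair_orbit(p):
    """Orbit of the pair {0, 1} under the maps x -> a*x + b (mod p), a != 0, p prime."""
    orbit = set()
    for a in range(1, p):
        for b in range(p):
            orbit.add(frozenset(((a * 0 + b) % p, (a * 1 + b) % p)))
    return orbit

for p in (3, 5, 7):
    all_pairs = {frozenset(c) for c in combinations(range(p), 2)}
    print(p, pair_orbit(p) == all_pairs)
```

Each line prints `True`: the image of $\{0, 1\}$ under $x \mapsto ax + b$ is $\{b, a + b\}$, and every pair of distinct residues has this form.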

By a theorem of Oliver (1975), any action of a finite group $\Gamma$, having a subgroup
$\Gamma_0$ with the stated properties, on a finite $\mathbb{Z}_p$-acyclic simplicial complex must have
stationary points. Since our complex $\mathcal{P}$ is $\mathbb{Z}_p$-acyclic (being contractible), this means
that there exists some point $x \in \|\mathcal{P}\|$ such that $\gamma(x) = x$ for all $\gamma \in \Gamma$. The point $x$
is carried by the relative interior of a unique face $G \in \mathcal{P}$ (the lowest-dimensional
face containing it), and the fact that $x$ is stationary implies that $\gamma(G) = G$ for all
$\gamma \in \Gamma$. But since $\Gamma$ is transitive on $\binom{V}{2}$ this is impossible unless $G = \binom{V}{2}$. Hence,
$\binom{V}{2} \in \mathcal{P}$, and we are done.

It has been conjectured that all non-trivial monotone graph properties are
evasive. This conjecture remains open for all non-prime-power n ≥ 10; the n = 6
case was verified by Kahn et al. (1984). The evasiveness conjecture has also been
proven for the case of bipartite graphs by Yao (1988), using the topological method.


3. Fixed points in posets


A poset P is said to have the fixed-point property if every order-preserving self-map
f : P → P has a fixed point x = f(x). It was shown by A. C. Davis and A. Tarski
that a lattice has the fixed-point property if and only if it is complete (meaning
that meets and joins exist for subsets of arbitrary cardinality). It has long been
an open problem to find some characterization of the finite posets which have
the fixed-point property. See Rival (1985) for references to work in this area. In
the absence of such a characterization, efforts have been directed toward finding
nontrivial classes of finite posets which have the fixed-point property. For this the
Lefschetz fixed-point theorem has proved to be useful.
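
For very small posets the definition can be tested directly. The sketch below
is a brute-force check, exponential in the size of the poset and meant only to
make the definition concrete; the function name and the encoding of the order
by a predicate leq are our own choices.

```python
from itertools import product

def has_fixed_point_property(elements, leq):
    """Does every order-preserving self-map of the finite poset fix a point?

    elements: list of hashable elements; leq(x, y): the order relation.
    """
    comparable = [(x, y) for x in elements for y in elements if leq(x, y)]
    for images in product(elements, repeat=len(elements)):
        f = dict(zip(elements, images))
        order_preserving = all(leq(f[x], f[y]) for x, y in comparable)
        if order_preserving and all(f[x] != x for x in elements):
            return False  # a fixed-point-free order-preserving map exists
    return True
```

A three-element chain passes the test, while a two-element antichain fails
(swap the two points), as does the four-element poset with two minimal elements
lying below two maximal elements.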

Let L be a finite lattice and z ∈ L. Then y is said to be a complement of z,
written y ⊥ z, if y ∧ z = 0̂ and y ∨ z = 1̂. Let Co(z) = {y ∈ L | y ⊥ z}. The lattice
L is called complemented if Co(z) ≠ ∅ for all z ∈ L.

A finite lattice L has the fixed point property, as is easy to see. It is more
interesting to look at the proper part L̄ = L − {0̂, 1̂} of the lattice, which may or
may not have the fixed point property. This is also natural from the point of view
of lattice automorphisms, for which every nontrivial fixed point must lie in L̄.


Theorem 3.1 (Baclawski and Björner 1979, 1981). Let L be a finite lattice and
z ∈ L̄. Then the poset L̄ − Co(z) has the fixed point property. In particular, if L is
noncomplemented then L̄ has the fixed point property.


By Theorem 10.15 the order complex Δ(L̄ − Co(z)) is contractible, and therefore
by Lefschetz’s Theorem 13.4 it has the topological fixed point property. From this
the result easily follows.

For example, let L be a finite Boolean lattice of order n. Then L̄ has (n − 1)!
fixed-point-free automorphisms, but the removal of any one element from L̄ leads
to a poset with the fixed point property.
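
The count (n − 1)! can be confirmed for small n. The sketch below assumes the
standard fact that every automorphism of a Boolean lattice is induced by a
permutation of its n atoms; it then counts the permutations that leave no
nonempty proper subset invariant (these are exactly the n-cycles). The function
name is ours.

```python
from itertools import combinations, permutations

def fixed_point_free_count(n):
    """Count permutations of n atoms that fix no nonempty proper subset setwise."""
    ground = range(n)
    proper = [frozenset(c) for r in range(1, n) for c in combinations(ground, r)]
    count = 0
    for sigma in permutations(ground):
        # sigma acts on a subset s by taking the image of each of its elements
        if all(frozenset(sigma[x] for x in s) != s for s in proper):
            count += 1
    return count
```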

The preceding argument is, of course, applicable to any Q-acyclic finite poset
[see (11.1) for some other combinatorially defined classes of such]. Also, with this
method one can prove more about the combinatorial structure of the fixed-point
sets P^f = {x ∈ P | x = f(x)} than merely that they are nonempty.

Let f : P → P be an order-preserving mapping of a finite Q-acyclic poset. Then
the Möbius function μ computed over P^f augmented by new bottom and top
elements must equal zero: μ(P^f) = 0. This follows from the Hopf trace formula,
see (13.5) and the comments following it. A consequence is that for instance two
or more incomparable points cannot alone form a fixed-point set in an acyclic
poset. For other finite posets with the fixed point property such fixed-point sets
are, however, possible.
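
The consequence about incomparable points is a one-line Möbius computation,
which the following sketch carries out. It computes μ over a finite poset with
a new bottom and top adjoined, using the usual recursion; the function name and
the encoding of the strict order by a predicate are our own.

```python
from functools import lru_cache

def mobius_augmented(elements, less_than):
    """mu(bottom, top) of the poset with a new bottom and a new top adjoined."""
    bottom, top = object(), object()
    everything = [bottom, *elements, top]

    def lt(x, y):
        if x is y or x is top or y is bottom:
            return False
        if x is bottom or y is top:
            return True
        return less_than(x, y)

    @lru_cache(maxsize=None)
    def mu(y):
        # mu(bottom, y) = -(sum of mu(bottom, z) over all z with bottom <= z < y)
        if y is bottom:
            return 1
        return -sum(mu(z) for z in everything if lt(z, y))

    return mu(top)
```

For an antichain of m points the value is m − 1, which vanishes only for
m = 1; a two-element chain gives 0.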

Similarly, let g : P → P be an order-reversing mapping of a finite Q-acyclic poset.
Then the Hopf trace formula (13.2) specializes to μ(P_g) = 0, where
P_g = {x ∈ P | x = g²(x) ≤ g(x)}. In particular, if no x ∈ P satisfies
x = g²(x) < g(x) then g has a unique fixed point. See Baclawski and Björner (1979)
for further details and examples.


4. Kneser’s Conjecture


Consider the collection of all n-element subsets of a (2n + k)-element set, n ≥ 1,
k ≥ 0. It is easy to partition this collection into k + 2 classes so that every pair
of n-sets within the same class has nonempty intersection. Can the same be done
with only k + 1 classes? M. Kneser conjectured in 1955 that the answer is negative,
and this was later confirmed by L. Lovász.
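
One way to obtain the easy partition into k + 2 classes is sketched below. The
text does not spell the construction out, so this is our assumption of the
standard one: on the ground set {1, ..., 2n + k}, class i (1 ≤ i ≤ k + 1)
collects the n-sets with smallest element i, and the last class collects the
n-sets contained in the remaining 2n − 1 elements, any two of which must meet.

```python
from itertools import combinations

def kneser_partition(n, k):
    """Split the n-subsets of {1, ..., 2n+k} into k+2 intersecting classes."""
    classes = [[] for _ in range(k + 2)]
    for s in combinations(range(1, 2 * n + k + 1), n):
        # sets whose minimum exceeds k+1 all go into the last class
        classes[min(min(s), k + 2) - 1].append(frozenset(s))
    return classes

def is_intersecting(family):
    """Do any two members of the family share an element?"""
    return all(a & b for a, b in combinations(family, 2))
```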


Theorem 4.1 (Lovász 1978). If the n-subsets of a (2n + k)-element set are
partitioned into k + 1 classes, then some class will contain a pair of disjoint n-sets.


Lovász’s proof relies on Borsuk’s Theorem 13.6 and homotopical connectivity
arguments. Soon after Lovász’s breakthrough a simpler way of deducing Kneser’s
Conjecture from Borsuk’s Theorem was discovered by Bárány (1978). However,
Lovász’s proof method is applicable also to other situations and hence perhaps
of greater general interest. See also chapter 24 by Frankl for a discussion of this
result.

Let us first sketch Bárány’s proof. By a theorem of Gale (1956) (see also Schrijver
1978), for n, k ≥ 1 there exist 2n + k points on the sphere S^k such that any open
hemisphere contains at least n of them. Partition the n-subsets of these points
into classes C_0, C_1, ..., C_k. For 0 ≤ i ≤ k, let O_i be the set of all points x ∈ S^k
such that the open hemisphere around x contains an n-subset from the class C_i.
Then (O_i)_{0≤i≤k} gives a covering of S^k by open sets. Part (i) of Borsuk’s Theorem
13.6 implies that one of the sets, say O_i, contains antipodal points. But the open
hemispheres around these points are disjoint and both contain n-subsets from the
class C_i. Hence, C_i contains a pair of disjoint n-sets.

For Lovász’s proof it is best to think of the problem in graph-theoretic terms.
Define a graph KG_{n,k} as follows: The vertices are the n-subsets of some fixed
(2n + k)-element set X and the edges are formed by the pairs of disjoint n-sets.
Then Theorem 4.1 can be reformulated: The Kneser graph KG_{n,k} is not
(k + 1)-colorable.
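
The smallest interesting instance, KG_{2,1} (the Petersen graph), can be checked
exhaustively. The sketch below builds the Kneser graph and searches all
colorings by brute force, so it is practical only for tiny parameters; the
function names are ours.

```python
from itertools import combinations, product

def kneser_graph(n, k):
    """Vertices: n-subsets of a (2n+k)-set; edges: pairs of disjoint n-sets."""
    vertices = [frozenset(c) for c in combinations(range(2 * n + k), n)]
    edges = [(a, b) for a, b in combinations(vertices, 2) if not (a & b)]
    return vertices, edges

def is_colorable(vertices, edges, colors):
    """Brute-force search for a proper coloring with the given number of colors."""
    index = {v: i for i, v in enumerate(vertices)}
    edge_idx = [(index[a], index[b]) for a, b in edges]
    for coloring in product(range(colors), repeat=len(vertices)):
        if all(coloring[a] != coloring[b] for a, b in edge_idx):
            return True
    return False
```

For n = 2, k = 1 the graph has 10 vertices and 15 edges; it admits no proper
2-coloring but does admit a 3-coloring, in agreement with Theorem 4.1 and with
the easy partition into k + 2 classes.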

For any graph G = (V, E) let 𝒩(G) denote the simplicial complex, called the
neighborhood complex, whose vertex set is V and whose simplices are those sets
of vertices which have a common neighbor (i.e., A ∈ 𝒩(G) iff there exists v ∈ V
such that {v, a} ∈ E for all a ∈ A). The topology of this complex has surprising
combinatorial content.


Theorem 4.2 (Lovász 1978). For any finite graph G, if 𝒩(G) is (k − 1)-connected,
then G is not (k + 1)-colorable.


To prove Theorem 4.1 it will then suffice to show that 𝒩(KG_{n,k}) is
(k − 1)-connected. This can be done as follows. Let P = {A ⊆ X | n ≤ card A ≤ n + k}.
Ordered by containment P is a subposet of the Boolean lattice B(X) of all subsets
of X. B(X) is shellable (11.10) (iv), hence by (11.2) and Theorem 11.14 P is
(k − 1)-connected. Let C be the crosscut of n-element sets. By Theorem 10.8, P and
the crosscut complex Γ(P, C) are homotopy equivalent. It follows that Γ(P, C), which
is the same thing as 𝒩(KG_{n,k}), is also (k − 1)-connected.

Figure 1.

The known proofs for Theorem 4.2 are more involved. A very elegant functorial
argument was given by Walker (1983a), which we will sketch here in briefest
possible fashion. The same general argument was also found by Lovász (unpublished
lecture notes) as a variation of his original proof.

Let G = (V, E) be a finite graph. The mapping ν : 𝒩(G) → 𝒩(G) defined by
ν(A) = {v ∈ V | {v, a} ∈ E for all a ∈ A} has the properties


(i) A ⊆ B implies ν(A) ⊇ ν(B), and (ii) ν²(A) ⊇ A.
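
Both properties are easy to observe on a small example. The sketch below is our
own illustration, with a graph encoded as a dictionary of adjacency sets; it
computes the common-neighbor map ν and lists the simplices of the neighborhood
complex.

```python
from itertools import combinations

def nu(adj, A):
    """Common neighbors of the vertex set A; adj maps each vertex to its neighbors."""
    result = set(adj)
    for a in A:
        result &= adj[a]
    return frozenset(result)

def neighborhood_complex(adj):
    """All nonempty vertex sets that have a common neighbor."""
    verts = list(adj)
    return [frozenset(c) for r in range(1, len(verts) + 1)
            for c in combinations(verts, r) if nu(adj, c)]
```

For the 5-cycle the complex has 10 simplices: the 5 vertices and the 5 pairs of
vertices at distance two. On each of them ν²(A) ⊇ A holds, here with equality.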


Let N(G) denote the order complex of the poset of fixed points of ν² ordered by
containment. Thus, N(G) is a subcomplex of the barycentric subdivision of 𝒩(G).
In fact, the subspace ||N(G)|| is by Corollary 10.12 a strong deformation retract of
||𝒩(G)||, so N(G) and 𝒩(G) are of the same homotopy type. This construction is
illustrated in fig. 1, where part (a) shows a graph G, (b) the neighborhood complex
𝒩(G), (c) its barycentric subdivision, and (d) the retract complex N(G).

Property (i) of the mapping ν : 𝒩(G) → 𝒩(G) shows that ν restricts to a simplicial
mapping ν : N(G) → N(G), and from property (ii) it follows that ν² = identity.
Hence, (N(G), ν) (or, to be precise, (||N(G)||, ||ν||)) is an antipodality space.
Furthermore, it can be shown that every graph map (mapping of the nodes which takes
edges to edges) φ : G₁ → G₂ induces an equivariant map N(G₁) → N(G₂). As
these facts suggest, the construction N(·) sets up a functor from the category of
finite graphs and graph maps to the category of antipodality spaces and homotopy
classes of equivariant maps, see Walker (1983a). For the example illustrated in
fig. 1(d), the induced antipodal mapping of N(G) coincides with its antipodal map
x ↦ −x as a circle.

For K_{k+1}, the complete graph on k + 1 vertices, one sees that N(K_{k+1})
is combinatorially the barycentric subdivision of 𝒩(K_{k+1}), the boundary of a
k-simplex. It is also easy to verify that, as an antipodality space, (N(K_{k+1}), ν)
is isomorphic to the sphere (S^{k−1}, α) with its standard antipodality map
α(x) = −x.

We now have all the ingredients for a proof of Theorem 4.2. Suppose that
a graph G is (k + 1)-colorable. This is clearly equivalent to the existence of a
graph map G → K_{k+1}. Hence, we deduce the existence of an equivariant map
N(G) → N(K_{k+1}) ≅ S^{k−1}. So by part (v) of Borsuk’s Theorem 13.6, we conclude
that N(G), and hence 𝒩(G), is not (k − 1)-connected.

Schrijver (1978) has shown, using Bárány’s method, that the conclusion of
Theorem 4.1 remains true for the class of n-subsets that contain no consecutive
elements i, i + 1 in circular order (mod 2n + k), and that this class is minimal
with this property. A different application of Theorem 4.2 is given in Lovász (1983).

The following generalized “Kneser” conjecture was made by P. Erdős in 1973
and has recently been proved.


Theorem 4.3 (Alon, Frankl and Lovász 1986). Let n, t ≥ 1 and k ≥ 0. If the
n-subsets of a (tn + (t − 1)k)-element set are partitioned into k + 1 classes, then some
class will contain t pairwise disjoint n-sets.


The proof is analogous to Lovász’s proof of Theorem 4.1. For general t-uniform
hypergraphs H a suitable neighborhood complex C(H) is defined. It is shown that
if t is a prime and C(H) is (k(t − 1) − 1)-connected then H is not (k + 1)-colorable.
To prove this for odd primes t the Bárány–Shlosman–Szűcs Theorem 13.8 is used
rather than Borsuk’s Theorem. It can be shown by an elementary argument that
if Theorem 4.3 is valid for two values of t then it is also valid for their product.
Hence one may assume that t is prime. See Alon et al. (1986) for the details.

Theorem 4.3 has been further generalized by Sarkaria (1990) to involve
“j-wise disjoint” instead of “pairwise disjoint” families of n-sets. The proof uses a
generalized Borsuk–Ulam theorem and the deleted join construction for simplicial
complexes (defined in section 9).


5. Discrete applications of Borsuk’s Theorem 


One of the most famous consequences of Borsuk’s Theorem 13.6 is undoubtedly
the Ham Sandwich Theorem 13.7. This result, or some version of the “ham sandwich”
argument which leads to it (outlined in connection with Theorem 13.7), can
be used in certain combinatorial situations to prove that composite configurations
can be split in a balanced way. Two examples of this, due to N. Alon and coauthors,
will be given in this section. Also, we discuss how Borsuk’s Theorem and its
generalizations have been used in connection with results of “Tverberg” type. For
other applications of Borsuk’s Theorem to combinatorics, see Bárány and Lovász
(1982), Yao and Yao (1985), and section 4. Surveys of this topic are given by Alon
(1988), Bárány (1993) and Bogatyi (1986).

Suppose that 2n points are given in general position in the plane R², half colored
red and the other half blue. It is an elementary problem to show that the red points 
can be connected to the blue points by n nonintersecting straight-line segments. 
A quick argument goes like this. Of the n! ways to match the blue and red points 
using straight-line segments, choose one which minimizes the sum of the lengths. 
If two of its lines intersect, they could be replaced by the sides of the quadrilateral 
that they span, and a new matching of even shorter length would result. No such 
elementary proof is known for the following generalization to higher dimensions. 


Theorem 5.1 (Akiyama and Alon 1989). Let A be a set of d·n points in general
position (no more than d points on any hyperplane) in R^d. Let
A = A₁ ∪ A₂ ∪ ··· ∪ A_d be a partition of A into d pairwise disjoint sets of size n.
Then there exist n pairwise disjoint (d − 1)-dimensional simplices, such that each
simplex intersects each set A_i in one of its vertices, 1 ≤ i ≤ d.
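
The elementary planar argument that precedes Theorem 5.1 (the case d = 2) is
easy to try out numerically. The sketch below enumerates all n! matchings,
keeps one of minimum total length, and tests it for crossings; the function
names are ours, and the crossing test assumes that no three of the points are
collinear.

```python
from itertools import combinations, permutations
from math import dist

def orient(p, q, r):
    """Sign of the turn p -> q -> r (positive for a left turn)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def cross(seg1, seg2):
    """Proper crossing test; sufficient when no three points are collinear."""
    (p1, p2), (q1, q2) = seg1, seg2
    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0
            and orient(q1, q2, p1) * orient(q1, q2, p2) < 0)

def shortest_matching(red, blue):
    """Try all n! red-blue matchings and keep one of minimum total length."""
    best = min(permutations(blue),
               key=lambda perm: sum(dist(r, b) for r, b in zip(red, perm)))
    return list(zip(red, best))

def is_noncrossing(matching):
    """True if no two segments of the matching cross."""
    return not any(cross(s, t) for s, t in combinations(matching, 2))
```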


The idea of Akiyama and Alon is to surround each point p ∈ A by a small ball of
radius ε, where ε is small enough that no hyperplane intersects more than d such
balls. Give each ball a uniform mass distribution of measure 1/n. Then each color
class A_i, 1 ≤ i ≤ d, is naturally associated with its n balls, forming a measurable
set of measure 1. By the Ham Sandwich Theorem 13.7 there exists a hyperplane H
which simultaneously bisects each color class. If n is odd, then H must intersect
at least one ball from each A_i. General position immediately implies that H must
intersect precisely one ball from each A_i, and in fact bisect this ball. By induction
on n, the points on each side of H can now be assembled into disjoint simplices,
and finally the points in H form one more such simplex. The argument if n is even
is similar, but in that case H might have to be slightly moved to divide the points
correctly for the induction step.

The next example has a more “applied” flavor. Suppose that k thieves steal a
necklace with k·n jewels. There are t kinds of jewels on it, with k·a_i jewels of type
i, 1 ≤ i ≤ t. The thieves want to divide the necklace fairly between them, wasting
as little as possible of the precious metal in the links between jewels. They need
to know in how many places they must cut the necklace. If the jewels of each
kind appear contiguously on the opened necklace, then at least t(k − 1) cuts must
be made. This number of cuts in fact always suffices. (Of course, what the thieves
really need is a fast algorithm for where to place these cuts.)


Theorem 5.2 (Alon and West 1986, Alon 1987). Every open necklace with k·a_i
beads of color i, 1 ≤ i ≤ t, can be cut in at most t(k − 1) places so that the
resulting segments can be arranged into k piles with exactly a_i beads of color i in
each pile, 1 ≤ i ≤ t.
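
For very short necklaces the bound of Theorem 5.2 can be observed by exhaustive
search. The sketch below is not the fast algorithm the thieves would want: it
tries every set of cut positions and every assignment of segments to piles, and
is exponential in the length of the necklace. The function name and the
encoding of a necklace as a string of color letters are our own.

```python
from collections import Counter
from itertools import combinations, product

def min_fair_cuts(necklace, k):
    """Smallest number of cuts allowing a fair division among k thieves."""
    total = Counter(necklace)
    assert all(v % k == 0 for v in total.values())
    target = Counter({color: v // k for color, v in total.items()})
    n = len(necklace)
    for cuts in range(n):
        for positions in combinations(range(1, n), cuts):
            bounds = (0,) + positions + (n,)
            segments = [necklace[i:j] for i, j in zip(bounds, bounds[1:])]
            for owners in product(range(k), repeat=len(segments)):
                piles = [Counter() for _ in range(k)]
                for segment, owner in zip(segments, owners):
                    piles[owner].update(segment)
                if all(pile == target for pile in piles):
                    return cuts
    return None
```

With the colors arranged contiguously, as in "aabb" for k = 2 or "aaabbb" for
k = 3, the search returns exactly t(k − 1) cuts; for "abab" a single cut
suffices.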


The idea for the proof is to turn the situation into a continuous problem by
placing the open necklace (scaled to length 1) on the unit interval, and then to
use a “ham-sandwich”-type argument there. For k = 2 this was done in Alon and
West (1986) using Borsuk’s Theorem. The extension to general k was achieved in
Alon (1987) using the Bárány–Shlosman–Szűcs Theorem 13.8.

Radon’s Theorem, a well-known result in convexity theory, says that any
collection of d + 2 points in R^d can be split into two nonempty blocks whose convex
hulls have nonempty intersection. This was generalized by Tverberg (1966) as follows:
For all p ≥ 2 and d ≥ 1, any set of (p − 1)(d + 1) + 1 points in R^d can be partitioned
into p blocks B₁, ..., B_p so that conv(B₁) ∩ ··· ∩ conv(B_p) ≠ ∅. For a quite short
proof of Tverberg’s Theorem, see Sarkaria (1992). Results of the Radon–Tverberg
type have generated a lot of interest, and recent work shows that in many cases
such results rely on topological foundations that lead to formulations more general
than the original ones in terms of convexity. See Eckhoff (1979) and Bárány (1993)
for surveys of results of this kind.
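
In the linear setting a Radon partition can be computed directly. The sketch
below uses the standard linear-algebra argument, which is not the topological
route discussed in this section: the d + 2 points satisfy a nontrivial relation
Σ a_i y_i = 0 with Σ a_i = 0, and the signs of the coefficients give the two
blocks. Exact rational arithmetic is used; the function names are ours.

```python
from fractions import Fraction

def null_vector(rows, ncols):
    """A nonzero rational solution of rows * a = 0 (needs ncols > len(rows))."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for c in range(ncols):
        pr = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pr is None:
            continue
        m[r], m[pr] = m[pr], m[r]
        m[r] = [v / m[r][c] for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = next(c for c in range(ncols) if c not in pivots)
    a = [Fraction(0)] * ncols
    a[free] = Fraction(1)          # set one free variable to 1, the others to 0
    for row, pc in zip(m, pivots):
        a[pc] = -row[free]
    return a

def radon_partition(points):
    """Index blocks and a common point of their convex hulls, for d+2 points in R^d."""
    d = len(points[0])
    assert len(points) == d + 2
    rows = [[p[j] for p in points] for j in range(d)] + [[1] * (d + 2)]
    a = null_vector(rows, d + 2)
    pos = [i for i, c in enumerate(a) if c > 0]
    neg = [i for i, c in enumerate(a) if c <= 0]
    total = sum(a[i] for i in pos)
    common = tuple(sum(a[i] * points[i][j] for i in pos) / total for j in range(d))
    return pos, neg, common
```

For the four planar points (0,0), (4,0), (0,4), (1,1) the blocks are the inner
point and the three corners of the triangle containing it, with common point
(1,1).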

Radon’s theorem can be obtained as a consequence of Borsuk’s Theorem, as was
shown by Bajmóczy and Bárány (1979). Here is the connection. Let Δ^d denote the
d-dimensional simplex. Bajmóczy and Bárány prove that there exists a continuous
map g : S^d → Δ^{d+1} such that the supports of g(x) and g(−x) are disjoint for every
x ∈ S^d. Suppose now that Radon’s Theorem is false; say it fails for the points
y₁, ..., y_{d+2} in R^d. Define f : Δ^{d+1} → R^d by sending the ith vertex of Δ^{d+1}
to y_i and extending linearly. Then the map f ∘ g : S^d → R^d would violate the
Borsuk–Ulam Theorem 13.6 (ii).

In the preceding argument the map f could as well be an arbitrary continuous
map (i.e., not necessarily linear). In a similar way, using Theorem 13.8 instead
of Borsuk’s Theorem, Bárány, Shlosman and Szűcs (1981) proved the following
“topological Tverberg theorem”: Suppose that f : Δ^N → R^d is a continuous
mapping, where N = (p − 1)(d + 1) and p is prime. Then there exist p pairwise disjoint
faces σ₁, ..., σ_p of Δ^N such that f(σ₁) ∩ ··· ∩ f(σ_p) ≠ ∅. It is still unknown
whether the restriction to prime p is needed here in the non-linear case. See Sarkaria
(1991b) for even more general results of this kind.

The following result has the general flavor of Tverberg’s Theorem, and goes in 
an opposite direction from Theorem 5.1. 


Theorem 5.3 (Živaljević and Vrećica 1992). Let A = A₁ ∪ A₂ ∪ ··· ∪ A_{d+1} be a set
of points in R^d partitioned into d + 1 pairwise disjoint sets (color classes) of size
|A_i| ≥ 4n − 1. Then there exist n pairwise disjoint (d + 1)-subsets B₁, ..., B_n of A
such that |A_i ∩ B_j| = 1 for all i, j and conv(B₁) ∩ ··· ∩ conv(B_n) ≠ ∅.


The proof for this “colored Tverberg theorem” uses a Borsuk–Ulam-type result
for free Z_p-actions, p prime, which establishes the non-existence of an equivariant
map from a certain “configuration space” of sufficiently high connectivity to a
sphere of appropriate dimension.

It has been conjectured by Bárány and Larman that |A_i| ≥ n suffices in Theorem
5.3. This has been proven for d = 2 by Bárány and Larman and for n = 2 by
Lovász, whose proof uses Borsuk’s theorem. See Živaljević and Vrećica (1992) for
these references and for a fuller discussion of the status of this “colored Tverberg
problem”.


6. Matroids and greedoids 


This section and the next are devoted to certain topological aspects of matroids
and of two related structures: oriented matroids and greedoids. For the basic
definitions see chapter 9 by Welsh. Additional topological facts about matroid
complexes and geometric lattices are mentioned in (11.10); see also Björner (1992).


Basis complexes and partitions of graphs 


The following result was proven by E. Győri and L. Lovász in response to a
conjecture by A. Frank and S. Maurer.


Theorem 6.1 (Lovász 1977, Győri 1978). Let G = (V, E) be a k-connected graph,
{v₁, v₂, ..., v_k} a set of k vertices, and n₁, n₂, ..., n_k positive integers with
n₁ + n₂ + ··· + n_k = |V|. Then there exists a partition {V₁, V₂, ..., V_k} of V such
that v_i ∈ V_i, |V_i| = n_i and V_i spans a connected subgraph of G, i = 1, 2, ..., k.
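
On a small graph the statement of Theorem 6.1 can be confirmed by exhaustive
search. The sketch below is a brute-force illustration of the statement only,
unrelated to either proof; the function names and the adjacency-set encoding
are our own.

```python
from itertools import product

def connected(adj, part):
    """Is the subgraph induced on the vertex set 'part' nonempty and connected?"""
    part = set(part)
    if not part:
        return False
    start = next(iter(part))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & part:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == part

def find_partition(adj, roots, sizes):
    """Search all labelings for parts with the prescribed roots and sizes."""
    verts = sorted(adj)
    k = len(roots)
    for labels in product(range(k), repeat=len(verts)):
        parts = [{v for v, lab in zip(verts, labels) if lab == i} for i in range(k)]
        if all(roots[i] in parts[i] and len(parts[i]) == sizes[i]
               and connected(adj, parts[i]) for i in range(k)):
            return parts
    return None
```

For the triangular prism, which is 3-connected, a partition is found for every
choice of three positive sizes summing to 6.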


The proof of Lovász uses topological methods, that of Győri does not. At the
end of this section Lovász’s proof will be outlined for the case k = 3 in order to
illustrate its use of topological reasoning. It relies on the connectivity of a certain
polyhedral complex associated with certain forests in G. Similar complexes can
be defined over the bases of a matroid, and more generally over the bases of a
greedoid. The greedoid formulation contains the others as special cases, and we
shall use it to develop the general result. We begin by recalling the definition.

A set system (E, ℱ), ℱ ⊆ 2^E, is called a greedoid if the following axioms are
satisfied:

(G1) ∅ ∈ ℱ,

(G2) for all nonempty A ∈ ℱ there exists an x ∈ A such that A − x ∈ ℱ,

(G3) if A, B ∈ ℱ and |A| > |B|, then there exists an x ∈ A − B such that
B ∪ x ∈ ℱ.

If also the extra condition (G4) is satisfied, then (E, ℱ) is called an interval
greedoid:

(G4) if A ⊆ B ⊆ C where A, B, C ∈ ℱ and A ∪ x, C ∪ x ∈ ℱ for some x ∈ E − C,
then also B ∪ x ∈ ℱ.

The sets in ℱ are called feasible and the maximal feasible sets bases. All bases
have the same cardinality r, which is the rank of the greedoid.
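
The axioms (G1)–(G3) translate directly into a checker for small set systems.
The sketch below is our own illustration: feasible sets are encoded as
frozensets, and the second function generates the branching greedoid of a small
rooted graph, with edges given as pairs of vertices.

```python
def is_greedoid(feasible):
    """Check axioms (G1)-(G3) for a family of frozensets."""
    F = set(feasible)
    if frozenset() not in F:                                   # (G1)
        return False
    for A in F:                                                # (G2)
        if A and not any(A - {x} in F for x in A):
            return False
    for A in F:                                                # (G3)
        for B in F:
            if len(A) > len(B) and not any(B | {x} in F for x in A - B):
                return False
    return True

def branching_greedoid(edges, root):
    """Feasible sets: edge sets forming a tree that contains the root."""
    feasible = {frozenset()}
    frontier = [(frozenset(), frozenset([root]))]
    while frontier:
        tree, reached = frontier.pop()
        for e in edges:
            u, v = e
            if (u in reached) != (v in reached):   # e attaches one new vertex
                new_tree = tree | {e}
                if new_tree not in feasible:
                    feasible.add(new_tree)
                    frontier.append((new_tree, reached | {u, v}))
    return feasible
```

For a triangle rooted at one vertex this gives six feasible sets. The edge
opposite the root is not feasible by itself although it lies in two bases, so
the feasible sets are not closed under taking subsets.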

The only examples which will be of concern here are matroids (feasible sets =
independent sets) and branching greedoids of rooted graphs (feasible sets = edge
sets which form a tree containing the root node). Both are interval greedoids. For
other examples and further information about greedoids, see chapter 9 by Welsh
and the expository accounts Korte, Lovász and Schrader (1991) and Björner and
Ziegler (1992).

The feasible sets of a greedoid do not form a simplicial complex other than in the
matroid case. However, a useful topology is given by (the order complex of) the
poset ℱ̄ = ℱ − {∅}, ordered by inclusion. A greedoid (E, ℱ) is called k-connected
if for each A ∈ ℱ there exists B ∈ ℱ with A ⊆ B, |B − A| = min(k, r − |A|) and
such that C ∈ ℱ for every A ⊆ C ⊆ B. Matroids are r-connected, and the branching
greedoid of a k-connected rooted graph is k-connected.


Proposition 6.2 (Björner, Korte and Lovász 1985). Let (E, ℱ) be a k-connected
interval greedoid (k ≥ 2). Then the poset of feasible sets (ℱ̄, ⊆) is (topologically)
(k − 2)-connected.


This result follows from (11.10) (iii) via Theorem 10.8, since for the crosscut C
of minimal elements in ℱ̄ the crosscut complex Γ(ℱ̄, C) is a matroid complex of
rank ≥ k.

Let ℬ be the collection of all bases in a greedoid (E, ℱ) of rank r. Two bases B₁
and B₂ are adjacent if B₁ ∩ B₂ ∈ ℱ and |B₁ ∩ B₂| = r − 1. Attaching edges between
all adjacent pairs we get a graph with vertex set ℬ, the basis graph.
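
The adjacency rule is simple to implement for a greedoid given by an explicit
list of feasible sets. The following sketch is our own illustration and assumes
that the input really is a greedoid, so that the bases are the feasible sets of
maximum size.

```python
from itertools import combinations

def basis_graph(feasible):
    """Bases and adjacent pairs of a greedoid given by its feasible sets."""
    F = set(feasible)
    r = max(len(A) for A in F)
    bases = [A for A in F if len(A) == r]
    edges = [(B1, B2) for B1, B2 in combinations(bases, 2)
             if len(B1 & B2) == r - 1 and (B1 & B2) in F]
    return bases, edges
```

For the uniform matroid of rank 2 on 4 points this yields 6 bases and 12 edges
(the graph of the octahedron). For the branching greedoid of a rooted triangle
the three bases form a path rather than a triangle, because one pairwise
intersection is not feasible.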

The shortest circuits in the basis graph can be explicitly described. There are 
two kinds of triangles and one kind of square (quadrilateral): 


6.3. Three bases A ∪ x, A ∪ y, A ∪ z, where A ∈ ℱ, |A| = r − 1, span a triangle of
the first kind.


6.4. Three bases A ∪ x ∪ y, A ∪ x ∪ z, A ∪ y ∪ z, where A ∈ ℱ, |A| = r − 2, span a
triangle of the second kind.


6.5. Four bases A ∪ x ∪ u, A ∪ x ∪ v, A ∪ y ∪ u, A ∪ y ∪ v, where A ∈ ℱ, |A| = r − 2,
span a square.


For branching greedoids triangles of the second kind cannot occur. 

Now, attach a 2-cell (a “membrane”) into each triangle and square. This gives
a 2-dimensional regular cell complex 𝒦, which we call the basis complex.

It is a straightforward combinatorial exercise to check that the basis complex of
any 2-connected greedoid of rank ≤ 2 is 1-connected (i.e., connected and simply
connected). For rank 2 (the only non-trivial case) this follows directly from the
exchange axiom (G3). In higher ranks the following is true.


Theorem 6.6 (Björner, Korte and Lovász 1985). The basis complex 𝒦 of any
3-connected interval greedoid is 1-connected.


In order to illustrate some of the tools given in part II, we give a short proof
of this. Let P be the poset of closed cells of 𝒦 ordered by inclusion, and let Q be
the top three levels of (ℱ, ⊆), i.e., the feasible sets of ranks r − 2, r − 1 and r. Let
f : P → Q be the order-reversing map which sends each cell τ to the intersection
of the bases which span τ. By Proposition 6.2 and Lemma 11.12 the poset Q is
1-connected, so by Theorem 10.5 we only have to check that the fibers f⁻¹(Q_{≥A})
are 1-connected for all A ∈ Q. But if r(A) = r − i, i = 0, 1, 2, then f⁻¹(Q_{≥A}) is the
basis complex of the rank i greedoid obtained by contracting A, and we have
already checked that basis complexes of rank ≤ 2 greedoids are 1-connected.

Let P = B₁B₂···B_k and Q = B_kB_{k+1}···B_n be paths in the basis graph of a
matroid, and let PQ = B₁B₂···B_kB_{k+1}···B_n be their concatenation. Say that paths
PQ and PRQ differ by an elementary homotopy if R is of the form BCB, BCDB
or BCDEB with B = B_k.


Theorem 6.7 (Maurer 1973). Let P and P′ be any two paths with the same
endpoints in the basis graph of a matroid. Then P can be transformed into P′ via a
sequence of elementary homotopies.


Maurer’s “Homotopy Theorem” 6.7 is clearly a combinatorial reformulation of 
Theorem 6.6 in the matroid case. An application to oriented matroids will be given 
in the next section. 

The time has come to return to Theorem 6.1. The following outline of the proof
for the k = 3 case is quoted from Lovász (1979) (with some adjustments in square
brackets to better suit the present discussion):

“So let G be a 3-connected graph, v₁, v₂, v₃ ∈ V(G) and n₁ + n₂ + n₃ = |V(G)|.
Take a new point a and connect it to v₁, v₂, and v₃. Consider the topological space
𝒦 constructed for this new graph G′. [In our language, 𝒦 is the basis complex
of the branching greedoid determined by the rooted graph (G′, a). This greedoid,
whose bases are the spanning trees of G′, is 3-connected.] For each spanning tree
T of G′, let f_i(T) denote the number of points in T accessible from a along the
edge (a, v_i) (i = 1, 2). Then the mapping


f : T ↦ (f₁(T), f₂(T))


maps the vertices of 𝒦 onto lattice points of the plane. Let us subdivide each
quadrilateral 2-cell in 𝒦 by a diagonal into two triangles; in this way we obtain
a triangulation 𝒦′ of 𝒦. Extend f affinely to each such triangle so as to obtain a
continuous mapping of 𝒦 into the plane. Obviously, the image of 𝒦 is contained in
the triangle Δ = {x ≥ 0, y ≥ 0, x + y ≤ n}. We are going to show that the mapping
is onto Δ.

“Let us pick three spanning trees, T₁, T₂, T₃ first such that f(T₁) = (n, 0),
f(T₂) = (0, n), f(T₃) = (0, 0). Obviously, such trees exist. Next, by applying [the fact
that the basis graph of a 2-connected greedoid is connected] to the graph G′ − (a, v₃),
we select a polygon P₁₂ in 𝒦 connecting T₁ to T₂ and having f₃(x) = 0 at all
points. Thus f(P₁₂) connects (n, 0) to (0, n) along the side of the triangle Δ with
these endpoints. Let P₂₃ and P₃₁ be defined analogously.

“By Theorem 6.6, P₁₂ + P₂₃ + P₃₁ can be contracted in 𝒦 to a single point.
Therefore, f(P₁₂) + f(P₂₃) + f(P₃₁) can be contracted in f(𝒦) to a single point. But
‘obviously’ (or, rather, by applying the well-known fact [Brouwer’s Theorem 13.1] that
the boundary of a triangle cannot be contracted to a single point in the triangle
with one interior point taken out), f(𝒦) must cover the whole triangle Δ. So in
particular the point (n₁, n₂) belongs to the image of 𝒦, and therefore it belongs to
the image of a triangle of 𝒦′. But it is easy to see that this implies that (n₁, n₂) is
the image of one of the vertices of 𝒦; i.e., there exists a spanning tree T with


f₁(T) = n₁, f₂(T) = n₂.


The three components of T − a now yield the desired partition of V(G).”

Theorem 6.6 is a special case of a more general result saying that for any
k-connected interval greedoid a certain higher-dimensional basis complex is
(k − 2)-connected. This more general result implies Theorem 6.1 for arbitrary k by
extension of the ideas we have just seen in the k = 3 case. See Lovász (1977) and
Björner, Korte and Lovász (1985) for complete details.


Tutte’s Homotopy Theorem 


A matroid is called regular if it can be coordinatized over every field. In Tutte
(1958) a characterization is given of regular matroids in terms of forbidden
minors. The proof relies in an essential way on a “Homotopy Theorem”, expressing
the 1-connectivity of certain 2-dimensional complexes. Tutte’s Homotopy Theorem
was also used by R. Reid and R. Bixby to prove the forbidden minor
characterization for representability over GF(3). More recently other proofs of these
results, avoiding use of the Homotopy Theorem, have been found by P. Seymour and
others. See chapter 10 by Seymour for an up-to-date account.

Tutte’s Homotopy Theorem seems to be the oldest topological result of its kind
in combinatorics. Unfortunately it is quite technical both to state in full and to
prove. Here we shall state the Homotopy Theorem in sufficient detail that the
nature of the result can be understood. Complete details can be found in Tutte
(1958) and Tutte (1965).

Let L be a finite geometric lattice of rank r, and write L^i for the set of flats of
rank i; so L^{r−1} is the set of copoints, L^{r−2} the colines and L^{r−3} the coplanes.
Flats X ∈ L will be thought of as subsets of the point set L¹ via X = {p ∈ L¹ | p ≤ X}.

Given any point a ∈ L¹ we define a graph TG(L, a) on the vertex set
L_a^{r−1} = {X ∈ L^{r−1} | X ≱ a} as follows: two copoints X and Y “off a” (i.e., in
the set L_a^{r−1}) span an edge if X ∧ Y is a coline and X ∪ Y ≠ L¹ − a. On this graph
we construct a 2-dimensional regular cell complex TC(L, a) by attaching 2-cells into
the triangles and squares of the following kinds:


6.8. Triangles XYZX for which rk(X ∧ Y ∧ Z) ≥ r − 3.


6.9. Squares XYZTX for which rk(P) = r − 3, where P = X ∧ Y ∧ Z ∧ T, and
either the coline P ∨ a is covered by exactly two copoints or else the interval [P, 1̂]
is isomorphic to the lattice of flats of the Fano matroid F₇ minus one of its points.


If L has no minor isomorphic to F₇*, the dual of the Fano matroid, then (6.8)
and (6.9) describe all the 2-cells of the Tutte complex TC(L, a). [This means that
for use in representation theory the definition (6.8)–(6.9) of TC(L, a) is sufficient.]
In general it is necessary to attach 2-cells also into certain squares XYZTX for
which rk(X ∧ Y ∧ Z ∧ T) = r − 4. The definition of these squares (of the “corank
4 kind”) is fairly complicated, so we refrain from describing them here.


Theorem 6.10 (Homotopy Theorem, Tutte 1958). The complex TC(L, a) is
1-connected.


The combinatorial meaning of Theorem 6.10 is that any two copoints X and
Y “off a” can be connected “off a” by a path in the Tutte graph TG(L, a), and
that any two such paths differ by a sequence of elementary homotopies of type
XYX, XYZX as in (6.8), or XYZTX as in (6.9) or of the corank 4 kind. (Compare
the discussion preceding Theorem 6.7.)

The given formulation of the Homotopy Theorem differs in form but not in
content from the statement in Tutte (1958). Tutte has remarked about his theorem
(Tutte 1979, p. 446) that “the proof ... is long, but it is purely graph-theoretical
and geometrical in nature. I am rather surprised that it seems to have acquired a
reputation for extreme difficulty”. No significant simplification of the original proof
seems to be known, other than in special cases. One such case is if X ∪ Y ≠ L¹ − a
for all pairs X, Y of copoints “off a” such that X ∧ Y is a coline. Then the top
three levels of L − [a, 1̂] form a poset which is 1-connected by (11.10) (iv), (11.2)
and Theorem 11.14, and the 1-connectivity can be transferred to TC(L, a) by an
application of the Fiber Theorem 10.5, similar to the proof of Theorem 6.6. A
simpler and more conceptual proof of Tutte’s Theorem in full strength would be
of definite interest.

Unfortunately the available space does not permit a thorough explanation of
how Theorem 6.10 is used in representation theory. Here is a briefest possible
sketch of the idea. Tutte’s proof of sufficiency for his characterization of regular
matroids runs by induction on the size of the ground set (that is why it is of interest
to delete the point a). Roughly speaking, the “regular” coordinatization lives on
the copoints, and its value at the new point a is extended from one copoint in
L_a^{r−1} to another via paths in the Tutte graph TG(L, a). The Homotopy Theorem is
then needed to check that different paths do not lead to contradictions. A similar
idea is illustrated in greater detail in the proof of Theorem 7.6.


7. Oriented matroids 


Two topics from the theory of oriented matroids will be discussed in this scc- 
tion. Most important is the topological representation theorem of Folkman and 
Lawrence (1978), which states that every oriented matroid can be realized by an 
arrangement of pscudospheres. As an application we show how such realizations 
lead to quick proofs of some combinatorial properties of rank 3 oriented ma- 
troids. Second, we sketch (following Las Vergnas 1978) how Maurer’s Homotopy 
Theorem 6.7 can be used to deduce the existence of a determinantal sign function. 

Oriented matroids are defined in chapter 9 by Welsh. Since we will use a slightly 
different formulation of the concept (due to Folkman and Lawrence 1978) and 
need to refer to the linear case for motivation, we will start with a quick review
of the basics, which will also serve to fix notation. More extensive treatments can
be found in the monographs Bachem and Kern (1992) and Björner, Las Vergnas,
Sturmfels, White and Ziegler (1993).

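Let $E$ be a finite set with a fixed-point free involution $x \mapsto x^*$ (i.e., $x^* \neq x = x^{**}$
for all $x \in E$). Write $A^* = \{x^* \mid x \in A\}$, for subsets $A \subseteq E$. An oriented matroid
$\mathcal{O} = (E, *, \mathcal{C})$ is such a set together with a family $\mathcal{C}$ of nonempty subsets such that

(OM1) $\mathcal{C}$ is a clutter (i.e., $C_1 \neq C_2$ implies $C_1 \not\subseteq C_2$ for all $C_1, C_2 \in \mathcal{C}$);

(OM2) if $C \in \mathcal{C}$ then $C^* \in \mathcal{C}$ and $C \cap C^* = \emptyset$;

(OM3) if $C_1, C_2 \in \mathcal{C}$, $C_1 \neq C_2^*$ and $x \in C_1 \cap C_2^*$, then there exists $D \in \mathcal{C}$ such that
$D \subseteq C_1 \cup C_2 - \{x, x^*\}$.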

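The sets in $\mathcal{C}$ are called circuits of the oriented matroid $\mathcal{O}$. For elements $x \in E$
let $\underline{x} = \{x, x^*\}$, and let $\underline{A} = \{\underline{x} \mid x \in A\}$, $A \subseteq E$, and $\underline{\mathcal{C}} = \{\underline{C} \mid C \in \mathcal{C}\}$. The system
$\underline{\mathcal{C}}$ satisfies the usual matroid circuit-exchange axioms, so $\underline{\mathcal{O}} = (\underline{E}, \underline{\mathcal{C}})$ is a matroid,
called the underlying matroid of $\mathcal{O}$. Not all matroids arise from oriented matroids
in this way; those that do are called orientable. A subset $B \subseteq E$ is called a basis of
$\mathcal{O}$ if $\underline{B}$ is a basis of $\underline{\mathcal{O}}$. The rank of $\mathcal{O}$ equals the rank of $\underline{\mathcal{O}}$. Without significant loss
of generality we will make the tacit assumption in what follows that all oriented
matroids are simple, meaning that no circuit has fewer than three elements.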

The fundamental models for oriented matroids are sets of vectors in $\mathbb{R}^d$ and
the relation of positive linear dependence (or, more generally, positive linear
dependence of vectors over any ordered field). Suppose that $E$ is a finite subset of
$\mathbb{R}^d - \{0\}$ such that $E = -E$, and if $x \neq y$ in $E$ are parallel then $y = -x$. For $x \in E$
let $x^* = -x$. A subset $A \subseteq E$ is positive linearly dependent if $\sum_{x \in A} \lambda_x x = 0$ for some
real coefficients $\lambda_x \ge 0$, not all equal to zero. Let $\mathcal{C}$ be the family of all
inclusion-wise minimal positive linearly dependent subsets of $E$, except those of the form
$\{x, x^*\}$, $x \in E$. Equivalently, $\mathcal{C}$ consists of all subsets of $E$ which form the vertex
set of a simplex of dimension $\ge 2$ containing the origin in its relative interior.
Oriented matroids $(E, *, \mathcal{C})$ which arise in this way are called linear (or, realizable)
over $\mathbb{R}$. Not all oriented matroids are isomorphic to linear ones.
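
The construction of the circuits of a linear oriented matroid can be tried out on a small example. The following is only a sketch under stated assumptions: the function names `dependence` and `circuits` are hypothetical, the vectors are given with integer (or rational) coordinates so that exact arithmetic applies, and the test used is the observation that a subset is a minimal positive linearly dependent set exactly when its linear dependence is unique up to scaling and has all coefficients of the same strict sign.

```python
from fractions import Fraction
from itertools import combinations


def dependence(vectors):
    """Return the linear dependence among `vectors` if it is unique up to
    scaling (normalized so that one coefficient equals 1), else None."""
    k, d = len(vectors), len(vectors[0])
    # Rows are coordinates, columns are the given vectors.
    m = [[Fraction(v[i]) for v in vectors] for i in range(d)]
    pivots, row = [], 0
    for col in range(k):
        if row == d:
            break
        pr = next((r for r in range(row, d) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[row], m[pr] = m[pr], m[row]
        pivot = m[row][col]
        m[row] = [a / pivot for a in m[row]]
        for r in range(d):
            if r != row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[row])]
        pivots.append(col)
        row += 1
    free = [c for c in range(k) if c not in pivots]
    if len(free) != 1:
        return None
    lam = [Fraction(0)] * k
    lam[free[0]] = Fraction(1)
    for r, c in enumerate(pivots):
        lam[c] = -m[r][free[0]]
    return lam


def circuits(E):
    """Minimal positive linearly dependent subsets of E with at least
    three elements (the pairs {x, -x} are excluded by the size bound)."""
    found = set()
    for k in range(3, len(E) + 1):
        for A in combinations(E, k):
            lam = dependence(A)
            if lam is not None and (all(c > 0 for c in lam)
                                    or all(c < 0 for c in lam)):
                found.add(frozenset(A))
    return found


# Three pairwise non-parallel directions in the plane, with their negatives.
E = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)]
for C in circuits(E):
    print(sorted(C))
```

For this planar example the search returns one opposite pair of circuits, in line with (OM2).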


Topological Representation Theorem 


To pave the way for the Representation Theorem for oriented matroids it is best 
to look at the linear case for motivation. The Representation Theorem in fact says 
that intuition gained from the linear case is going to be essentially correct (mod- 
ulo some topological deformation which cannot be too bad) for general oriented 
matroids. 

Let $E$ be a finite subset of $\mathbb{R}^d - \{0\}$ such that $E = -E$, and let $\mathcal{O} = (E, *, \mathcal{C})$
be the linear oriented matroid as previously discussed. For each $e \in \underline{E} = \{\underline{x} =
\{x, x^*\} \mid x \in E\}$, let $H_e$ be the hyperplane orthogonal to the line spanned by $e$.
The arrangement of hyperplanes $\mathcal{H} = \{H_e \mid e \in \underline{E}\}$ contains all information about
$\mathcal{O}$, since one can go from $H_e$ back to a pair of opposite normal vectors, and the
definition of the sets which form circuits in $\mathcal{O}$ (i.e., the sets in $\mathcal{C}$) is independent
of the length of vectors. By intersecting with the unit sphere $S^{d-1}$ we can alternatively
look at the arrangement of spheres $\mathcal{S} = \{H_e \cap S^{d-1} \mid e \in \underline{E}\}$, which is merely
a collection of equatorial $(d - 2)$-spheres inside the $(d - 1)$-sphere. Clearly: linear
oriented matroids (up to reorientation), arrangements of hyperplanes and arrangements
of spheres are the same thing.

When thinking about a linear oriented matroid $(E, *, \mathcal{C})$ as an arrangement of
spheres it is useful to visualize elements $x \in E$ as closed hemispheres $H_x = \{y \in
S^{d-1} \mid \langle y, x \rangle \ge 0\}$. Then a subset $A \subseteq E$ belongs to $\mathcal{C}$ if and only if $A \cap A^* = \emptyset$ and
$A$ is minimal such that $\bigcup_{x \in A} H_x = S^{d-1}$.

We shall need the following terminology. A sphere $\Sigma$ is a topological space
for which there is a homeomorphism $f: S^j \to \Sigma$ with the standard $j$-sphere
$S^j = \{x \in \mathbb{R}^{j+1} \mid \|x\| = 1\}$, for some $j \ge 0$. A pseudosphere $S$ in $\Sigma$ is any image
$S = f(\{x \in S^j \mid x_{j+1} = 0\})$ under such a homeomorphism. [In the topological
literature pseudospheres are known as “tamely embedded (or, flat) codimension-one
subspheres”, cf. Rushing (1973).] The two sides (or, pseudohemispheres) of $S$ are
$S^+ = f(\{x \in S^j \mid x_{j+1} \ge 0\})$ and $S^- = f(\{x \in S^j \mid x_{j+1} \le 0\})$. Clearly, $S$ is the
intersection of its two sides, which are homeomorphic to balls.

The crucial definition is this: An arrangement of pseudospheres $(\underline{E}, \mathcal{A})$ in $S^{d-1}$ is
a finite collection $\mathcal{A} = \{S_e \mid e \in \underline{E}\}$ of distinct pseudospheres $S_e$ in $S^{d-1}$ such that

(AP1) Every nonempty intersection $S_A = \bigcap_{e \in A} S_e$, $A \subseteq \underline{E}$, is a sphere.

(AP2) For every nonempty intersection $S_A$ and all $e \in \underline{E}$, either $S_A \subseteq S_e$ or
$S_A \cap S_e$ is a pseudosphere in $S_A$ with sides $S_A \cap S_e^+$ and $S_A \cap S_e^-$.

This definition is due to Folkman and Lawrence (1978). They actually required
more, but the additional assumptions in their definition were proved to be redundant
by Mandel (1982).

In analogy with the linear case (arrangement of spheres), an arrangement of
pseudospheres $(\underline{E}, \mathcal{A})$ gives rise to a system $\mathcal{O}(\mathcal{A}) = (E, *, \mathcal{C})$ as follows: put
$E = \{S_e^+ \mid e \in \underline{E}\} \cup \{S_e^- \mid e \in \underline{E}\}$, let $(S_e^+)^* = S_e^-$ and vice versa, and define $\mathcal{C}$ to be
the collection of the minimal subsets $A \subseteq E$ such that $\bigcup A = S^{d-1}$ and $A \cap A^* = \emptyset$.
It turns out that $\mathcal{O}(\mathcal{A})$ is an oriented matroid (in spite of the topological
deformations). What is more surprising is that the construction leads to all oriented
matroids. We call an arrangement $\mathcal{A}$ essential if $\bigcap \mathcal{A} = \emptyset$.


Theorem 7.1 (Representation Theorem, Folkman and Lawrence 1978).

(i) If $\mathcal{A}$ is an arrangement of pseudospheres in $S^{d-1}$, then $\mathcal{O}(\mathcal{A})$ is an oriented
matroid. Furthermore, if $\mathcal{A}$ is essential then rank $\mathcal{O}(\mathcal{A}) = d$.

(ii) If $\mathcal{O}$ is an oriented matroid of rank $d$, then $\mathcal{O} \cong \mathcal{O}(\mathcal{A})$ for some essential
arrangement $\mathcal{A}$ of pseudospheres in $S^{d-1}$.

(iii) The mapping $\mathcal{A} \mapsto \mathcal{O}(\mathcal{A})$ induces a one-to-one correspondence between rank
$d$ oriented matroids and essential arrangements of pseudospheres in $S^{d-1}$, up to
natural equivalence relations.


The proof of this result is quite involved. For part (ii) a poset is first constructed
from the oriented matroid, and then it is shown using Theorem 12.6 that this poset
is the poset of faces of some regular cell complex. This complex provides the
(d - 1)-sphere, and various subcomplexes provide the (d - 2)-subspheres forming the
arrangement. The sphere is constructible (Edmonds and Mandel 1978, Mandel
1982), and even shellable (Lawrence 1984), which implies that the whole construction
of the complex and the relevant subcomplexes can be carried out in piecewise linear
topology. In particular, this means that no topological pathologies need to be dealt
with in representations of oriented matroids. Complete proofs of Theorem 7.1
can be found in Folkman and Lawrence (1978), Mandel (1982), and Björner, Las
Vergnas, Sturmfels, White and Ziegler (1993).

The Representation Theorem shows that oriented matroids of rank 3 correspond 
to arrangements of “pseudocircles” on the 2-sphere or, in the projective version, 
arrangements of pseudolines in the real projective plane. This representation can 
be used for quick proofs of some combinatorial properties as in the following 
application. 


Theorem 7.2. Let M be an orientable matroid of rank 3. Then: 
(i) M has a 2-point line, 
(ii) if the points of M are 2-colored there exists a monochromatic line. 


Here is how Theorem 7.2 follows from Theorem 7.1. Represent the points of M
as pseudocircles on the 2-sphere. Then lines are maximal collections of pseudocircles
with nonempty intersection (which is necessarily a 0-sphere, i.e., two points).
The arrangement of pseudocircles gives a graph G whose vertices are the points of
intersection and edges the segments of pseudocircles between such points. Since
this graph lies embedded in $S^2$ it is planar, and since rk(M) = 3 it is simple. We
need the following lemma.


Lemma 7.3. For any planarly embedded simple graph: 

(i) some vertex has degree at most five, 

(ii) if the edges are 2-colored then there exists a vertex around which the edges of 
each color class are consecutive in the cyclic ordering induced by the embedding. 


Part (i) is a well-known consequence of Euler’s formula (cf. chapter 5 by
Thomassen). Part (ii) is also a consequence of Euler’s formula, but not as well
known. It was used by Cauchy in the proof of his Rigidity Theorem for
3-dimensional convex polytopes.

To finish the proof of Theorem 7.2, look at the graph G determined by the 
arrangement of pseudocircles. If all lines in M have at least 3 points, then every 
vertex in G will have degree at least 6, in violation of (i). If the pseudocircles are
2-colored and through every intersection point there is at least one pseudocircle 
of each color, then the induced coloring of the edges of G will violate (ii). 
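
The degree count behind part (i) of Lemma 7.3, and the way it is contradicted above, can be written out as follows. This is the standard Euler-formula argument, recalled here as a sketch rather than quoted from the references; the symbols V and E (numbers of vertices and edges of a simple planar graph with V at least 3) and k (the number of pseudocircles through a given intersection point) are introduced only for this illustration.

```latex
% Simple planar graph with V >= 3 vertices and E edges: Euler's formula gives
E \le 3V - 6
\quad\Longrightarrow\quad
\sum_{v} \deg(v) = 2E \le 6V - 12 < 6V ,
% so the average degree is below 6 and some vertex has degree at most 5.
% In the graph G of the arrangement, a vertex lying on exactly k pseudocircles
% has degree 2k, so "every line has at least 3 points" would force
\deg(v) = 2k \ge 6 \quad \text{for every vertex } v ,
% which contradicts the bound above.
```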

The proof of the first part of Theorem 7.2, a generalization of the Sylvester-Gallai
Theorem (see chapter 17 by Erdős and Purdy), has been known since the
1940s in the linear case. The following strengthening by Csima and Sawyer (1993)
also uses pseudoline representation: The number of 2-point lines in M is at least
$\frac{6}{13}$(card M). The proof of the second part, due to G.D. Chakerian in the linear
case, was rediscovered by Edmonds, Lovász and Mandel (1980), who also observed
the generalization to oriented matroids.


Basis signatures 


Just like ordinary matroids, oriented matroids can be characterized in several ways. 
We shall discuss a characteristic property of the set of bases B of an oriented ma- 
troid, namely that a determinant can be defined up to sign (but not magnitude). 
This was first shown by Las Vergnas (1978). Characterizations of oriented ma- 
troids in terms of signed bases were also discovered by J. Bokowski, A. Dress, L. 
Gutierrez-Novoa and J. Lawrence. 

Let us review some essential features of the function $\delta: \mathcal{B} \to \{+1, -1\}$, taking
ordered bases of a linear oriented matroid $(E, *, \mathcal{C})$, $E \subseteq \mathbb{R}^d$, to the sign of their
determinants. A function $\eta$ can be defined for certain pairs of ordered bases $\beta$
and $\beta'$ in $\mathbb{R}^d$ as follows:


7.4. Suppose $\beta$ and $\beta'$ are permutations of the same basis $B$. Let $\eta(\beta, \beta') = +1$ if
they are of the same parity and $= -1$ otherwise.


7.5. Suppose $\beta = x_1 x_2 \cdots x_{r-1} y$ and $\beta' = x_1 x_2 \cdots x_{r-1} z$ with $y \neq z$. Let $\eta(\beta, \beta') = +1$
if $y$ and $z$ are on the same side of the hyperplane spanned by $\{x_1, \ldots, x_{r-1}\}$, and
$= -1$ otherwise.


Now, once we choose an ordered basis $\beta_0$ and put $\det(\beta_0) := +1$, the function
$\det(\beta)$ and its sign $\delta(\beta)$ is determined for all ordered bases $\beta$ by the usual rules of
linear algebra. But the function $\delta(\beta)$ is also combinatorially determined, because
any pair of ordered bases can be connected by a chain of steps of type (7.4)
or (7.5) and we have: If $\beta$ and $\beta'$ are ordered bases as in (7.4) or (7.5) then
$\delta(\beta) = \eta(\beta, \beta') \cdot \delta(\beta')$.
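
The relation between the determinant sign and the two kinds of steps can be checked by brute force on a small vector configuration in three dimensions. The sketch below is illustrative only: the helper names (`det3`, `delta`, `eta_perm`, `eta_side`, `relations_hold`) and the sample set of vectors are assumptions made for this example, with the side of a hyperplane read off from the cross product of the two vectors spanning it.

```python
from itertools import permutations


def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def sign(t):
    return (t > 0) - (t < 0)


def delta(beta):
    """Sign of the determinant of an ordered basis."""
    return sign(det3(*beta))


def eta_perm(beta, beta2):
    """Step of type (7.4): +1 if beta2 is an even rearrangement of beta."""
    perm = [beta.index(v) for v in beta2]
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1


def eta_side(beta, beta2):
    """Step of type (7.5): +1 if the last vectors of beta and beta2 lie on
    the same side of the plane spanned by the two shared vectors."""
    a, b = beta[0], beta[1]
    normal = (a[1] * b[2] - a[2] * b[1],
              a[2] * b[0] - a[0] * b[2],
              a[0] * b[1] - a[1] * b[0])
    side = lambda v: sign(sum(n * x for n, x in zip(normal, v)))
    return 1 if side(beta[2]) == side(beta2[2]) else -1


def relations_hold(E):
    """Check delta(beta) = eta(beta, beta') * delta(beta') for all steps."""
    bases = [b for b in permutations(E, 3) if det3(*b) != 0]
    for beta in bases:
        for beta2 in permutations(beta):
            if delta(beta) != eta_perm(beta, beta2) * delta(beta2):
                return False
        for z in E:
            beta2 = (beta[0], beta[1], z)
            if z != beta[2] and det3(*beta2) != 0:
                if delta(beta) != eta_side(beta, beta2) * delta(beta2):
                    return False
    return True


half = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
E = half + [tuple(-x for x in v) for v in half]
print(relations_hold(E))
```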

The preceding discussion points the way to generalizing the determinantal
sign function to all oriented matroids. First, to cast (7.5) in a form which is more
compatible with the axiom system (OM1)-(OM3), we replace it by the following
reformulation:


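7.5′. Suppose $\beta = x_1 x_2 \cdots x_{r-1} y$ and $\beta' = x_1 x_2 \cdots x_{r-1} z$ with $y \neq z$, and if $y \neq z^*$
let $\{C, C^*\}$ be the unique pair of circuits such that in the underlying matroid
$\{\underline{y}, \underline{z}\} \subseteq \underline{C} \subseteq \{\underline{x}_1, \ldots, \underline{x}_{r-1}, \underline{y}, \underline{z}\}$. Put $\eta(\beta, \beta') = +1$ if one of $y$ and $z$ lies in $C$ and the
other in $C^*$, and put $\eta(\beta, \beta') = -1$ otherwise.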


Theorem 7.6 (Las Vergnas 1978). Let $\mathcal{B}$ be the set of ordered bases of an oriented
matroid, and let $\beta_0 \in \mathcal{B}$. There exists a unique function $\delta: \mathcal{B} \to \{+1, -1\}$ such that
$\delta(\beta_0) = +1$ and if $\beta, \beta' \in \mathcal{B}$ are related as in (7.4) or (7.5′) then
$\delta(\beta) = \eta(\beta, \beta') \cdot \delta(\beta')$.


The proof runs as follows. Define a graph on the vertex set $\mathcal{B}$ by connecting
pairs $\{\beta, \beta'\}$ which are related as in (7.4) or (7.5′) by an edge. The graph is
clearly connected, and there is a projection $\pi$ from it to the basis graph of the
underlying matroid. Now, put $\delta(\beta_0) := +1$, and for $\beta \in \mathcal{B}$ define

$$\delta(\beta) := \prod_{i=1}^{n} \eta(\beta_{i-1}, \beta_i)$$

for some choice of path $\beta_0, \beta_1, \ldots, \beta_n = \beta$ in $\mathcal{B}$. The proof is complete once we
show that this definition is independent of the choice of path from $\beta_0$ to $\beta$. If
$P_1$ and $P_2$ are two such paths then by Theorem 6.7 their projections $\pi(P_1)$ and
$\pi(P_2)$ in the basis graph differ by a sequence of elementary homotopies. Thus the
checking is reduced to verifying

$$\prod_{i=1}^{k} \eta(\alpha_{i-1}, \alpha_i) = 1$$

for closed paths $\alpha_0, \alpha_1, \ldots, \alpha_k = \alpha_0$ in $\mathcal{B}$ whose projection in the basis graph is an
edge BCB, triangle BCDB or square BCDEB. However, the basis configurations which give
triangles or squares in the basis graph are explicitly characterized in (6.3)-(6.5),
and this way the checking is brought down to a manageable size. See Las Vergnas
(1978) for further details.


8. Discrete applications of the Hard Lefschetz Theorem 


One of the most esoteric results to have found applications in combinatorics is the 
Hard Lefschetz Theorem. It was used by R. Stanley to prove the Erdős-Moser
conjecture (chapter 32 by Alon) and to show necessity in the characterization of 
f-vectors of simplicial convex polytopes (chapter 18 by Klee and Kleinschmidt). 

In this section we will state the Hard Lefschetz Theorem and briefly explain 
how it is used for these applications. The presentation follows Stanley (1980a,b, 
1983b, 1985, 1989). Other applications appear in Stanley (1987a,b). 

Unfortunately, concepts must be used here which go beyond what is reviewed 
and explained in part II of this chapter. In particular we must assume some fa- 
miliarity with the singular cohomology ring of a topological space, and with a few 
basic notions of algebraic geometry (projective varieties, smoothness, etc.). See 
Hartshorne (1977) for this. 

Let $X$ be a smooth irreducible complex projective variety of complex dimension
$d$, and let $H^*(X) = H^0(X) \oplus H^1(X) \oplus \cdots \oplus H^{2d}(X)$ denote its singular cohomology
ring with real coefficients. Recall that if $\omega \in H^i(X)$ and $\tau \in H^j(X)$ then
$\omega\tau \in H^{i+j}(X)$. Since $X$ is projective, we may intersect it with a generic hyperplane $H$
of an ambient projective space. By a standard construction in algebraic geometry
the subvariety $X \cap H$ represents a cohomology class $\omega \in H^2(X)$.


Theorem 8.1 (The Hard Lefschetz Theorem). Let $X$ and $\omega \in H^2(X)$ be as above,
and let $0 \le i \le d$. Then the linear map $H^i(X) \to H^{2d-i}(X)$ given by multiplication
by $\omega^{d-i}$ is an isomorphism of vector spaces.

See Stanley (1983b) for references to various proofs of this theorem (the first
rigorous one is due to W. Hodge). Note that the fact that $H^i(X)$ and $H^{2d-i}(X)$
are isomorphic is known already from Poincaré duality. Thus the point of the
theorem is entirely the existence of a special cohomology class w with such fa- 
vorable multiplicative properties. Whereas Poincaré duality is a purely topological 
phenomenon (valid for all compact orientable manifolds, and in various versions 
also more generally), the Hard Lefschetz Theorem uses smoothness in an essential 
way. There is not (as far as is known) any intrinsically topological construction of a 
good cohomology class w that would make Theorem 8.1 valid for some reasonable 
class of topological manifolds. Nevertheless, the Hard Lefschetz Theorem has been 
extended to some more general classes of varieties, e.g., to Kähler manifolds in
differential topology and to V-varieties (nonsmooth varieties with finite quotient
singularities, e.g., the toric varieties of simplicial polytopes discussed below).
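
The simplest instance of Theorem 8.1 may help to fix ideas. The following worked example is added for illustration and is not taken from the sources cited in this section; it uses the standard description of the real cohomology ring of complex projective space, with $\omega$ the class of a hyperplane.

```latex
% X = complex projective d-space, omega = class of a hyperplane, deg(omega) = 2:
H^*(\mathbb{CP}^d; \mathbb{R}) \cong \mathbb{R}[\omega]/(\omega^{d+1}),
\qquad
H^{2j}(\mathbb{CP}^d) = \mathbb{R}\,\omega^{j} \quad (0 \le j \le d),
\qquad
H^{\mathrm{odd}}(\mathbb{CP}^d) = 0 .
% For even i = 2j <= d, multiplication by omega^{d-i} sends the generator
% of H^{2j} to the generator of H^{2d-2j}:
\omega^{d-2j} \cdot \omega^{j} = \omega^{d-j} \neq 0 ,
% so H^i -> H^{2d-i} is an isomorphism; for odd i both spaces vanish.
```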

Stanley’s (1980a) proof of the Erdős-Moser conjecture is outlined in section 9 of
chapter 32 by Alon. Referring to the discussion there, and using the same notation, 
we will now indicate how Theorem 8.1 is used. 

For a certain poset $M(n)$ of rank $N = \binom{n+1}{2}$ and with rank-level sets $M(n)_i$, $i =
0, 1, \ldots, N$, let $V_i$ be the real vector space with basis $M(n)_i$. For the proof
one needs to construct linear mappings $\varphi_i: V_i \to V_{i+1}$ such that the composition
$\varphi_{N-i-1} \circ \varphi_{N-i-2} \circ \cdots \circ \varphi_i: V_i \to V_{N-i}$ is invertible, for $0 \le i < [N/2]$, and if
$x \in M(n)_i$ and $\varphi_i(x) = \sum_{y \in M(n)_{i+1}} c_y y$ then $c_y \neq 0$ implies $y > x$.

Take the special orthogonal group $G = SO_{2n+1}(\mathbb{C})$ and let $P$ be the maximal
parabolic subgroup corresponding to the simply-laced part of its Dynkin diagram.
Then $G/P$ is a smooth irreducible complex projective variety having a cell decomposition
(in a certain algebraic-geometric sense) such that the poset of closed cells
is isomorphic to $M(n)$. This cell decomposition of $G/P$ (induced by the Bruhat
decomposition of $G$) has cells only in even dimensions, and we may identify $M(n)_i$
with the set of $2i$-dimensional cells and conclude that $V_i \cong H^{2i}(G/P)$. The relevance
of Theorem 8.1 is now becoming clear; indeed, letting the linear mapping
$\varphi_i: V_i \to V_{i+1}$ be multiplication with $\omega$, all required properties turn out to hold.

The poset $M(n)$ is a member of a class of finite rank-symmetric posets arising as
Bruhat order on Weyl groups and on their quotients modulo parabolic subgroups. 
Using Theorem 8.1, Stanley (1980a) showed that all such posets are rank-unimodal 
and satisfy a strong form of the Sperner property. 
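
The rank-symmetry and rank-unimodality of $M(n)$ can be observed numerically for small n. The sketch below rests on an assumption not spelled out in this section: following the description in Stanley (1980a), the elements of $M(n)$ are taken to be the subsets of {1, ..., n}, with the rank of a subset equal to the sum of its elements, so that the rank-level sizes count subsets by element sum. The function names are hypothetical.

```python
def rank_sizes(n):
    """Number of subsets of {1, ..., n} with element sum i, for i = 0..N,
    where N = n(n+1)/2 (assumed rank-level sizes of M(n))."""
    N = n * (n + 1) // 2
    counts = [0] * (N + 1)
    counts[0] = 1  # the empty subset
    for part in range(1, n + 1):
        # Go downwards so that each element is used at most once.
        for s in range(N, part - 1, -1):
            counts[s] += counts[s - part]
    return counts


def is_symmetric(seq):
    return seq == seq[::-1]


def is_unimodal(seq):
    peak = seq.index(max(seq))
    rising = all(seq[i] <= seq[i + 1] for i in range(peak))
    falling = all(seq[i] >= seq[i + 1] for i in range(peak, len(seq) - 1))
    return rising and falling


for n in range(1, 9):
    sizes = rank_sizes(n)
    print(n, is_symmetric(sizes), is_unimodal(sizes))
```

Such a check only inspects the rank sizes; the Sperner-type statements require the linear maps described above.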

Many of the results of Stanley (1980a), including the proof of the Erdős-Moser
conjecture, can be proven with just linear algebra, see Proctor (1982). This is done, 
essentially, by rewriting the first proof (including a proof of the Hard Lefschetz 
Theorem) as concretely as possible and throwing out all mention of algebraic 
geometry. 

We now turn to the characterization of f-vectors of simplicial polytopes. This
application of Theorem 8.1 uses more of its content. The fact that the linear mappings
$\varphi_i$ constructed above are given by multiplication is irrelevant for the previous
argument, whereas the global multiplicative structure of $H^*(X)$ is essential in what
follows.

We refer to chapter 18 by Klee and Kleinschmidt for definitions relating to
simplicial d-polytopes $P$ and their h-vectors $h(P) = (h_0, h_1, \ldots, h_d)$. As observed
there, every simplicial polytope in $\mathbb{R}^d$ is combinatorially equivalent to one with
vertices in $\mathbb{Q}^d$.

Let $P$ be a d-dimensional convex polytope with vertices in $\mathbb{Q}^d$. There is a general
construction (see Ewald 1995, Fulton 1993 or Oda 1988) which associates with P 
an irreducible complex projective variety X(P) of complex dimension d, called a 
toric variety. This variety is in general not smooth, not even in the simplicial case. 

Suppose now that P is simplicial. Then the following is true [work of V.I. Danilov, 
J. Jurkiewicz, M. Saito and others; see the cited books or Stanley (1983b, 1985, 
1987a)]: 

(i) the cohomology of $X(P)$ vanishes in all odd dimensions, and $\dim_{\mathbb{R}} H^{2i}(X(P)) =
h_i(P)$, for $i = 0, 1, \ldots, d$,

(ii) $H^*(X(P))$ is generated (as an algebra over $\mathbb{R}$) by $H^2(X(P))$,

(iii) the Hard Lefschetz Theorem 8.1 holds for $X = X(P)$ and the class of a
hyperplane section $\omega \in H^2(X)$.

It follows from (iii) that the mapping $H^{2i}(X) \to H^{2i+2}(X)$ given by multiplication
with $\omega$ is injective if $i < d/2$ and surjective if $i \ge [d/2]$. Therefore, taking the
quotient of the cohomology ring

$$H^*(X) = \bigoplus_{i=0}^{d} H^{2i}(X)$$

by the ideal generated by $\omega$, we get a graded ring

$$R = H^*(X)/(\omega) = \bigoplus_{i=0}^{[d/2]} R_i,$$

where $R_i = H^{2i}(X)/\omega H^{2i-2}(X)$, for $i \ge 1$, and $R_0 = H^0(X) \cong \mathbb{R}$. Furthermore, $R$ is
generated by $R_1$ [by (ii)], and $\dim_{\mathbb{R}} R_i = h_i - h_{i-1}$ [by (i) and (iii)]. This shows that
$(h_0, h_1 - h_0, h_2 - h_1, \ldots, h_{[d/2]} - h_{[d/2]-1})$ is an “O-sequence”, as defined in Theorem
6.2 of chapter 18 by Klee and Kleinschmidt. As explained after Theorem 6.5 of
that chapter, this is precisely what needs to be shown to complete the proof of
necessity of the characterization of f-vectors of simplicial polytopes.
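
The passage from an f-vector to the sequence of differences can be made concrete on small polytopes. The sketch below makes two assumptions that are not stated in this section: the h-vector is computed from the f-vector by the usual alternating formula, and "O-sequence" is tested through Macaulay's numerical condition in terms of binomial expansions (the definition itself is in chapter 18 and is only recalled here from memory). All function names are hypothetical.

```python
from math import comb


def h_vector(f, d):
    """h-vector of a simplicial d-polytope with f = (f_0, ..., f_{d-1})."""
    f_ext = [1] + list(f)  # f_ext[i] is f_{i-1}, with f_{-1} = 1
    return [sum((-1) ** (k - i) * comb(d - i, k - i) * f_ext[i]
                for i in range(k + 1)) for k in range(d + 1)]


def g_vector(h):
    """(h_0, h_1 - h_0, ..., h_[d/2] - h_[d/2]-1)."""
    d = len(h) - 1
    return [h[0]] + [h[i] - h[i - 1] for i in range(1, d // 2 + 1)]


def macaulay_bound(a, k):
    """Upper bound on the next term after a in degree k (Macaulay)."""
    total, j = 0, k
    while a > 0 and j >= 1:
        m = j
        while comb(m + 1, j) <= a:
            m += 1  # largest m with comb(m, j) <= a
        total += comb(m + 1, j + 1)
        a -= comb(m, j)
        j -= 1
    return total


def is_o_sequence(g):
    if not g or g[0] != 1 or any(x < 0 for x in g):
        return False
    return all(g[k + 1] <= macaulay_bound(g[k], k)
               for k in range(1, len(g) - 1))


# Octahedron (d = 3) and the cyclic 4-polytope with 6 vertices.
for f, d in [((6, 12, 8), 3), ((6, 15, 18, 9), 4)]:
    h = h_vector(f, d)
    print(h, g_vector(h), is_o_sequence(g_vector(h)))
```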

A more elementary (and self-contained) proof of necessity has recently been 
found by McMullen (1993). He replaces the cohomology ring of the toric variety 
by a certain subalgebra of the polytope algebra and proves the needed analog of 
the Hard Lefschetz Theorem using convex geometry. 

In Stanley (1987a) sharp lower bounds are given for the differences $h_i - h_{i-1}$, $1 \le
i \le [d/2]$, for a centrally symmetric simplicial d-polytope. The proof involves the
interaction between the Hard Lefschetz Theorem and a finite group action. 

The toric variety $X = X(P)$ of a non-simplicial polytope $P$ with rational vertices
is unfortunately more difficult to use for combinatorial purposes. For instance,
$\dim_{\mathbb{R}} H^i(X)$ may depend on the embedding of $P$ and not only on its combinatorial
type, and cohomology may fail to vanish in odd dimensions. However, the intersection
cohomology (of middle perversity) $IH^*(X)$, defined by M. Goresky and R.
MacPherson, turns out to be combinatorial and to satisfy a module version of the
Hard Lefschetz Theorem. This leads to some interesting information for general
rational polytopes, such as Theorem 6.8 of chapter 18 by Klee and Kleinschmidt.
See Stanley (1987b) for more information.


PART II. TOOLS


The rest of this chapter is devoted to a review of some definitions and results 
from combinatorial topology that have proven to be particularly useful in combi- 
natorics. The material in sections 9 (simplicial complexes), 12 (cell complexes) and 
13 (fixed-point and antipodality theorems) is of a very general nature and detailed 
treatments can be found in many topology books. Specific references will therefore 
be given only sporadically. Most topics in sections 10 and 11, on the other hand, are 
of a more special nature, and more substantial references (and even some proofs) 
will be given. 

Many of the results mentioned have been discussed in a large number of papers 
and books. When relevant, our policy has been to reference the original source 
(when known to us) and some more recent papers that contribute simple proofs, 
extensions or up-to-date discussion (a subjective choice). We apologize for any 
inaccuracy or omission that may unintentionally have occurred.


9. Combinatorial topology 


This section will review basic facts concerning simplicial complexes. Good general 
references are Munkres (1984a) and Spanier (1966). Basic notions such as (topo- 
logical) space, continuous map and homeomorphism will be considered known. 


Throughout this chapter, every map between topological spaces is assumed to be 
continuous, even if not explicitly stated. 


Simplicial complexes and posets 


9.1. An (abstract) simplicial complex $\Delta = (V, \Delta)$ is a set $V$ (the vertex set) together
with a family $\Delta$ of nonempty finite subsets of $V$ (called simplices or faces) such
that $\emptyset \neq \sigma \subseteq \tau \in \Delta$ implies $\sigma \in \Delta$. Usually, $V = \bigcup \Delta$ (shorthand for $V = \bigcup_{\sigma \in \Delta} \sigma$)
so $V$ can be suppressed from the notation.


The dimension of a face $\sigma$ is $\dim \sigma = \mathrm{card}\, \sigma - 1$, the dimension of $\Delta$ is $\dim \Delta =
\max_{\sigma \in \Delta} \dim \sigma$. A d-dimensional complex is pure if every face is contained in a
d-face (i.e., d-dimensional face). The complex consisting of all nonempty subsets
of a (d + 1)-element set is called the d-simplex.

Note that our definition allows the empty complex $\Delta = \emptyset$. It is, by convention,
(-1)-dimensional. (Remark: The definition of a simplicial complex (with nonempty
faces) that we use here is the standard one in topology. In combinatorics it is usually
more convenient to allow the empty set as a face of a complex; in particular, this
is consistent with the definition of reduced homology.)

Let $\Delta^k = \{k\text{-faces of } \Delta\}$ and $\Delta^{\le k} = \bigcup_{j \le k} \Delta^j$, for $k \ge 0$. The elements of $\Delta^0 = V$
and $\Delta^1$ are called vertices and edges, respectively. If $\Delta$ is pure d-dimensional the
elements of $\Delta^d$ are called facets (or chambers). $\Delta^{\le k}$ is the k-skeleton of $\Delta$. It is a
subcomplex of $\Delta$.
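
For experimentation, a finite abstract simplicial complex can be stored simply as a set of nonempty frozensets. The sketch below illustrates the notions of dimension, purity and skeleton just defined; the representation and the function names (`complex_from_facets`, `dim`, `is_pure`, `skeleton`) are choices made for this example only.

```python
from itertools import combinations


def complex_from_facets(maximal_faces):
    """All nonempty subsets of the given maximal faces, as frozensets."""
    faces = set()
    for F in maximal_faces:
        F = tuple(F)
        for k in range(1, len(F) + 1):
            faces.update(frozenset(c) for c in combinations(F, k))
    return faces


def dim(faces):
    """Dimension of the complex; the empty complex has dimension -1."""
    return max((len(s) - 1 for s in faces), default=-1)


def is_pure(faces):
    """True if every face lies in a face of top dimension."""
    d = dim(faces)
    top = [s for s in faces if len(s) == d + 1]
    return all(any(s <= t for t in top) for s in faces)


def skeleton(faces, k):
    """The k-skeleton: all faces of dimension at most k."""
    return {s for s in faces if len(s) - 1 <= k}


# Boundary of the tetrahedron: all proper nonempty subsets of {0, 1, 2, 3}.
sphere = complex_from_facets(combinations(range(4), 3))
print(len(sphere), dim(sphere), is_pure(sphere), len(skeleton(sphere, 1)))

# A triangle with an extra edge attached is not pure.
mixed = complex_from_facets([(0, 1, 2), (2, 3)])
print(dim(mixed), is_pure(mixed))
```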

A (geometric) simplicial complex is a polyhedral complex in $\mathbb{R}^d$ [in the sense
of (12.1)] whose cells are geometric simplices (the convex hull of affinely independent
point-sets). If $\Gamma$ is a geometric simplicial complex then the family of
extreme-point-sets of cells in $\Gamma$ forms an abstract simplicial complex $\Delta(\Gamma)$ which
is finite. Conversely, if $\Delta \neq \emptyset$ is a d-dimensional finite abstract simplicial complex
then there exist geometric simplicial complexes $\Gamma$ in $\mathbb{R}^{2d+1}$ such that $\Delta(\Gamma) \cong \Delta$. The
underlying space $\bigcup \Gamma$ of any such $\Gamma$, unique up to linear homeomorphism, is called
the geometric realization (or space) of $\Delta$, denoted by $\|\Delta\|$. Conversely, $\Delta$ is called a
triangulation of the space $\|\Delta\|$, and of every space homeomorphic to it. Thus, abstract
and geometric simplicial complexes are equivalent notions in the finite case
(and more generally, when finite-dimensional, denumerable and locally finite). The
geometric realization $\|\Delta\|$ of arbitrary infinite abstract simplicial complexes $\Delta$ can
be constructed as in Spanier (1966).

A simplicial map $f: \Delta_1 \to \Delta_2$ is a mapping $f: \Delta_1^0 \to \Delta_2^0$ such that $f(\sigma) \in \Delta_2$
for all $\sigma \in \Delta_1$. By affine extension across simplices it induces a continuous map
$\|f\|: \|\Delta_1\| \to \|\Delta_2\|$.

Whereas the rectilinear realization of all d-dimensional simplicial complexes
in $\mathbb{R}^{2d+1}$ is easy to prove (and $2d+1$ is best possible), the existence in special
cases of rectilinear and of topological realizations in spaces $\mathbb{R}^j$, for $d \le j \le 2d$, poses
difficult and much studied problems. For $d = 1$ this is the question of planarity
of graphs (see chapter 5 by Thomassen); for rectilinear embeddings when $d \ge 2$,
see, e.g., Bokowski and Sturmfels (1989) and the references found therein, and for
topological embeddings see Rushing (1973). It is for instance not known whether
every triangulation of the 2-dimensional torus has a rectilinear embedding into $\mathbb{R}^3$.
A classical result concerning topological embeddings is the van Kampen-Flores
Theorem (from 1932-33), which says that the d-skeleton of a (2d + 2)-simplex
does not embed into $\mathbb{R}^{2d}$. Sarkaria (1991b) gives an up-to-date discussion of this
result in a setting which also includes the topological Radon-Tverberg theorems
discussed in section 5, see also Sarkaria (1991a).


9.2. Let $P = (P, \le)$ be a poset (partially ordered set). A totally ordered subset
$x_0 < x_1 < \cdots < x_k$ is called a chain of length $k$. The supremum of this number over
all chains in $P$ is the rank (or length) of $P$. If all maximal chains have the same
finite length then $P$ is pure. $P$ is a lattice if every pair of elements $x, y \in P$ has a
least upper bound (join) $x \vee y$ and a greatest lower bound (meet) $x \wedge y$.


For $x \in P$, let $P_{>x}, P_{\ge x}, P_{<x}, P_{\le x}$ be defined by $P_{>x} = \{y \in P : y > x\}$, etc. For
$x \le y$ define the open interval $(x, y) = P_{>x} \cap P_{<y}$ and the closed interval $[x, y] =
P_{\ge x} \cap P_{\le y}$. A bottom element $\hat{0}$ and a top element $\hat{1}$ in $P$ are elements satisfying
$\hat{0} \le x$ (respectively $x \le \hat{1}$) for all $x \in P$. If both $\hat{0}$ and $\hat{1}$ exist, $P$ is bounded. Then
$\bar{P} = P - \{\hat{0}, \hat{1}\}$ denotes the proper part of $P$. For an arbitrary poset $P$, $\hat{P} = P \cup \{\hat{0}, \hat{1}\}$
denotes $P$ extended by new top and bottom elements (so, card$(\hat{P} \setminus P) = 2$).

Let $P$ be a pure poset of rank $r$. For $x \in P$, let $r(x) = \mathrm{rank}(P_{\le x})$. The rank
function $r: P \to \{0, 1, \ldots, r\}$ is bijective on each maximal chain. It decomposes $P$
into rank levels $P^i = \{x \in P : r(x) = i\}$, $0 \le i \le r$.


9.3. The face poset $P(\Delta) = (\Delta, \subseteq)$ of a simplicial complex $\Delta$ is the set of faces
ordered by inclusion. The face lattice of $\Delta$ is $\hat{P}(\Delta) = P(\Delta) \cup \{\hat{0}, \hat{1}\}$. It is a lattice.
$P(\Delta)$ is pure iff $\Delta$ is pure, and rank $P(\Delta) = \dim \Delta$.


The order complex $\Delta(P)$ of a poset $P$ is the simplicial complex on vertex set $P$
whose k-faces are the k-chains $x_0 < x_1 < \cdots < x_k$ in $P$. A poset map $f: P_1 \to P_2$
which is order-preserving [$x \le y$ implies $f(x) \le f(y)$] or order-reversing [$x \le y$
implies $f(x) \ge f(y)$] is simplicial $f: \Delta(P_1) \to \Delta(P_2)$, and therefore induces a continuous
map $\|f\|: \|\Delta(P_1)\| \to \|\Delta(P_2)\|$. The definition of $\Delta(P)$ goes back to Aleksandrov
(1937).

For a simplicial complex $\Delta$, $\mathrm{sd}\,\Delta = \Delta(P(\Delta))$ is called the (first) barycentric
subdivision (due to its geometric version). A basic fact is that $\Delta$ and $\mathrm{sd}\,\Delta$ are
homeomorphic. Therefore, passage between simplicial complexes and posets via the mappings
$P(\cdot)$ and $\Delta(\cdot)$ does not affect the topology, and from a topological point of view
simplicial complexes and posets can be considered to be essentially equivalent
notions.
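
The two passages, from a poset to its order complex and from a complex to its barycentric subdivision, are easy to carry out on small examples. The sketch below is a brute-force illustration with hypothetical function names; a poset is passed as a collection of elements together with a strict order relation, and faces are frozensets. Equality of Euler characteristics is used only as a cheap consistency check of the homeomorphism between a complex and its subdivision.

```python
from itertools import combinations


def order_complex(elements, less_than):
    """All nonempty chains of the poset, as frozensets of elements."""
    elements = list(elements)
    chains = set()
    for k in range(1, len(elements) + 1):
        for subset in combinations(elements, k):
            # A subset is a chain iff its elements are pairwise comparable.
            if all(less_than(a, b) or less_than(b, a)
                   for a, b in combinations(subset, 2)):
                chains.add(frozenset(subset))
    return chains


def barycentric_subdivision(faces):
    """Order complex of the face poset (faces ordered by proper inclusion)."""
    return order_complex(faces, lambda s, t: s < t)


def euler_characteristic(faces):
    return sum((-1) ** (len(s) - 1) for s in faces)


# The full triangle and its boundary.
triangle = {frozenset(c) for k in (1, 2, 3) for c in combinations(range(3), k)}
boundary = {s for s in triangle if len(s) < 3}
for complex_ in (triangle, boundary):
    sd = barycentric_subdivision(complex_)
    print(len(complex_), len(sd),
          euler_characteristic(complex_), euler_characteristic(sd))
```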

The geometric realization $\|P\| = \|\Delta(P)\|$ associates a topological space with every
poset $P$. In this chapter, whenever we make topological statements about a
poset $P$ we have the space $\|P\|$ in mind.

There exists at least one other way of associating a useful topology with a poset
$P$ (also due to Aleksandrov 1937), namely, let the order-ideals (subsets $A \subseteq P$
satisfying $x < y \in A$ implies $x \in A$) be the open sets of a topology on $P$. Denote
this space $T(P)$. For instance, for the poset depicted to the right in fig. 2 (section
12), $T(P)$ is a space with exactly ten open sets, whereas $\Delta(P)$ is homeomorphic
to the 2-sphere. For the ideal topology $T(\cdot)$ the continuous maps are precisely
the order-preserving maps and homotopy [see (9.10)] has a direct combinatorial
meaning. For instance, $T(P)$ is contractible iff $P$ is dismantlable in the sense of
(11.1); see Stong (1966). The ideal topology $T(P)$ is relevant for sheaf cohomology
over posets (Baclawski 1975, Yuzvinsky 1987) and has surprising connections with
the order complex topology $\Delta(P)$ (McCord 1966).


9.4. Let $T$ be a topological space, $\sim$ an equivalence relation on $T$, and $\pi: T \to
T/\!\sim$ the projection map. The quotient $T/\!\sim$ is made into a topological space by
letting $A \subseteq T/\!\sim$ be open iff $\pi^{-1}(A)$ is open in $T$. If $S_i$, $i \in I$, are pairwise disjoint
subsets of $T$, then $T/(S_i)_{i \in I}$ denotes the quotient space obtained by identifying the
points within each set $S_i$, $i \in I$. For example, $\mathrm{cone}(T) = T \times [0, 1]/(T \times \{1\})$ is the
cone over $T$, and $\mathrm{susp}(T) = T \times [0, 1]/(T \times \{0\}, T \times \{1\})$ is the suspension of $T$.
The d-ball modulo its boundary is homeomorphic to the d-sphere: $B^d/S^{d-1} \cong S^d$.

If $(T_i, x_i)_{i \in I}$, $x_i \in T_i$, is a family of pointed pairwise-disjoint spaces, then the
wedge of this family is $\bigcup_{i \in I} T_i / (\bigcup_{i \in I} \{x_i\})$. The join of two spaces $T_1$ and $T_2$ is the
space $T_1 * T_2 = T_1 \times T_2 \times [0, 1] / (\{(t, x, 0) \mid x \in T_2\}, \{(y, u, 1) \mid y \in T_1\})_{t \in T_1, u \in T_2}$.

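The join of two simplicial complexes $\Delta_1$ and $\Delta_2$ (with $\Delta_1^0 \cap \Delta_2^0 = \emptyset$) is the complex
$\Delta_1 * \Delta_2 = \Delta_1 \cup \Delta_2 \cup \{\sigma \cup \tau \mid \sigma \in \Delta_1 \text{ and } \tau \in \Delta_2\}$. Further, the cone over $\Delta$ and
suspension of $\Delta$ are the complexes $\mathrm{cone}(\Delta) = \Delta * \Gamma_1$, $\mathrm{susp}(\Delta) = \Delta * \Gamma_2$, where $\Gamma_i$ is
the 0-dimensional complex with $i$ vertices, $i = 1, 2$. There is a homeomorphism


$\|\Delta_1 * \Delta_2\| \cong \|\Delta_1\| * \|\Delta_2\|$. (9.5)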


[In case $\Delta_1$ and $\Delta_2$ are not locally finite the topology of the right-hand side may
need to be modified to the associated compactly generated topology, see Walker
(1988).] In particular, $\|\mathrm{cone}(\Delta)\| \cong \mathrm{cone}(\|\Delta\|)$ and $\|\mathrm{susp}(\Delta)\| \cong \mathrm{susp}(\|\Delta\|)$.

The join of two complexes $\Delta_1$ and $\Delta_2$ has the following geometric realization.
First realize $\Delta_1$ and $\Delta_2$ in the same space $\mathbb{R}^d$, with $d$ sufficiently large, so that two
distinct line segments $[x_1, x_2]$ and $[y_1, y_2]$ with $x_1, y_1 \in \|\Delta_1\|$ and $x_2, y_2 \in \|\Delta_2\|$ never
intersect in an interior point. Then take the union of all such line segments (with
the topology induced as a subspace of $\mathbb{R}^d$); this gives $\|\Delta_1 * \Delta_2\|$.

The p-fold deleted join $\Delta^{(p)}$ of a simplicial complex $\Delta$ is defined as follows. Let
$\Delta_1, \ldots, \Delta_p$ be disjoint copies of $\Delta$ with isomorphisms $f_i: \Delta_i \to \Delta$. Then $\Delta^{(p)}$ is the
subcomplex of $\Delta_1 * \cdots * \Delta_p$ consisting of all faces $\sigma_1 \cup \cdots \cup \sigma_p$ such that
$f_i(\sigma_i) \cap f_j(\sigma_j) = \emptyset$ for all $i \neq j$. For combinatorial uses of this construction see Sarkaria
(1990, 1991a,b) and Živaljević and Vrećica (1992).
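
Joins and deleted joins of small complexes can be computed directly from the definitions. In the sketch below (hypothetical function names; faces are frozensets), the i-th copy of a vertex v in the deleted join is represented by the pair (v, i), and each part of a face is allowed to be empty as long as the whole face is not, which is how the union of faces from the different copies is read here.

```python
from itertools import product


def join(c1, c2):
    """Join of two complexes on disjoint vertex sets."""
    return set(c1) | set(c2) | {s | t for s in c1 for t in c2}


def deleted_join(faces, p):
    """p-fold deleted join: faces built from pairwise disjoint faces of the
    original complex, one (possibly empty) part per copy."""
    options = list(faces) + [frozenset()]
    result = set()
    for parts in product(options, repeat=p):
        union_size = len(frozenset().union(*parts))
        # Pairwise disjoint iff no vertex is counted twice; skip the empty face.
        if union_size > 0 and union_size == sum(len(s) for s in parts):
            result.add(frozenset((v, i)
                                 for i, s in enumerate(parts, 1) for v in s))
    return result


def fs(*vertices):
    return frozenset(vertices)


# Suspension of two points: a 4-cycle (4 vertices, 4 edges).
two_points = {fs("x"), fs("y")}
poles = {fs("n"), fs("s")}
print(sorted(len(f) for f in join(two_points, poles)))

# 2-fold deleted join of an edge: again 4 vertices and 4 edges.
edge = {fs("a"), fs("b"), fs("a", "b")}
print(sorted(len(f) for f in deleted_join(edge, 2)))
```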

The direct product $P \times Q$ of two posets is the Cartesian product set ordered by
$(x, y) \le (x', y')$ if $x \le x'$ in $P$ and $y \le y'$ in $Q$. The join (or ordinal sum) $P * Q$ of
two posets is their disjoint union ordered by making each element of $P$ less than
each element of $Q$ and otherwise keeping the given orderings within $P$ and $Q$.
Clearly, $\Delta(P * Q) = \Delta(P) * \Delta(Q)$.

There are the following homeomorphisms (Quillen 1978, Walker 1988):


$$\|P \times Q\| \cong \|P\| \times \|Q\|, \qquad (9.6)$$
$$\|(P \times Q)_{>(x,y)}\| \cong \|P_{>x}\| * \|Q_{>y}\|, \qquad (9.7)$$
$$\|((x,y),(x',y'))\| \cong \mathrm{susp}(\|(x,x')\| * \|(y,y')\|), \quad \text{if } x < x' \text{ in } P \text{ and } y < y' \text{ in } Q. \qquad (9.8)$$


(Again, special care has to be taken with the topology of the right-hand sides if
the participating order complexes are not locally finite.)


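9.9. Let $\Delta$ be a simplicial complex and $\sigma \in \Delta \cup \{\emptyset\}$. Then define the subcomplexes:
deletion $\mathrm{dl}_\Delta(\sigma) = \{\tau \in \Delta \mid \tau \cap \sigma = \emptyset\}$, star $\mathrm{st}_\Delta(\sigma) = \{\tau \in \Delta \mid \tau \cup \sigma \in \Delta\}$ and link
$\mathrm{lk}_\Delta(\sigma) = \{\tau \in \Delta \mid \tau \cap \sigma = \emptyset \text{ and } \tau \cup \sigma \in \Delta\}$. Clearly, $\mathrm{dl}(\sigma) \cap \mathrm{st}(\sigma) = \mathrm{lk}(\sigma)$ and
$\sigma * \mathrm{lk}(\sigma) = \mathrm{st}(\sigma)$. If $\sigma \in \Delta^0$ then also $\mathrm{dl}(\sigma) \cup \mathrm{st}(\sigma) = \Delta$; and $\mathrm{dl}(\emptyset) = \mathrm{st}(\emptyset) = \mathrm{lk}(\emptyset) = \Delta$.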
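
The definitions in (9.9) are easy to experiment with on small examples. The following Python sketch is an illustration only: it assumes a finite complex stored as a set of nonempty faces (frozensets of vertices), and the helper names `closure`, `deletion`, `star` and `link` are ours, not notation from the text.

```python
from itertools import combinations

def closure(facets):
    """All nonempty faces of the complex generated by the given facets."""
    faces = set()
    for facet in facets:
        verts = sorted(facet)
        for r in range(1, len(verts) + 1):
            faces.update(frozenset(c) for c in combinations(verts, r))
    return faces

def deletion(faces, sigma):
    """Faces disjoint from sigma."""
    return {t for t in faces if not (t & sigma)}

def star(faces, sigma):
    """Faces whose union with sigma is again a face."""
    return {t for t in faces if (t | sigma) in faces}

def link(faces, sigma):
    """Faces disjoint from sigma whose union with sigma is a face."""
    return {t for t in faces if not (t & sigma) and (t | sigma) in faces}

# Two triangles glued along the edge {2, 3}.
delta = closure([{1, 2, 3}, {2, 3, 4}])
vertex = frozenset({2})
print(sorted(sorted(t) for t in link(delta, vertex)))
```

For a vertex the sketch can be used to confirm the identities dl ∩ st = lk and dl ∪ st = Δ stated above.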


Homotopy and homology


9.10. Two mappings $f_0, f_1: T_1 \to T_2$ of topological spaces are homotopic (written
$f_0 \sim f_1$) if there exists a mapping (called a homotopy) $F: T_1 \times [0,1] \to T_2$ such
that $F(t,0) = f_0(t)$ and $F(t,1) = f_1(t)$ for all $t \in T_1$. (Remember that all mappings
between topological spaces are assumed to be continuous.) The spaces $T_1$ and $T_2$
are of the same homotopy type (or are homotopy equivalent) if there exist mappings
$f_1: T_1 \to T_2$ and $f_2: T_2 \to T_1$ such that $f_2 \circ f_1 \sim \mathrm{id}_{T_1}$ and $f_1 \circ f_2 \sim \mathrm{id}_{T_2}$. Denote this
by $T_1 \simeq T_2$. A space which is homotopy equivalent to a point is called contractible.


Let $S^{d-1} = \{x \in \mathbb{R}^d \mid \|x\| = 1\}$ and $B^d = \{x \in \mathbb{R}^d \mid \|x\| \le 1\}$ denote the standard
(d − 1)-sphere and d-ball, respectively. Note that $S^{-1} = \emptyset$, $S^0 = \{\text{two points}\}$ and
$B^0 = \{\text{point}\}$. The class of spheres and balls is closed under the operation of taking
joins (up to homeomorphism): $S^a * S^b \cong S^{a+b+1}$, $B^a * B^b \cong B^a * S^b \cong B^{a+b+1}$.

A space $T$ is k-connected if for all $0 \le i \le k$ each mapping $f: S^i \to T$ can be
extended to a mapping $\hat f: B^{i+1} \to T$ such that $\hat f(x) = f(x)$ for all $x \in S^i$. In particular,
0-connected means arcwise connected. The property of being k-connected is
a homotopy invariant (i.e., is transferred to other spaces of the same homotopy
type). $S^d$ is (d − 1)-connected but not d-connected (see Theorem 13.1), $B^d$ is contractible.
It is convenient to define the following degenerate cases: (−1)-connected
means “nonempty”, and every space (whether empty or not) is k-connected for
$k \le -2$.

A simplicial complex $\Delta$ is contractible iff $\Delta$ is k-connected for all $k \ge 0$ (or equivalently,
for all $0 \le k \le \dim \Delta$). (The corresponding statement for general spaces is
false in the nontrivial direction.) Furthermore, a simplicial complex is k-connected
iff its (k + 1)-skeleton is k-connected.

Let $\pi_i(T) = \pi_i(T,x)$ denote the set of homotopy classes of maps $f: S^i \to T$
such that $f((1,0,\ldots,0)) = x$, from the pointed i-sphere to a pointed topological
space $(T,x)$, $x \in T$, $i \ge 0$. For $i \ge 1$ there exists a composition that makes $\pi_i(T)$
into a group, the ith homotopy group of $T$ (at the point $x$). For $i \ge 2$, the group
$\pi_i(T)$ is Abelian. $\pi_1(T)$ is the fundamental group, and $T$ is simply connected if
$\pi_1(T) = 0$. The space $T$ is k-connected iff $\pi_i(T,x) = 0$ for all $0 \le i \le k$ and $x \in T$.
So, 1-connected means simply connected and arcwise connected.


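9.11. For the definitions of simplicial homology groups $H_i(\Delta, G)$ and reduced simplicial
homology groups $\tilde H_i(\Delta, G)$ of a complex $\Delta$ with coefficients in an Abelian
group $G$, we refer to Munkres (1984a) or Spanier (1966).


Let $\tilde H_i(\Delta) = \tilde H_i(\Delta, \mathbb{Z})$. The degenerate case

$$\tilde H_i(\emptyset) \cong \begin{cases} \mathbb{Z}, & i = -1, \\ 0, & i \ne -1, \end{cases}$$

should be noted. For $\Delta \ne \emptyset$, $\tilde H_i(\Delta) = 0$ for all $i < 0$ and all $i > \dim \Delta$, and $\tilde H_0(\Delta) \cong \mathbb{Z}^{c-1}$,
where $c$ is the number of connected components of $\Delta$. $H_i(\Delta) = \tilde H_i(\Delta)$ for all
$i \ne -1, 0$; $H_{-1}(\Delta) = 0$ and $H_0(\Delta) \cong \tilde H_0(\Delta) \oplus \mathbb{Z}$.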

Let $\Delta_1$ and $\Delta_2$ be finite complexes and assume that at least one of $\tilde H_p(\Delta_1)$ and
$\tilde H_q(\Delta_2)$ is torsion-free when $p + q = i - 1$. Then


$$\tilde H_{i+1}(\Delta_1 * \Delta_2) \cong \bigoplus_{p+q=i} \bigl(\tilde H_p(\Delta_1) \otimes \tilde H_q(\Delta_2)\bigr). \qquad (9.12)$$


The same decomposition holds (without any restriction) for reduced homology
with coefficients in a field. See Milnor (1956) or chapter V of Cooke and Finney
(1967) for further details.

For a finite simplicial complex $\Delta$ let $\beta_i = \mathrm{rank}\, H_i(\Delta) = \dim_{\mathbb{Q}} H_i(\Delta, \mathbb{Q})$, $i \ge 0$.
The Betti numbers $\beta_i$ satisfy the Euler-Poincaré formula


$$\sum_{i \ge 0} (-1)^i \,\mathrm{card}(\Delta^i) = \sum_{i \ge 0} (-1)^i \beta_i. \qquad (9.13)$$


Either side of (9.13) can be taken as the definition of the Euler characteristic
$\chi(\Delta)$. The reduced Euler characteristic is $\tilde\chi(\Delta) = \chi(\Delta) - 1$. Formula (9.13) is valid
with $\beta_i = \dim_k H_i(\Delta, k)$ for an arbitrary field $k$, although the individual integers
$\beta_i$ may depend on $k$. Additional relations exist between the face-count numbers
$f_i = \mathrm{card}(\Delta^i)$ and the Betti numbers $\beta_i$ (Björner and Kalai 1988). Much is known
about the f-vectors $f(\Delta) = (f_0, f_1, \ldots)$ for various special classes of complexes $\Delta$.
See chapter 18 by Klee and Kleinschmidt for the important case of polytope boundaries,
and Björner and Kalai (1989) for a survey devoted to more general classes
of complexes.
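
Both sides of (9.13) can be computed directly for a small complex. The sketch below is only an illustration under our own conventions: a complex is given by its facets (tuples of integer vertices), rational Betti numbers are obtained from ranks of simplicial boundary matrices by exact Gaussian elimination, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def faces_by_dim(facets):
    """Map i -> sorted list of i-faces (sorted tuples) of the generated complex."""
    faces = set()
    for facet in facets:
        verts = sorted(facet)
        for r in range(1, len(verts) + 1):
            faces.update(combinations(verts, r))
    top = max(len(f) for f in faces) - 1
    return {i: sorted(f for f in faces if len(f) == i + 1) for i in range(top + 1)}

def rank(rows):
    """Rank of a rational matrix, by Gaussian elimination over Fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            if factor:
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def boundary_rank(fd, i):
    """Rank of the boundary map from i-chains to (i-1)-chains (0 if i < 1)."""
    if i < 1 or i not in fd:
        return 0
    index = {f: j for j, f in enumerate(fd[i - 1])}
    rows = []
    for f in fd[i]:
        row = [0] * len(index)
        for pos in range(len(f)):
            row[index[f[:pos] + f[pos + 1:]]] = (-1) ** pos
        rows.append(row)
    return rank(rows)

def betti_numbers(facets):
    """Rational Betti numbers beta_0, ..., beta_dim (unreduced homology)."""
    fd = faces_by_dim(facets)
    return [len(fd[i]) - boundary_rank(fd, i) - boundary_rank(fd, i + 1)
            for i in sorted(fd)]

def euler_characteristic(facets):
    """Left-hand side of (9.13): alternating sum of face counts."""
    fd = faces_by_dim(facets)
    return sum((-1) ** i * len(fd[i]) for i in fd)
```

For the boundary of a tetrahedron both sides of (9.13) should agree, the alternating sum of the Betti numbers matching the alternating sum of the face counts.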

The Möbius function of a (locally) finite poset is defined in chapter 21 by Gessel
and Stanley. Theorem 13.4 of that chapter (due to P. Hall) can in view of (9.13)
be restated as


$$\mu(x,y) = \tilde\chi(\Delta((x,y))), \quad \text{if } x < y, \qquad (9.14)$$


where the right-hand side denotes the reduced Euler characteristic of the order
complex of the open interval $(x,y)$. This connection between the Möbius function
and topology, first pointed out by Rota (1964) and Folkman (1966), has many
interesting ramifications.
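
Formula (9.14) can be checked numerically on a familiar poset. The sketch below is our own illustration, not part of the cited sources: it uses the divisibility order on the divisors of an integer, computes μ by the usual recursion, and counts chains in the open interval to obtain the reduced Euler characteristic of its order complex. All function names are hypothetical.

```python
from itertools import combinations

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(x, y):
    """Moebius function of the divisibility order (x must divide y),
    via mu(x, x) = 1 and mu(x, y) = -sum of mu(x, z) over x <= z < y."""
    if x == y:
        return 1
    return -sum(mobius(x, z) for z in divisors(y) if z % x == 0 and z != y)

def reduced_euler_open_interval(x, y):
    """Reduced Euler characteristic of the order complex of the open interval (x, y)."""
    interior = [z for z in divisors(y) if z % x == 0 and z not in (x, y)]
    total = -1  # contribution of the empty chain
    for r in range(1, len(interior) + 1):
        for chain in combinations(interior, r):  # numerically increasing tuples
            # A numerically increasing tuple is a chain iff consecutive entries divide.
            if all(b % a == 0 for a, b in zip(chain, chain[1:])):
                total += (-1) ** (r - 1)  # an r-element chain is an (r-1)-face
    return total

for n in (12, 30, 36):
    print(n, mobius(1, n), reduced_euler_open_interval(1, n))
```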


9.15. Two complexes of the same homotopy type have isomorphic homology groups
in all dimensions. A complex $\Delta$ is k-acyclic over $G$ if $\tilde H_i(\Delta, G) = 0$ for all $i \le k$.
So, (−1)-acyclic means nonempty and 0-acyclic means nonempty and connected.
Further, $\Delta$ is acyclic over $G$ (or simply “G-acyclic” if confusion cannot arise) if
$\tilde H_i(\Delta, G) = 0$ for all $i \in \mathbb{Z}$. When $G$ is suppressed from the notation we always
mean $G = \mathbb{Z}$.


We now list some relations between homotopy properties and homology of a
complex $\Delta$, which are frequently useful. They are consequences of the theorems
of Hurewicz and Whitehead (see Spanier 1966).


9.16. $\Delta$ is k-connected iff $\Delta$ is k-acyclic (over $\mathbb{Z}$) and simply connected, $k \ge 1$.


9.17. $\Delta$ is contractible iff $\Delta$ is $\mathbb{Z}$-acyclic and simply connected.


9.18. If $\Delta$ is simply connected, $\tilde H_i(\Delta) = 0$ for $i \ne d > 1$, and $\tilde H_d(\Delta) \cong \mathbb{Z}^k$, then $\Delta$ is
homotopy equivalent to a wedge of $k$ d-spheres.


9.19. Assume $\dim \Delta = d \ge 0$. Then $\Delta$ is (d − 1)-connected iff $\Delta$ is homotopy equivalent
to a wedge of d-spheres.


[Remark: The analogues of (9.17)-(9.19) may fail for non-triangulable spaces.] 


9.20. If $\Delta_1$ is $k_1$-acyclic and $\Delta_2$ is $k_2$-acyclic then $\Delta_1 * \Delta_2$ is $(k_1 + k_2 + 2)$-acyclic. This
follows from (9.12). Using (9.16) it implies that if $\Delta_i$ is $k_i$-connected then $\Delta_1 * \Delta_2$
is $(k_1 + k_2 + 2)$-connected. (For this, see also Milnor 1956.)


10. Combinatorial homotopy theorems 


In this section we collect some tools for manipulating homotopies and the ho- 
motopy type of complexes and posets, which have proven to be useful in com- 
binatorics. Parallel tools for homology exist in most cases. We begin with some 
elementary lemmas. 

Suppose $\Delta$ is a simplicial complex and $T$ a space. Let $C: \Delta \to 2^T$ be order-preserving
(i.e., $C(\sigma) \subseteq C(\tau) \subseteq T$, for all $\sigma \subseteq \tau$ in $\Delta$). A mapping $f: \|\Delta\| \to T$ is
carried by $C$ if $f(\|\sigma\|) \subseteq C(\sigma)$ for all $\sigma \in \Delta$. Let $k \in \mathbb{Z}_+ \cup \{\infty\}$.


Lemma 10.1 (Carrier Lemma). Assume that $C(\sigma)$ is $\min(k, \dim(\sigma))$-connected for
all $\sigma \in \Delta$. Then:

(i) if $f, g: \|\Delta^{\le k}\| \to T$ are both carried by $C$, then $f \sim g$,

(ii) there exists a mapping $\|\Delta^{\le k+1}\| \to T$ carried by $C$.


In particular, if $C(\sigma)$ is always contractible then $\|\Delta\|$ can replace the skeleta in
(i) and (ii) ($k = \infty$ case). Carrier lemmas of various kinds are common in topology.
For proofs of this version, see Lundell and Weingram (1969) or Walker (1981b).


Lemma 10.2 (Contractible Subcomplex Lemma). If $\Delta_0$ is a contractible subcomplex
of a simplicial complex $\Delta$, then the projection map $\|\Delta\| \to \|\Delta\|/\|\Delta_0\|$ is a
homotopy equivalence.


This is a consequence of the homotopy extension property for simplicial pairs
[for more details see Brown (1968) or Björner and Walker (1983)].


Lemma 10.3 (Gluing Lemma). Examples of simple gluing results for simplicial
complexes $\Delta_1$ and $\Delta_2$ are:

(i) if $\Delta_1$ and $\Delta_1 \cap \Delta_2$ are contractible, then $\Delta_1 \cup \Delta_2 \simeq \Delta_2$,

(ii) if $\Delta_1$ and $\Delta_2$ are k-connected and $\Delta_1 \cap \Delta_2$ is (k − 1)-connected, then $\Delta_1 \cup \Delta_2$
is k-connected,

(iii) if $\Delta_1 \cup \Delta_2$ and $\Delta_1 \cap \Delta_2$ are k-connected, then so are also $\Delta_1$ and $\Delta_2$.


Such results are often special cases of the theorems in this section, especially
Theorem 10.6. Otherwise they can be deduced from the Mayer-Vietoris long exact
sequence (for k-acyclicity) and the Seifert-van Kampen theorem (for simply-connectedness),
using (9.16) and (9.17).

A general principle for gluing homotopies appears in Brown (1968, p. 240) and 
Mather (1966). It gives a convenient proof for part (i) of the following lemma. 
For part (ii) use Lemma 10.2. A more general method for gluing homotopies (the 
“diagrams of spaces” technique) appears in Ziegler and Živaljević (1993).


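Lemma 10.4. Let $\Delta = \Delta_0 \cup \Delta_1 \cup \cdots \cup \Delta_n$ be a simplicial complex with subcomplexes
$\Delta_i$, and assume that $\Delta_i \cap \Delta_j \subseteq \Delta_0$ for all $1 \le i < j \le n$.
(i) If $\Delta_i$ is contractible for all $1 \le i \le n$, then


$$\Delta \simeq \Delta_0 \cup \bigcup_{i=1}^{n} \mathrm{cone}(\Delta_0 \cap \Delta_i)$$


(i.e., raise a cone independently over each subcomplex $\Delta_0 \cap \Delta_i$).
(ii) If $\Delta_i$ is contractible for all $0 \le i \le n$, then


$$\Delta \simeq \mathrm{wedge}_{1 \le i \le n}\, \mathrm{susp}(\Delta_0 \cap \Delta_i).$$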


Some of the following results concern simplicial maps $f: \Delta \to P$ from a simplicial
complex $\Delta$ to a poset $P$. Such a map sends vertices of $\Delta$ to elements of $P$ in such a
way that each $\sigma \in \Delta$ is mapped to a chain in $P$. In particular, an order-preserving
or order-reversing mapping of posets $Q \to P$ is of this type.


Theorem 10.5 (Fiber Theorem, Quillen 1978, Walker 1981b). Let $f: \Delta \to P$ be a
simplicial map from a simplicial complex $\Delta$ to a poset $P$.

(i) Suppose all fibers $f^{-1}(P_{\ge x})$, $x \in P$, are contractible. Then $f$ induces homotopy
equivalence between $\Delta$ and $P$.

(ii) Suppose all fibers $f^{-1}(P_{\ge x})$, $x \in P$, are k-connected. Then $\Delta$ is k-connected if
and only if $P$ is k-connected.


Proof. Suppose that all fibers are contractible. Then the mapping $C(\sigma) =
\|f^{-1}(P_{\ge \min \sigma})\|$, $\sigma \in \Delta(P)$, is a contractible carrier from $\Delta(P)$ to $\|\Delta\|$. By Lemma
10.1 (ii) there exists a continuous map $g: \|\Delta(P)\| \to \|\Delta\|$ carried by $C$, i.e., $g(\|\sigma\|) \subseteq
\|f^{-1}(P_{\ge \min \sigma})\|$, for every chain $\sigma \in \Delta(P)$. One sees that $g$ is a homotopy inverse
to $f$ as follows, using Lemma 10.1 (i): $C'(\sigma) = \|P_{\ge \min \sigma}\|$, $\sigma \in \Delta(P)$, is contractible
and carries $f \circ g$ and $\mathrm{id}_P$, and $C''(\tau) = \|f^{-1}(P_{\ge \min f(\tau)})\|$, $\tau \in \Delta$, is contractible and
carries $g \circ f$ and $\mathrm{id}_\Delta$. Hence, $f \circ g \sim \mathrm{id}_P$ and $g \circ f \sim \mathrm{id}_\Delta$.

The second part is proved analogously by passing to (k + 1)-skeleta and using
k-connected carriers in Lemma 10.1.


The nerve of a family of sets $(A_i)_{i \in I}$ is the simplicial complex $\mathcal{N} = \mathcal{N}(A_i)$ defined
on the vertex set $I$ so that a finite subset $\sigma \subseteq I$ is in $\mathcal{N}$ precisely when $\bigcap_{i \in \sigma} A_i \ne \emptyset$.
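
For a finite family of finite sets the nerve can be listed by brute force. The following sketch is only an illustration under that finiteness assumption; the function name `nerve` and the dictionary representation of the family are our own choices, and the enumeration is exponential in the size of the index set.

```python
from itertools import combinations

def nerve(family):
    """Nerve of a finite family of finite sets, given as a dict index -> set.

    Returns the set of nonempty faces (frozensets of indices) whose
    members have a common element.
    """
    faces = set()
    indices = sorted(family)
    for r in range(1, len(indices) + 1):
        for sigma in combinations(indices, r):
            if set.intersection(*(set(family[i]) for i in sigma)):
                faces.add(frozenset(sigma))
    return faces

# Three sets that meet pairwise but have no common point:
# the nerve is the boundary of a triangle.
cover = {"a": {1, 2}, "b": {2, 3}, "c": {3, 1}}
print(sorted(sorted(f) for f in nerve(cover)))
```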


Theorem 10.6 (Nerve Theorem, Borsuk 1948, Björner et al. 1985, 1994). Let $\Delta$
be a simplicial complex (or, a regular cell complex) and $(\Delta_i)_{i \in I}$ a family of subcomplexes
such that $\Delta = \bigcup_{i \in I} \Delta_i$.

(i) Suppose every nonempty finite intersection $\Delta_{i_1} \cap \Delta_{i_2} \cap \cdots \cap \Delta_{i_t}$ is contractible.
Then $\Delta$ and the nerve $\mathcal{N}(\Delta_i)$ are homotopy equivalent.

(ii) Suppose every nonempty finite intersection $\Delta_{i_1} \cap \Delta_{i_2} \cap \cdots \cap \Delta_{i_t}$ is (k − t + 1)-connected.
Then $\Delta$ is k-connected if and only if $\mathcal{N}(\Delta_i)$ is k-connected.


Proof. For convenience, assume that the covering of $\Delta$ by the $\Delta_i$'s is locally finite,
meaning that each vertex of $\Delta$ belongs to only finitely many subcomplexes $\Delta_i$. (The
case of more general coverings requires a slightly different argument.)

Let $Q = P(\Delta)$ and $P = P(\mathcal{N}(\Delta_i))$ be the face posets. Define a mapping $f: Q \to P$
by $\pi \mapsto \{i \in I \mid \pi \in \Delta_i\}$. Clearly $f$ is order-reversing, so $f: \Delta(Q) \to P$ is simplicial.
The fiber at $\sigma \in P$ is $f^{-1}(P_{\ge \sigma}) = \bigcap_{i \in \sigma} \Delta_i$. Part (i) now follows from Theorem 10.5.
Also, if all nonempty finite intersections are k-connected, part (ii) follows the same
way. In the stated generality, part (ii) is proved in Björner et al. (1994). □


The Nerve Theorem has several versions for coverings of a topological space 
by subspaces. The earliest of these seem to be due to Leray (1945) and Weil 
(1952). Discussions of results of this kind can be found in Wu (1962) and McCord 
(1967). We state here a version which seems suitable for use in combinatorics. An 
application to oriented matroids appears in Edelman (1984). 


Theorem 10.7 (Nerve Theorem, Weil 1952, Wu 1962, McCord 1967). Let $X$ be a
triangulable space and $(A_i)_{i \in I}$ a locally finite family of open subsets (or a finite
family of closed subsets) such that $X = \bigcup_{i \in I} A_i$. If every nonempty intersection
$A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_t}$ is contractible, then $X$ and the nerve $\mathcal{N}(A_i)$ are homotopy equivalent.


By locally finite is meant that each point of $X$ lies in at most finitely many sets
$A_i$. We warn that Theorem 10.7 is false for locally finite coverings by closed sets
and also for too general spaces $X$. For a counterexample in the first case, take $X$
to be the unit circle and $A_i = \{e^{2\pi i t} \mid 1/(i+1) \le t \le 1/i\}$, $i = 1, 2, \ldots$. In the second
case one can, e.g., let $X$ be the wedge of two topologist's combs $A_1$ and $A_2$ [as in
Spanier (1966, Ex. 5, p. 56)].

The conclusions in part (ii) of Theorems 10.5 and 10.6 can be strengthened:
In Theorem 10.5, if all fibers are k-connected, then $f$ induces isomorphisms of
homotopy groups $\pi_j(\Delta) \cong \pi_j(P)$, for all $j \le k$. Consequently, if in Theorem 10.6
all nonempty finite intersections $\Delta_{i_1} \cap \Delta_{i_2} \cap \cdots \cap \Delta_{i_t}$ are k-connected, then $\pi_j(\Delta) \cong
\pi_j(\mathcal{N}(\Delta_i))$, for all $j \le k$. A similar k-connectivity version of Theorem 10.7 appears
in Wu (1962).

Let $P$ be a poset. A subset $C \subseteq P$ is called a crosscut if (1) $C$ is an antichain, (2)
for every finite chain $\sigma$ in $P$ there exists some element in $C$ which is comparable
to each element in $\sigma$, (3) if $A \subseteq C$ is bounded (here meaning that $A$ has an upper
bound or a lower bound in $P$) then the join $\bigvee A$ or the meet $\bigwedge A$ exists in $P$. For
instance, the atoms of a lattice $L$ of finite length form a crosscut in $\bar L$ and in $L$.
A crosscut $C$ in $P$ determines the simplicial complex $\Gamma(P, C)$ consisting of the
bounded subsets of $C$.


Theorem 10.8 (Crosscut Theorem, Rota 1964, Folkman 1966, Björner 1981).
The crosscut complex $\Gamma(P, C)$ and $P$ are homotopy equivalent.


Proof. For $x \in C$, let $\Delta_x = \Delta(P_{\le x} \cup P_{\ge x})$. Then $(\Delta_x)_{x \in C}$ is a covering of $\Delta(P)$, by
condition (2), and every nonempty intersection is a cone, by condition (3), and
hence contractible. Since $\Gamma(P, C) = \mathcal{N}(\Delta_x)$, Theorem 10.6 implies the result. □


The neighborhood complex of a graph defined in section 4 is a special kind
of nerve complex. The following result gives a special decomposition property of
neighborhood complexes of bipartite graphs.


Theorem 10.9 (Bipartite Relation Theorem, Dowker 1952, Mather 1966). Suppose
$G = (V_0, V_1, E)$, $E \subseteq V_0 \times V_1$, is a bipartite graph, and let $\Delta_i$, $i = 0, 1$, be the simplicial
complex whose faces are all finite subsets $\sigma \subseteq V_i$ that have a common neighbor
in $V_{1-i}$. Then $\Delta_0$ and $\Delta_1$ are homotopy equivalent.


Proof. First delete any isolated vertices from $G$. This does not affect $\Delta_0$ and $\Delta_1$.
Now, for every $x \in V_1$ let $\Delta_x$ consist of all finite subsets of $\{y \in V_0 \mid (y,x) \in E\}$.
Then $(\Delta_x)_{x \in V_1}$ is a covering of $\Delta_0$ with contractible nonempty intersections. The
nerve of this covering is $\Delta_1$, so Theorem 10.6 applies. □
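
As a sanity check of Theorem 10.9 one can build both complexes for a small bipartite graph and compare a homotopy invariant. The sketch below is an illustration only: it compares Euler characteristics, which is a necessary condition for homotopy equivalence and not a proof of it, and the graph and all names in it are invented for the example.

```python
from itertools import combinations

def common_neighbor_complex(side, neighbors):
    """Nonempty subsets of `side` whose vertices share a common neighbor.

    `neighbors` maps each vertex of `side` to its neighbor set on the other side.
    """
    faces = set()
    verts = sorted(side)
    for r in range(1, len(verts) + 1):
        for sigma in combinations(verts, r):
            if set.intersection(*(neighbors[v] for v in sigma)):
                faces.add(frozenset(sigma))
    return faces

def euler_characteristic(faces):
    """Alternating sum of face counts; a face with r vertices has dimension r - 1."""
    return sum((-1) ** (len(f) - 1) for f in faces)

# A small example graph with V0 = {a, b, c, d} and V1 = {x, y}.
edges = {("a", "x"), ("b", "x"), ("c", "x"), ("c", "y"), ("d", "y")}
v0 = {u for u, _ in edges}
v1 = {w for _, w in edges}
nbr0 = {u: {w for (p, w) in edges if p == u} for u in v0}
nbr1 = {w: {u for (u, q) in edges if q == w} for w in v1}
delta0 = common_neighbor_complex(v0, nbr0)
delta1 = common_neighbor_complex(v1, nbr1)
print(euler_characteristic(delta0), euler_characteristic(delta1))
```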


Theorems 10.6 (i), 10.8 and 10.9 are equivalent in the sense that any one of them
implies the other two. The following is a variation of the Fiber Theorem 10.5.


Theorem 10.10 (Ideal Relation Theorem, Quillen 1978). Let $P$ and $Q$ be posets
and suppose that $R \subseteq P \times Q$ is a relation such that $(x,y) \le (x',y') \in R$ implies that
$(x,y) \in R$. (That is, $R$ is an order ideal in the product poset.) Suppose furthermore
that $R_x = \{y \in Q \mid (x,y) \in R\}$ and $R_y = \{x \in P \mid (x,y) \in R\}$ are contractible for all
$x \in P$ and $y \in Q$. Then $P$ and $Q$ are homotopy equivalent.


Proof. By symmetry it suffices to show that $P$ and $R$ are homotopy equivalent.
By Theorem 10.5 it suffices for this to show that the fiber $\pi^{-1}(P_{\ge x})$ is contractible
for all $x \in P$, where $\pi: R \to P$ is the projection map $\pi(x,y) = x$. Let $F_x =
\pi^{-1}(P_{\ge x}) = \{(z,y) \in R \mid z \ge x\}$, and let $\rho: F_x \to R_x$ be the projection $\rho(z,y) = y$.
Now, $\rho^{-1}((R_x)_{\ge y}) = \{(z,w) \in F_x \mid w \ge y\} = \{(z,w) \in R \mid (z,w) \ge (x,y)\}$ is a cone
and hence contractible, for all $y \in R_x$. So by the Fiber Theorem $F_x$ is homotopy
equivalent to $R_x$, which by assumption is contractible. (Remark: There is also an
obvious k-connectivity version of this result.) □


Theorem 10.11 (Order Homotopy Theorem, Quillen 1978). Let $f, g: \Delta \to P$ be simplicial
maps from a simplicial complex $\Delta$ to a poset $P$. If $f(x) \le g(x)$ for every vertex
$x$ of $\Delta$, then $f$ and $g$ are homotopic.


Proof. For each face $\sigma \in \Delta$, let $C(\sigma) = f(\sigma) \cup g(\sigma)$. The minimal element in the
chain $f(\sigma)$ is below every other element in $C(\sigma)$. So the order complex of $C(\sigma)$
is a cone, and hence contractible. Since $C$ carries both $f$ and $g$, these maps are
homotopic by Lemma 10.1. □


Corollary 10.12. Let $f: P \to P$ be an order-preserving map such that $f(x) \ge x$ for
all $x \in P$. Then $f$ induces homotopy equivalence between $P$ and $f(P)$.


If also $f^2(x) = f(x)$ for all $x \in P$ ($f$ is then called a closure operator on $P$) then
$f(P)$ is a strong deformation retract of $P$. The hypotheses of Theorem 10.11 and
Corollary 10.12 can be weakened to that $f(x)$ and $g(x)$ [resp., $f(x)$ and $x$] are
comparable for all $x$.

Call a poset $P$ join-contractible (via $p$), if for some element $p \in P$ the join (least
upper bound) $p \vee x$ exists for all $x \in P$. Define meet-contractible in dual fashion.


Corollary 10.13 (Quillen 1978). If $P$ is join-contractible then $P$ is contractible.


Proof. Since $x \le p \vee x \ge p$, for all $x \in P$, Theorem 10.11 shows that $\mathrm{id} \sim p \vee \mathrm{id} \sim
p$, i.e., the identity map on $P$ is homotopic to the constant map $p$. □


The following is a consequence of Corollary 10.12, and also of Theorem 10.8. 


Corollary 10.14. Let $L$ be a lattice of finite length and $A$ the set of its atoms. Let
$J = \{\bigvee B \mid B \subseteq A\}$. Then $\bar L$ and $\bar L \cap J$ are homotopy equivalent.


Proof. The mapping $f(x) = \bigvee (A \cap L_{\le x})$ satisfies $f^2(x) = f(x) \le x$ for all $x \in \bar L$.
Now use Corollary 10.12. □


The set of complements $\mathcal{C}o(z)$ of an element $z$ in a bounded lattice $L$ is defined
in section 3. Recall that $\bar L = L - \{\hat 0, \hat 1\}$.


Theorem 10.15 (Homotopy Complementation Theorem, Björner and Walker 1983).
Let $L$ be a bounded lattice and $z \in \bar L$.

(i) The poset $\bar L - \mathcal{C}o(z)$ is contractible. In particular, if $L$ is noncomplemented
then $\bar L$ is contractible.

(ii) If $\mathcal{C}o(z)$ is an antichain, then


$$\bar L \simeq \mathrm{wedge}_{y \in \mathcal{C}o(z)}\, \mathrm{susp}(\bar L_{<y} * \bar L_{>y}).$$


Proof. For each chain $\sigma$ in $P = \bar L - \mathcal{C}o(z)$, let $C(\sigma) = \{x \in P \mid x \ge z\} \cup \{y \in
P \mid y \le \max \sigma\}$. Either $z \vee \max \sigma$ exists in $P$, in which case $C(\sigma)$ is meet-contractible
via it, or else $z \wedge \max \sigma$ exists, and $C(\sigma)$ is join-contractible via it.
So, $C$ is contractible and carries the constant map $z$ as well as $\mathrm{id}_P$. Therefore
by Lemma 10.1 $z \sim \mathrm{id}_P$, which proves part (i). Part (ii) then follows by Lemma
10.4 (ii). □

Suppose that $L$ is a bounded lattice whose proper part is not contractible.
Then by part (i) every element $x$ has a complement in $L$. This conclusion can
be strengthened in the following way [Lovász and Schrijver (unpublished)]: Every
chain $x_0 < x_1 < \cdots < x_k$ in $L$ has a complementing chain $y_0 \ge y_1 \ge \cdots \ge y_k$ (i.e.,
$x_i \perp y_i$ for $0 \le i \le k$). Here one can even demand that each complement $y_i$ is a
join of atoms (assuming that atoms exist, which is the case, e.g., if $L$ is of finite
length).

A more general poset version of Theorem 10.15 is given in Björner (1994b).
There the antichain assumption is dropped from part (ii) at the price of a more
complicated description of the right-hand side as a quotient space of a wedge
indexed by pairs $x \le y$ in $\mathcal{C}o(z)$.


11. Complexes with special structure 


Some special properties of complexes that are frequently encountered in combi- 
natorics, and which express a certain simplicity of structure, will be reviewed. 


Collapsible and shellable complexes 


11.1. Let $\Delta$ be a simplicial complex, and suppose that $\sigma \in \Delta$ is a proper face of
exactly one simplex $\tau \in \Delta$. Then the complex $\Delta' = \Delta \setminus \{\sigma, \tau\}$ is obtained from $\Delta$ by
an elementary collapse (and $\Delta$ is obtained from $\Delta'$ by an elementary anticollapse).
Note that $\Delta' \simeq \Delta$. If $\Delta$ can be reduced to a single point by a sequence of elementary
collapse steps, then $\Delta$ is collapsible.
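
Elementary collapses are simple to carry out mechanically. The sketch below is a greedy illustration under our own conventions (finite complexes stored as sets of nonempty frozensets; all names hypothetical): it repeatedly removes a pair (σ, τ) with σ a proper face of exactly one simplex τ. Reaching a single vertex certifies collapsibility, but since the greedy choice of collapses is arbitrary, getting stuck earlier does not in general prove that the complex is not collapsible.

```python
from itertools import combinations

def closure(facets):
    """All nonempty faces of the complex generated by the given facets."""
    faces = set()
    for facet in facets:
        verts = sorted(facet)
        for r in range(1, len(verts) + 1):
            faces.update(frozenset(c) for c in combinations(verts, r))
    return faces

def free_pair(faces):
    """Return (sigma, tau) with sigma a proper face of exactly one simplex tau, or None."""
    for sigma in faces:
        cofaces = [t for t in faces if sigma < t]  # proper supersets of sigma
        if len(cofaces) == 1:
            return sigma, cofaces[0]
    return None

def greedy_collapse(facets):
    """Apply elementary collapses until none is available; return the remaining faces."""
    faces = closure(facets)
    while len(faces) > 1:
        pair = free_pair(faces)
        if pair is None:
            break
        faces -= set(pair)
    return faces

print(len(greedy_collapse([{1, 2, 3}])))               # a solid triangle
print(len(greedy_collapse([{1, 2}, {2, 3}, {1, 3}])))  # its boundary
```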


The class of nonevasive complexes is recursively defined as follows: (i) a single
vertex is nonevasive, (ii) if for some $x \in \Delta^0$ both $\mathrm{lk}_\Delta(x)$ and $\mathrm{dl}_\Delta(x)$ are nonevasive,
then so is $\Delta$.

The following logical implications are strict (i.e., converses are false):


$$\text{cone} \Longrightarrow \text{nonevasive} \Longrightarrow \text{collapsible} \Longrightarrow \text{contractible} \Longrightarrow \mathbb{Z}\text{-acyclic}.$$

Furthermore, for an arbitrary field $k$:

$$\mathbb{Z}\text{-acyclic} \Longrightarrow k\text{-acyclic} \Longrightarrow \mathbb{Q}\text{-acyclic} \Longrightarrow \tilde\chi = 0,$$


and $\mathbb{Z}$-acyclic $\Longleftrightarrow$ $\mathbb{Z}_p$-acyclic for all prime numbers $p$.

Nonevasive complexes were defined by Kahn et al. (1984) to model the notion
of argument complexity discussed in section 2. A complex $\Delta$ is nonevasive iff for
all $F \subseteq \Delta^0$ it is possible in less than $\mathrm{card}\, \Delta^0$ questions of the type “Is $x \in F$?” to
decide whether $F \in \Delta$.
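
The recursive definition of nonevasiveness translates directly into a (very inefficient) test for small complexes. The sketch below is an illustration only; it assumes a finite complex given as a frozenset of nonempty frozensets closed under taking nonempty subsets, treats the empty complex as not nonevasive (it is not produced by rules (i) and (ii)), and uses function names of our own choosing.

```python
from functools import lru_cache
from itertools import combinations

def closure(facets):
    """All nonempty faces of the complex generated by the given facets."""
    faces = set()
    for facet in facets:
        verts = sorted(facet)
        for r in range(1, len(verts) + 1):
            faces.update(frozenset(c) for c in combinations(verts, r))
    return frozenset(faces)

def link(faces, x):
    """Link of the vertex x."""
    return frozenset(t for t in faces if x not in t and (t | {x}) in faces)

def deletion(faces, x):
    """Deletion of the vertex x."""
    return frozenset(t for t in faces if x not in t)

@lru_cache(maxsize=None)
def is_nonevasive(faces):
    vertices = set().union(*faces) if faces else set()
    if not vertices:
        return False  # the empty complex is not covered by rules (i), (ii)
    if len(faces) == 1:
        return True   # rule (i): a single vertex
    # rule (ii): some vertex has nonevasive link and nonevasive deletion
    return any(is_nonevasive(link(faces, x)) and is_nonevasive(deletion(faces, x))
               for x in vertices)

print(is_nonevasive(closure([{1, 2, 3}])))               # a simplex (a cone)
print(is_nonevasive(closure([{1, 2}, {2, 3}, {1, 3}])))  # a circle
```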

Collapsibility has long been studied in combinatorial topology. Noteworthy is the
fact that two simply connected finite complexes $\Delta$ and $\Delta'$ are homotopy equivalent
iff a sequence of elementary collapses and elementary anticollapses can transform
$\Delta$ into $\Delta'$ (see Cohen 1973). In particular, the contractible complexes are precisely
the complexes that collapse/anticollapse to a point.

An element $x$ in a poset $P$ is irreducible if $P_{>x}$ has a least element or $P_{<x}$ a
greatest element. A finite poset is dismantlable if successive removal of irreducibles
leads to a single-element poset. A dismantlable poset is nonevasive. A topological
characterization of dismantlable posets of Stong (1966) is mentioned in (9.3). A
directed poset (for all $x, y \in P$ there exists $z \in P$ such that $x, y \le z$) is contractible.


11.2. Let $\Delta$ be a pure d-dimensional simplicial complex, and suppose that the k-face
$\sigma$ is contained in exactly one d-face $\tau$. Then the complex $\Delta' = \Delta \setminus \{\gamma \mid \sigma \subseteq \gamma \subseteq \tau\}$ is
obtained from $\Delta$ by a (k,d)-collapse. If $\sigma \ne \tau$, then $\Delta' \simeq \Delta$. If $\Delta$ can be reduced to
a single d-simplex by a sequence of (k,d)-collapses, $0 \le k \le d$, then $\Delta$ is shellable.


A pure simplicial complex $\Delta$ is vertex-decomposable if (i) $\Delta = \emptyset$, or (ii) $\Delta$
consists of a single vertex, or (iii) for some $x \in \Delta^0$ both $\mathrm{lk}_\Delta(x)$ and $\mathrm{dl}_\Delta(x)$
are vertex-decomposable. For example, every simplex and simplex-boundary is
vertex-decomposable. The class of constructible complexes is defined by: (i) every
simplex and $\emptyset$ is constructible, (ii) if $\Delta_1$, $\Delta_2$ and $\Delta_1 \cap \Delta_2$ are constructible and
$\dim \Delta_1 = \dim \Delta_2 = 1 + \dim(\Delta_1 \cap \Delta_2)$, then $\Delta_1 \cup \Delta_2$ is constructible.

The following logical implications between these properties of a pure d-dimensional
complex are strict:


$$\text{vertex-decomposable} \Longrightarrow \text{shellable} \Longrightarrow \text{constructible} \Longrightarrow (d-1)\text{-connected}.$$


The first implication and the definition of vertex-decomposable complexes are
due to Provan and Billera (1980). The concept of shellability has an interesting
history going back to the 19th century, see Grünbaum (1967). Constructible complexes
were defined by M. Hochster, see Stanley (1977).

Shellability is usually regarded as a way of putting together (rather than collapsing,
i.e., taking apart) a complex. Therefore the following alternative definition is more
common: A finite pure d-dimensional complex $\Delta$ is shellable if its d-faces can be ordered
$\sigma_1, \sigma_2, \ldots, \sigma_t$ so that $(\delta\sigma_1 \cup \cdots \cup \delta\sigma_{k-1}) \cap \delta\sigma_k$ is a pure (d − 1)-dimensional
complex for $2 \le k \le t$, where $\delta\sigma_j = 2^{\sigma_j} \setminus \{\emptyset, \sigma_j\}$ is the boundary complex of $\sigma_j$.
Equivalently, for all $1 \le i < k \le t$ there exists $j < k$ such that $\sigma_i \cap \sigma_k \subseteq \sigma_j \cap \sigma_k$
and $\dim(\sigma_j \cap \sigma_k) = d - 1$. In words, the requirement is that the kth facet $\sigma_k$ intersects
the union of the preceding ones along a part of its boundary which is a union
of maximal proper faces of $\sigma_k$. Such an ordering of the facets is called a shelling.
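
The pairwise form of the shelling condition is convenient to test by machine. The following sketch is an illustration only: it checks whether a given ordering of the facets of a pure d-dimensional complex satisfies that condition, and does not search over orderings. The function name and the list-of-facets input format are assumptions of this example.

```python
def is_shelling(order, d):
    """Check the pairwise criterion for an ordered list of d-faces (facets).

    For all i < k there must exist j < k such that
    order[i] & order[k] is contained in order[j] & order[k],
    and order[j] & order[k] has exactly d vertices (dimension d - 1).
    """
    facets = [frozenset(f) for f in order]
    if any(len(f) != d + 1 for f in facets):
        return False  # not all facets are d-dimensional
    for k in range(1, len(facets)):
        for i in range(k):
            common = facets[i] & facets[k]
            if not any(common <= (facets[j] & facets[k])
                       and len(facets[j] & facets[k]) == d
                       for j in range(k)):
                return False
    return True

strip = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
print(is_shelling(strip, 2))                         # consecutive triangles share an edge
print(is_shelling([strip[0], strip[2], strip[1]], 2))  # second facet meets first in a vertex
```

The second call illustrates that the property depends on the ordering: the same three triangles listed in a different order fail the criterion.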

If $\sigma \in \Delta$ and $\Delta$ is a shellable (or constructible) complex, then so is $\mathrm{lk}_\Delta(\sigma)$. Shellability
is also preserved by some other constructions on complexes and posets such
as Theorem 11.13. Several basic properties of simplicial shellability (also for infinite
complexes) are reviewed in Björner (1984b). Shellability of cell complexes is
discussed in Danaraj and Klee (1974) and Björner (1984a); see also chapter 18 by
Klee and Kleinschmidt. To establish shellability of (order complexes of) posets,
a special method exists called lexicographic shellability. See Björner (1980) and
Björner and Wachs (1983, 1994) for details. The notions of shellability and
vertex-decomposability and most of their useful properties can easily be generalized to
non-pure complexes, see Björner and Wachs (1994).


11.3. Simplicial PL spheres and PL balls are defined in (12.2) (PL = piecewise
linear). The property of being PL is a combinatorial property: whether a geometric
simplicial complex $\Delta$ is PL depends only on the abstract simplicial complex $\Delta$.


For showing that specific complexes are homeomorphic to spheres or balls, the 
following result is frequently useful. 


Theorem 11.4. Let $\Delta$ be a constructible d-dimensional simplicial complex.
(i) If every (d − 1)-face is contained in exactly two d-faces, then $\Delta$ is a PL sphere.
(ii) If every (d − 1)-face is contained in one or two d-faces, and containment in
only one d-face occurs, then $\Delta$ is a PL ball.


Theorem 11.4 follows from some basic PL topology such as the facts quoted in
(12.2). For shellable $\Delta$ it appears implicitly in Bing (1964) and explicitly in Danaraj
and Klee (1974).

If $\Delta$ is a triangulation of the d-sphere (or any manifold) and $\sigma \in \Delta^k$, then $\mathrm{lk}_\Delta(\sigma)$
has the same homology as the (d − 1 − k)-sphere. If $\sigma \in \Delta^0$, then there is even
homotopy equivalence between $\mathrm{lk}_\Delta(\sigma)$ and $S^{d-1}$. However, if $\Delta$ is a PL d-sphere
and $\sigma \in \Delta^k$, then $\mathrm{lk}_\Delta(\sigma)$ is itself a PL (d − 1 − k)-sphere.


Cohen-Macaulay complexes


11.5. Let $k$ be a field or the ring of integers $\mathbb{Z}$. A finite-dimensional simplicial
complex $\Delta$ is Cohen-Macaulay over $k$ (written CM/k or CM if $k$ is understood or
irrelevant) if $\mathrm{lk}_\Delta(\sigma)$ is $(\dim \mathrm{lk}_\Delta(\sigma) - 1)$-acyclic over $k$ for all $\sigma \in \Delta \cup \{\emptyset\}$. Further,
$\Delta$ is homotopy-Cohen-Macaulay if $\mathrm{lk}_\Delta(\sigma)$ is $(\dim \mathrm{lk}_\Delta(\sigma) - 1)$-connected for all
$\sigma \in \Delta \cup \{\emptyset\}$.


The following implications are strict:


$$\text{constructible} \Longrightarrow \text{homotopy-CM} \Longrightarrow \text{CM}/\mathbb{Z} \Longrightarrow \text{CM}/k \Longrightarrow \text{CM}/\mathbb{Q},$$


for an arbitrary field $k$. Furthermore, CM/$\mathbb{Z}$ $\Longleftrightarrow$ CM/$\mathbb{Z}_p$ for all prime numbers
$p$. The first implication follows from the fact that constructibility implies (d − 1)-connectivity
and is inherited by links, the second implication follows from (9.15),
and the rest via the Universal Coefficient Theorem. In particular, shellable complexes
are homotopy-CM.

An important aspect of finite CM-complexes $\Delta$ is that they have an equivalent
ring-theoretic definition. Suppose that $\Delta^0 = \{x_1, x_2, \ldots, x_n\}$, and consider the ideal
$I$ in the polynomial ring $k[x_1, x_2, \ldots, x_n]$ generated by monomials $x_{i_1} x_{i_2} \cdots x_{i_r}$ such
that $\{x_{i_1}, x_{i_2}, \ldots, x_{i_r}\} \notin \Delta$, $1 \le i_1 < i_2 < \cdots < i_r \le n$, $r \ge 1$. Let $k[\Delta] = k[x_1, \ldots, x_n]/I$,
called the Stanley-Reisner ring (or face ring) of $\Delta$. Then $\Delta$ is CM/k iff the ring
$k[\Delta]$ is Cohen-Macaulay in the sense of commutative algebra (Reisner 1976). An
exposition of the ring-theoretic aspects of simplicial complexes, and their combinatorial
use, can be found in Stanley (1983a). There other ring-theoretically
motivated classes of complexes, such as Gorenstein complexes and Buchsbaum
complexes, are also discussed. Other approaches to the ring-theoretic aspects of
complexes and to Reisner's theorem can be found in Baclawski and Garsia (1981)
and Yuzvinsky (1987). See also section 5 of chapter 41 on Combinatorics in Pure
Mathematics.
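
The ideal I is generated by the monomials of all non-faces, so the monomials of the minimal non-faces already suffice as generators. The following sketch lists the minimal non-faces of a small complex by brute force; it is an illustration under our own conventions (facets as sets of vertices, hypothetical function name), not an algorithm taken from the cited literature.

```python
from itertools import combinations

def minimal_nonfaces(vertices, facets):
    """Minimal subsets of `vertices` that are not faces of the complex
    generated by `facets`. Each corresponds to a squarefree monomial
    generator of the Stanley-Reisner ideal."""
    facets = [frozenset(f) for f in facets]

    def is_face(s):
        return any(s <= f for f in facets)

    result = []
    # Subsets are visited by increasing size, so any non-face that contains
    # an already recorded minimal non-face is skipped.
    for r in range(1, len(vertices) + 1):
        for c in combinations(sorted(vertices), r):
            s = frozenset(c)
            if not is_face(s) and not any(m <= s for m in result):
                result.append(s)
    return result

# Boundary of a triangle: the only minimal non-face is {1, 2, 3},
# so the ideal would be generated by the single monomial x1*x2*x3.
print(minimal_nonfaces({1, 2, 3}, [{1, 2}, {2, 3}, {1, 3}]))
```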

Cohen-Macaulay complexes and posets were introduced around 1974-75 in the
work of Baclawski (1976, 1980), Hochster (1977), Reisner (1976) and Stanley (1975,
1977). The notion of homotopy-CM first appeared in Quillen (1978). Björner, Garsia
and Stanley (1982) give an elementary introduction to CM posets. A notable
combinatorial application of Cohen-Macaulayness is Stanley's proof of tight upper
bounds for the number of faces that can occur in each dimension for triangulations
with $n$ vertices of the d-sphere (Stanley 1975, 1983a; see also chapter 18 by Klee
and Kleinschmidt). An application to lower bounds is given in Stanley (1987a).


11.6. Define a pure d-dimensional complex $\Delta$ to be strongly connected (or dually
connected) if each pair of facets $\sigma, \tau \in \Delta^d$ can be connected by a sequence of facets
$\sigma = \sigma_0, \sigma_1, \ldots, \sigma_n = \tau$, so that $\dim(\sigma_{i-1} \cap \sigma_i) = d - 1$ for $1 \le i \le n$.
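
Strong connectivity amounts to connectivity of the graph whose vertices are the facets, with two facets adjacent when they share a (d − 1)-face. A minimal sketch of this check, assuming the facets of a pure d-dimensional complex are given as a list of vertex sets (the function name is ours):

```python
from collections import deque

def is_strongly_connected(facets, d):
    """Breadth-first search on facets, moving between facets that share d vertices."""
    facets = [frozenset(f) for f in facets]
    if not facets or any(len(f) != d + 1 for f in facets):
        return False  # empty input, or some facet is not d-dimensional
    seen = {0}
    queue = deque([0])
    while queue:
        a = queue.popleft()
        for b in range(len(facets)):
            if b not in seen and len(facets[a] & facets[b]) == d:
                seen.add(b)
                queue.append(b)
    return len(seen) == len(facets)

print(is_strongly_connected([{1, 2, 3}, {2, 3, 4}, {3, 4, 5}], 2))  # a strip of triangles
print(is_strongly_connected([{1, 2, 3}, {3, 4, 5}], 2))             # two triangles sharing a vertex
```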


Proposition 11.7. Every CM complex is pure and strongly connected. 


This follows from the following lemma, which is proved by induction on $\dim \Delta$:
Let $\Delta$ be a finite-dimensional simplicial complex, and assume that $\mathrm{lk}_\Delta(\sigma)$ is connected
for all $\sigma \in \Delta \cup \{\emptyset\}$ such that $\dim(\mathrm{lk}_\Delta(\sigma)) \ge 1$. Then $\Delta$ is pure and strongly
connected.

The property of being CM is topologically invariant: whether $\Delta$ is CM/k or not
depends only on the topology of $\|\Delta\|$. This is implied by the following reformulation
of CM-ness, due to Munkres (1984b).


Theorem 11.8. A finite-dimensional complex $\Delta$ is CM/k iff its space $T = \|\Delta\|$ satisfies:
$\tilde H_i(T, k) = H_i(T, T \setminus p, k) = 0$ for all $p \in T$ and $i < \dim \Delta$.


In this formulation $\tilde H_i$ denotes reduced singular homology and $H_i$ relative singular
homology with coefficients in $k$. A consequence of Theorem 11.8 is that
if $M$ is a triangulable manifold (with or without boundary) and $\tilde H_i(M) = 0$ for
$i < \dim M$, then every triangulation of $M$ is CM. For instance: (1) every triangulation
of the d-sphere, d-ball or $\mathbb{R}^d$ is CM/$\mathbb{Z}$, but not necessarily homotopy-CM
(beware: homotopy-CM is not topologically invariant), (2) a triangulation of real
projective d-space is CM/k iff $\mathrm{char}\, k \ne 2$.


11.9. The definition of Cohen-Macaulay posets (posets $P$ such that $\Delta(P)$ is CM)
deserves a small additional comment. Let $P$ be a poset of finite rank and $\sigma: x_0 <
x_1 < \cdots < x_k$ a chain in $P$. Then $\mathrm{lk}_{\Delta(P)}(\sigma) = P_{<x_0} * (x_0, x_1) * \cdots * (x_{k-1}, x_k) * P_{>x_k}$.
It therefore follows from (9.20) that $P$ is CM [resp. homotopy-CM] iff every open
interval $(x,y)$ in $\hat P$ is $(\mathrm{rank}(x,y) - 1)$-acyclic [resp. $(\mathrm{rank}(x,y) - 1)$-connected].

Some uses of Cohen-Macaulay posets in commutative algebra are discussed in
section 5 of chapter 41 on Combinatorics in Pure Mathematics.


11.10. An abundance of shellable and CM simplicial complexes appear in combinatorics.
Only a few important examples can be mentioned here.

(i) The boundary complex of a simplicial convex polytope is shellable (Bruggesser
and Mani 1971, Danaraj and Klee 1974; see also chapter 18 by Klee and Kleinschmidt).
Every simplicial PL sphere is the boundary of a shellable ball (Pachner
1986). There exist non-shellable triangulations of the 3-ball (M.E. Rudin) and of
the 3-sphere (see below). Shellability of spheres and balls is surveyed in Danaraj
and Klee (1978).

(ii) The following implications are valid for any simplicial sphere: constructible
$\Longrightarrow$ PL $\Longrightarrow$ homotopy-CM. The 5-sphere admits triangulations that are
non-homotopy-CM (R.D. Edwards, see Daverman 1986), and also PL triangulations
that are non-constructible (Mandel 1982). Every triangulation of the 3-sphere is
PL, but not all are shellable (Lickorish 1991, see also Vince 1985). Face lattices of
regular complex polytopes are CM (Orlik 1990).

(iii) The complex of independent sets in a matroid is constructible (Stanley
1977) and vertex-decomposable (Provan and Billera 1980). More generally, the
complex generated by the basis-complements of a greedoid is vertex-decomposable
(Björner, Korte and Lovász 1985). Complexes arising from matroids are discussed
in Björner (1992).

(iv) Every semimodular (in particular, every geometric or modular) lattice of
finite rank is CM (Folkman 1966) and shellable (Björner 1980). For any element
$x \neq \hat{0}$ in a geometric lattice L, the poset $L \setminus [x, \hat{1}]$ is shellable
(Wachs and Walker 1986).

(v) Tits buildings are CM (Solomon-Tits, see Brown 1989 or Ronan 1989) and
shellable (Björner 1984b). The topology of more general group-related geometries
has been studied by Ronan (1981), Smith (1988), Tits (1981) and others with a 
view to uses in group theory. See Buekenhout (1995) and Ronan (1989) for general 
accounts. 

(vi) The poset of elementary Abelian p-subgroups of a finite group was shown 
by Quillen (1978) to be homotopy-CM in some cases. See also Stong (1984). The 
full subgroup lattice of a finite group G is shellable (or CM) iff G is supersolvable 
(Björner 1980). Various posets of subgroups have been studied from a topological
point of view. See Thévenaz (1987), Webb (1987) and Welker (1994) for a guide 
to this literature. 


Induced subcomplexes 


Connectivity, Cohen-Macaulayness, etc., are under certain circumstances inherited
by suitable subcomplexes. For a simplicial complex $\Delta$ and $A \subseteq \Delta^0$, let
$\Delta_A = \{\sigma \in \Delta \mid \sigma \subseteq A\}$ (the induced subcomplex on A).


Lemma 11.11. Let $\Delta$ be a finite-dimensional complex, and $A \subseteq V = \Delta^0$. Assume that
$\mathrm{lk}_\Delta(\sigma)$ is k-connected for all $\sigma \in \Delta_{V \setminus A}$. Then $\Delta_A$ is k-connected iff
$\Delta$ is k-connected.


Lemma 11.12. Let P be a poset of finite rank and A a subset. Assume that $P_{>x}$ is
k-connected for all $x \in P \setminus A$. Then A is k-connected iff P is k-connected.


Proof. These lemmas are equivalent. We start with Lemma 11.12. Let $f: A \to P$
be the embedding map. For $x \in P$,

$$f^{-1}(P_{\geq x}) = \begin{cases} A_{\geq x}, & \text{if } x \in A, \\ P_{>x} \cap A, & \text{if } x \notin A. \end{cases}$$

Now, $A_{\geq x}$ is contractible (being a cone), and $P_{>x} \cap A$ is k-connected by
induction on rank(P). The result therefore follows by Theorem 10.5 (ii).
To prove Lemma 11.11, let $P = P(\Delta)$ and $Q = \{\tau \in \Delta \mid \tau \cap A \neq \emptyset\} \subseteq P$. Since
$P_{>\sigma} \cong P(\mathrm{lk}_\Delta(\sigma))$ is k-connected for all $\sigma \in P \setminus Q$, Lemma 11.12 applies. On the
other hand, by Corollary 10.12 the map $f(\tau) = \tau \cap A$ on Q induces homotopy
equivalence between Q and $f(Q) = P(\Delta_A)$. □


The homology versions of Lemmas 11.11 and 11.12, obtained by using
k-acyclicity throughout, can be proven by a parallel method. Also, if the hypothesis
“k-connected” were replaced by “contractible” in these lemmas, then the
conclusion would be that $\Delta_A$ and $\Delta$ (resp. A and P) are homotopy equivalent.


Theorem 11.13. Let $\Delta$ be a pure d-dimensional simplicial complex, $A \subseteq \Delta^0$ and
$1 \leq m \leq d$. Suppose that $\mathrm{card}(A \cap \sigma) = m$ for every facet $\sigma \in \Delta^d$. If $\Delta$ is CM/k,
homotopy-CM or shellable, then the same property is inherited by $\Delta_A$.


For CM-ness this result was proven in varying degrees of generality by Baclawski 
(1980), Munkres (1984b), Stanley (1979) and Walker (1981a). It follows easily from 
Lemma 11.11. For shellability, proofs appear in Björner (1980, 1984b).

Suppose that $\Delta$ is a pure d-dimensional simplicial complex and that there exists
a mapping $t: \Delta^0 \to \{0, 1, \ldots, d\}$ which restricts to a bijection on each facet
$\sigma \in \Delta^d$. Then $\Delta$ is called completely balanced (or numbered, or colored) with
type-map t. For instance, the order complex of a pure poset is completely balanced
with type-map t = rank [cf. (9.2)], and also building-like incidence geometries
(Buekenhout 1995) give rise to completely balanced complexes. CM complexes of
this kind were studied by Stanley (1979) and others.

For each $J \subseteq \{0, 1, \ldots, d\}$, the type-selected subcomplex $\Delta_{(J)} = \Delta_{t^{-1}(J)}$ is the
induced subcomplex on $t^{-1}(J) \subseteq \Delta^0$. Theorem 11.13 shows that if $\Delta$ is CM then
$\Delta_{(J)}$ is also CM and hence (card J - 2)-acyclic. A certain converse is also true in
the sense of the following result, which gives an alternative characterization of the
CM property for completely balanced complexes. It is due to Baclawski and
Garsia (1981) in the finite CM case, and to J. Walker (letter to the author, 1981) in
general including the homotopy case.


Theorem 11.14. Let $\Delta$ be a pure d-dimensional completely balanced complex. Then
$\Delta$ is CM/k [resp. homotopy-CM] if and only if $\Delta_{(J)}$ is (card J - 2)-acyclic over k
[resp. (card J - 2)-connected] for all $J \subseteq \{0, 1, \ldots, d\}$.


12. Cell complexes


Most classes of cell complexes differ from the simplicial case in that a purely 
combinatorial description of these objects as such cannot be given. However, the
two classes defined here, polyhedral complexes and regular CW complexes, are 
sufficiently close to the simplicial case to allow a similar combinatorial approach 
in many cases. For simplicity only finite complexes will be considered. 

Good general references for polyhedral complexes are Grünbaum (1967) and
Hudson (1969), and for cell complexes Cooke and Finney (1967) and Lundell and 
Weingram (1969). Cell complexes are also discussed in many books on algebraic 
topology such as Munkres (1984a) and Spanier (1966). 


Polyhedral complexes and PL topology 


12.1. A convex polytope $\pi$ is a bounded subset of $R^d$ which is the solution set of a
finite number of linear equalities and inequalities. Any nonempty subset obtained
by changing some of the inequalities to equalities is a face of $\pi$. Equivalently,
$\pi \subseteq R^d$ is a convex polytope iff $\pi$ is the convex hull of a finite set of points in
$R^d$. See chapter 18 by Klee and Kleinschmidt for more information about convex
polytopes.


A polyhedral complex (or convex cell complex) $\Gamma$ is a finite collection of convex
polytopes in $R^d$ such that (i) if $\pi \in \Gamma$ and $\sigma$ is a face of $\pi$ then $\sigma \in \Gamma$, and (ii)
if $\pi, \tau \in \Gamma$ and $\pi \cap \tau \neq \emptyset$ then $\pi \cap \tau$ is a face of both $\pi$ and $\tau$. The members
of $\Gamma$ are called cells. The underlying space of $\Gamma$ is $\|\Gamma\| = \bigcup \Gamma$, with the topology
induced as a subset of $R^d$. If every cell in $\Gamma$ is a simplex (the convex hull of
an affinely independent set of points) then $\Gamma$ is called a (geometric) simplicial
complex. The dimension of a cell equals the linear dimension of its affine span, and
$\dim \Gamma = \max_{\pi \in \Gamma} \dim \pi$. Further terminology, such as vertices, edges, facets, pure,
k-skeleton, face poset, face lattice, etc., is defined just as in the simplicial case, see
(9.1) and (9.3).


12.2. A polyhedral complex $\Gamma_1$ is a subdivision of another such complex $\Gamma_2$ if
$\|\Gamma_1\| = \|\Gamma_2\|$ and every cell of $\Gamma_1$ is a subset of some cell of $\Gamma_2$. The abstract
simplicial complex $\Delta(P(\Gamma))$, i.e., the order complex of $\Gamma$'s face poset, has geometric
realizations (by choosing as new vertices an interior point in each cell) that subdivide
$\Gamma$. Every polyhedral complex can be simplicially subdivided without introducing
new vertices.


Let $\Sigma^d$ denote the complex consisting of a geometric d-simplex and all its faces,
and let $\partial\Sigma^d$ denote its boundary. These complexes provide the simplest
triangulations of the d-ball and the (d - 1)-sphere, respectively. A polyhedral complex
$\Gamma$ is called a PL d-ball (or PL (d - 1)-sphere) if it admits a simplicial subdivision
whose face poset is isomorphic to the face poset of some subdivision of $\Sigma^d$
(resp. $\partial\Sigma^d$). This is equivalent to saying that there exists a homeomorphism
$\|\Gamma\| \to \|\Sigma^d\|$ (resp. $\|\Gamma\| \to \|\partial\Sigma^d\|$) which is induced by a simplicial map defined on
some subdivision (a piecewise linear, or PL, map). The boundary complex of a
convex d-polytope is a PL (d - 1)-sphere.

The PL property is mainly of technical interest. Several properties of balls and 
spheres that are desirable, and would in many cases seem intuitively “obvious”, 
hold only in the PL case. Some examples are: (1) (Newman’s Theorem) the closure 
of the complement of a PL d-ball lying in a PL d-sphere is itself a PL d-ball; (2) 
the union of two PL d-balls, whose intersection is a PL (d —1)-ball lying in the 
boundary of each, is a PL d-ball; (3) the link of any face in a PL sphere is itself 
a PL sphere (cf. remark following Theorem 11.4). All these statements would be 
false with “PL” removed. 

See Hudson (1969) for proofs and further information about PL topology. Man- 
del (1982) develops basic PL topology from a combinatorial perspective. 


Regular cell complexes 


12.3. By “cell complex” we will here understand what in topology is usually called 
a “finite CW complex”. 


Let X be a Hausdorff space. A subset $\sigma$ is called an open d-cell if there exists
a mapping $f: B^d \to X$ whose restriction to the interior of the d-ball is a
homeomorphism $f: \mathrm{Int}(B^d) \to \sigma$. The dimension $\dim \sigma = d$ is well-defined by this. The
closure $\bar\sigma$ is the corresponding closed cell. It is true that $f(B^d) = \bar\sigma$, but $\bar\sigma$ is not
necessarily homeomorphic to $B^d$. We write $\dot\sigma = \bar\sigma \setminus \sigma$.

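A cell complex $\mathcal{C}$ is a finite collection of pairwise disjoint sets together with a
Hausdorff topology on their union $\|\mathcal{C}\| = \bigcup \mathcal{C}$ such that:

(i) each $\sigma \in \mathcal{C}$ is an open cell in $\|\mathcal{C}\|$, and

(ii) $\dot\sigma \subseteq \mathcal{C}^{<\dim \sigma}$ (the union of all cells in $\mathcal{C}$ of dimension less than $\dim \sigma$), for
all $\sigma \in \mathcal{C}$.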

Then $\mathcal{C}$ is also called a cell decomposition of the space $\|\mathcal{C}\|$. Furthermore, $\mathcal{C}$
is regular if each mapping $f: B^d \to \|\mathcal{C}\|$ defining the cells can be chosen to be a
homeomorphism on all of $B^d$. Then, of course, every closed cell $\bar\sigma$ is homeomorphic
to a ball. (However, it is not enough for the definition of a regular complex to only
require that every closed cell is homeomorphic to a ball. The smallest example
showing this has three vertices, three edges and one 2-cell.)

The cell decomposition of the d-sphere into one 0-cell and one d-cell (a point
and its complement in $S^d$) is not regular. Every polyhedral complex is a regular
cell complex (the relative interiors of the convex polytopes are the open cells).
Regular cell complexes are more general than polyhedral complexes in several
ways. For instance, it is allowed that the intersection of two closed cells can have
nontrivial topological structure.


Figure 2.


12.4. From now on only regular cell complexes will be considered. Define the face
poset $P(\mathcal{C})$ as the set of all closed cells ordered by containment. The following two
particular properties make a regular complex $\mathcal{C}$ favorable from a combinatorial
point of view (see Cooke and Finney 1967 or Lundell and Weingram 1969 for
proofs):

(i) The boundary $\dot\sigma$ of each cell $\sigma \in \mathcal{C}$ is a union of cells (a subcomplex). Hence,
the situation resembles that of polyhedral complexes: each closed d-cell $\bar\sigma$ is
homeomorphic to $B^d$, and its boundary $\dot\sigma$ (homeomorphic to $S^{d-1}$) has a regular cell
decomposition provided by the cells that intersect $\dot\sigma$.

(ii) $\|\mathcal{C}\| \cong \|\Delta(P(\mathcal{C}))\|$, i.e., the order complex of $P(\mathcal{C})$ is homeomorphic to $\|\mathcal{C}\|$.
Geometrically this means that regular cell complexes admit “barycentric
subdivisions”. From a combinatorial point of view it means that regular cell complexes
can be interpreted as a class of posets without any loss of topological information.


Because of (i), regular cell complexes can be characterized in the following way: 
A family of balls (homeomorphs of $B^d$, $d \geq 0$) in a Hausdorff space X is the set of
closed cells of a regular cell complex iff the interiors of the balls partition X and 
the boundary of each ball is a union of other balls. This is what Mandel (1982) 
calls a “ball complex”. 

An important consequence of (ii) is that a d-dimensional regular cell complex
$\mathcal{C}$ can always be “realized” in $R^{2d+1}$ by a simplicial complex, so that every closed
cell in $\mathcal{C}$ is a triangulated ball (a cone over a simplicial sphere).

For a detailed discussion of regular cell complexes from a combinatorial point of
view, see section 4.7 of Björner et al. (1993). Figure 2 shows a regular cell
decomposition $\mathcal{C}$ of the 2-sphere, its face poset $P(\mathcal{C})$, and its simplicial representation
$\Delta(P(\mathcal{C}))$, where each original 2-cell is triangulated into four triangles.

12.5. Given a finite poset P, does there exist a regular cell complex (or even a
polyhedral complex) $\mathcal{C}$ such that $P \cong P(\mathcal{C})$; and if so, what is its topology and how
can $\mathcal{C}$ be constructed from P? This question is discussed in Björner (1984a) and
Mandel (1982) from different perspectives. One answer is that P is isomorphic
to the face poset of some regular cell complex iff $\Delta(P_{<x})$ is homeomorphic to a
sphere for all $x \in P$. However, since it is known that simplicial spheres cannot be
recognized algorithmically this is not a fully satisfactory answer. The question of
how to recognize the face posets of polyhedral complexes is one version of the
Steinitz problem (see chapter 18 by Klee and Kleinschmidt).


For the cellular interpretation of posets the following result, derivable from
Theorem 11.4, has proven useful in practice. See Björner (1984a) for further details.
Let us call a poset P thin if every closed interval of rank 2 has four elements (two
“in the middle”). Also, $P \cup \{\hat{0}\}$ will denote P with a new minimum element $\hat{0}$
adjoined, and $\hat{P} = P \cup \{\hat{0}, \hat{1}\}$ as usual.


Theorem 12.6. Let P be a pure finite poset of rank d. Assume that $\Delta(P)$ is
constructible.

(i) If $P \cup \{\hat{0}\}$ is thin, then $P \cong P(\mathcal{C})$ for some regular cell complex $\mathcal{C}$ homotopy
equivalent to a wedge of d-spheres.

(ii) If $\hat{P}$ is thin, then $P \cong P(\mathcal{C})$ for some regular cell decomposition of the
d-sphere.
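
The thinness hypothesis is easy to test mechanically on a finite poset. The
following Python sketch is only an illustration and not part of the cited sources;
the function name `is_thin`, the representation of a poset by a list of elements
together with a strict order predicate `less`, and the two test posets are
assumptions made here. It checks that every closed interval of rank 2 has exactly
four elements.

```python
from functools import lru_cache
from itertools import combinations


def is_thin(elements, less):
    """Return True iff every closed interval of rank 2 has exactly four elements.

    elements: hashable poset elements (hypothetical representation)
    less(a, b): True iff a < b strictly in the poset
    """
    elements = list(elements)

    @lru_cache(maxsize=None)
    def length(x, y):
        # Number of steps in a longest chain from x up to y (assumes x <= y).
        if x == y:
            return 0
        return 1 + max(length(z, y) for z in elements
                       if less(x, z) and (z == y or less(z, y)))

    for x in elements:
        for y in elements:
            if less(x, y) and length(x, y) == 2:
                interval = [z for z in elements
                            if (z == x or less(x, z)) and (z == y or less(z, y))]
                if len(interval) != 4:
                    return False
    return True


# All subsets of {0, 1, 2} ordered by inclusion (a Boolean lattice): thin.
boolean_lattice = [frozenset(c) for r in range(4)
                   for c in combinations(range(3), r)]
# A three-element chain: its only rank-2 interval has three elements, so not thin.
chain = [0, 1, 2]

print(is_thin(boolean_lattice, lambda a, b: a < b))  # True
print(is_thin(chain, lambda a, b: a < b))            # False
```

The Boolean lattice used here is the face poset of the boundary of a triangle with
a bottom and a top element adjoined, so it is the kind of poset to which part (ii)
of the theorem is meant to apply.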


13. Fixed-point and antipodality theorems 


The topological fixed-point and antipodality theorems of greatest use for combina- 
torics will be reviewed. We start by stating four equivalent versions of the oldest of 
them: Brouwer’s fixed-point theorem (from 1912). Proofs and references to origi- 
nal sources for all otherwise unreferenced material in this section can be found in 
many topology books, e.g., in Dugundji and Granas (1982). Recall that mappings 
between topological spaces are always assumed to be continuous. 


Theorem 13.1 (Brouwer’s Theorem). (i) Every mapping $f: B^d \to B^d$ has a fixed
point $x = f(x)$.

(ii) $S^{d-1}$ is not a retract of $B^d$ (i.e., no mapping $B^d \to S^{d-1}$ leaves each point of
$S^{d-1}$ fixed).

(iii) $S^{d-1}$ is not (d - 1)-connected.

(iv) $S^{d-1}$ is not contractible.


Brouwer’s Theorem is implied by the following combinatorial lemma of Sperner
(1928), see also Cohen (1967): If the vertices of a triangulation of $S^{d-1}$ are colored
with d colors, then there cannot be exactly one (d - 1)-face whose vertices use all
d colors. Sperner’s Lemma was generalized by Lovász (1980): If the vertices of a
(d - 1)-dimensional manifold are labeled by elements from some rank-d loopless
matroid, then there cannot be exactly one (d - 1)-face whose vertices form a basis
of the matroid. A further generalization and an application to hypergraphs appear
in Lindström (1981). Sperner’s Lemma is of practical use for the design of
fixed-point-finding algorithms in connection with applications of Brouwer’s Theorem,
see Todd (1976).

It is well known that Brouwer’s Theorem for d = 2 implies that there is no draw 
in the 2-person game HEX. Actually the implication goes the other way as well. 
Gale (1979) defines a d-person d-dimensional HEX game, and proves that for 
each d > 2 the Brouwer Theorem 13.1 is equivalent to the impossibility of a draw 
in d-dimensional HEX. 

We turn next to the (Hopf-)Lefschetz fixed-point theorem (from 1927-28), which 
gives a vast generalization of Theorem 13.1. Lefschetz’ Theorem and the closely 
related trace formula of Hopf will be stated in simplicial versions. 

Let $\Delta$ be a nonempty simplicial complex and $f: \|\Delta\| \to \|\Delta\|$ a continuous map.
The Lefschetz number $\Lambda(f)$ is defined by $\Lambda(f) = \sum_{i \geq 0} (-1)^i\, \mathrm{trace}(f_i^*)$, where
$f_i^*: \tilde{H}_i(\Delta, Q) \to \tilde{H}_i(\Delta, Q)$ is the induced mapping on i-dimensional reduced
homology. (We use Q-coefficients throughout here for simplicity, other fields may
of course be used instead.) Note that $f \sim g$ implies $\Lambda(f) = \Lambda(g)$ (since homotopic
maps induce identical maps on homology), in particular if f is null-homotopic
(meaning homotopic to a constant map) then $\Lambda(f) = 0$. Also, if $\Delta$ is Q-acyclic then
$\Lambda(f) = 0$ for all self-maps f.

Now, suppose that $f: \Delta \to \Delta$ is simplicial, and say that a face $\tau \in \Delta$ is fixed
if $f(\tau) = \tau$ as a set. Let $\varphi_i^+(f)$ [resp. $\varphi_i^-(f)$] be the number of fixed i-faces
whose orientation is preserved [resp. reversed]. Here we consider the orientation
of $\tau = \{x_0, x_1, \ldots, x_i\}$ to be preserved if the permutations $x_0, x_1, \ldots, x_i$ and
$f(x_0), f(x_1), \ldots, f(x_i)$ have the same parity. The following is a special case of the
Hopf trace formula:


$$\Lambda(f) + 1 = \sum_{i \geq 0} (-1)^i \left[\varphi_i^+(f) - \varphi_i^-(f)\right]. \qquad (13.2)$$

Notice that for f = id formula (13.2) specializes to the Euler-Poincaré formula 
(9.13). 
One sees from (13.2) that if f has no fixed face, then A(f) = —1. Using simplicial 
approximation and compactness the following is deduced. 


Theorem 13.3 (Lefschetz’s Theorem). If $f: \|\Delta\| \to \|\Delta\|$ is a mapping such that
$\Lambda(f) \neq -1$, then f has a fixed point.


The following two consequences of Theorem 13.3 generalize Brouwer’s Theorem 
in different directions. 


Corollary 13.4. Let T be a compact triangulable space. 
(a) Every null-homotopic self-map of T has a fixed point. 
(b) If T is Q-acyclic, then every self-map of T has a fixed point. 


The following consequence of the Hopf trace formula is useful in some
combinatorial situations. Let once more $f: \Delta \to \Delta$ be a simplicial mapping of a simplicial
complex $\Delta$. Assume that a face $\tau \in \Delta$ is fixed if and only if $\tau$ is point-wise fixed [i.e.,
$f(\tau) = \tau$ implies $f(x) = x$ for all $x \in \tau$]. One may then define the fixed subcomplex
$\Delta^f = \{\tau \in \Delta \mid f(\tau) = \tau\}$, which coincides with the induced subcomplex on the set
of fixed vertices, and (13.2) specializes to

$$\Lambda(f) = \tilde\chi(\Delta^f). \qquad (13.5)$$


One situation where this is used (see, e.g., Curtis, Lehrer and Tits 1980) is in
connection with groups acting on finite complexes, where (13.5) says that the
“Lefschetz character” has a topological interpretation as the reduced Euler
characteristic of the fixed subcomplex. Another such situation (see Baclawski and Björner
1979 and section 3 of this chapter) is when $f: P \to P$ is an order-preserving poset
map, in which case (13.5) can be rewritten $\Lambda(f) = \mu(P^f)$, the right-hand side
denoting the value of the Möbius function computed over the subposet of fixed points
augmented with a new $\hat{0}$ and $\hat{1}$ [cf. (9.14)].

The following definitions will now be needed. Let p be a prime. By a $Z_p$-space
we understand a pair $(T, \nu)$ where T is a topological space and $\nu: T \to T$ is a
fixed-point free continuous mapping of order p (i.e., $\nu^p = \mathrm{id}$). A mapping $f: T_1 \to T_2$
of $Z_p$-spaces $(T_i, \nu_i)$, i = 1, 2, is equivariant if $\nu_2 \circ f = f \circ \nu_1$. A $Z_2$-space is often
called an antipodality space. The standard example is $(S^d, \alpha)$, the d-sphere with its
antipodal map $\alpha(x) = -x$.

We state five equivalent versions of the antipodality theorem of Borsuk (1933). 


Theorem 13.6 (Borsuk’s Theorem).

(i) If $S^d$ is covered by d + 1 subsets, all closed or all open, then one of these must
contain a pair of antipodal points. (Borsuk-Liusternik-Schnirelman)

(ii) For every continuous mapping $f: S^d \to R^d$ there exists a point x such that
$f(x) = f(-x)$. (Borsuk-Ulam)

(iii) For every odd [$f(-y) = -f(y)$ for all y] continuous mapping $f: S^d \to R^d$
there exists x for which $f(x) = 0$. (Borsuk-Ulam)

(iv) There exists no equivariant map $S^n \to S^d$, if n > d.

(v) For any d-connected antipodality space T, there exists no equivariant map
$T \to S^d$.


Borsuk’s Theorem is implied by a certain combinatorial lemma of A.W. Tucker, 
much like Brouwer’s Theorem is implied by Sperner’s Lemma. See Freund and 
Todd (1981) for a statement and proof of Tucker’s Lemma and further references. 
In Theorem 13.6 (v) it suffices to assume that T is d-acyclic over $Z_2$, see Walker
(1983b).

Steinlein (1985) gives an extensive survey of generalizations, applications and
references related to Borsuk’s Theorem. Applications to combinatorics are
surveyed by Alon (1988), Bárány (1993) and Bogatyi (1986); see also sections 4 and
5 of this chapter.

The following extension of the Borsuk-Ulam Theorem appears in Yang (1955):
For every mapping $S^{dn} \to R^d$ there exist n mutually orthogonal diameters whose 2n
endpoints are mapped to the same point. The same paper also gives references to
the following related theorem of Kakutani-Yamabe-Yujobô: For every mapping
$S^n \to R$ there exist (n + 1) mutually orthogonal radii whose (n + 1) endpoints are
mapped to the same point. An interesting consequence of the last result is that every
compact convex body $K \subset R^{n+1}$ is contained in an (n + 1)-cube C such that every
maximal face of C touches K [for each $x \in S^n$ let f(x) be the minimal distance
between two parallel hyperplanes orthogonal to the vector x and containing K
between them].

Suppose $E_1$ and $E_2$ are two bounded and measurable subsets of $R^2$. Identify
$R^2$ with the affine plane $A = \{(\xi, \eta, 1)\}$ in $R^3$, and for each $x \in S^2$ let $f_i(x)$ be
the measure of that part of $E_i$ which lies on the same side as x of the plane $H_x$
through the origin orthogonal to x, for i = 1, 2. The Borsuk-Ulam Theorem implies
that $f_1(x) = f_1(-x)$ and $f_2(x) = f_2(-x)$ for some $x \in S^2$, which means that the line
$A \cap H_x$ bisects both $E_1$ and $E_2$. This “ham sandwich” argument generalizes to
arbitrary dimensions and leads to the following consequence of the Borsuk-Ulam
Theorem.


Corollary 13.7 (“Ham Sandwich Theorem”). Given d bounded and Lebesgue
measurable sets in $R^d$ there exists some affine hyperplane that simultaneously bisects
them all.


Also Corollary 13.7 has several generalizations and related results. The case
when $k \leq d$ bounded and measurable sets are given is covered by the following
result of Živaljević and Vrećica (1990): Let $\mu_1, \mu_2, \ldots, \mu_k$ be a collection of
$\sigma$-additive probability measures defined on the $\sigma$-algebra of all Borel sets in $R^d$,
$1 \leq k \leq d$. Then there exists a (k - 1)-dimensional affine subspace $A \subseteq R^d$ such that
for every closed halfspace $H \subseteq R^d$ and every i = 1, 2, ..., k, $A \subseteq H$ implies
$\mu_i(H) \geq 1/(d - k + 2)$. For k = d this specializes to a measure-theoretic version of the
Ham Sandwich Theorem (see also Hill 1988), and for k = 1 it gives a theorem of Rado
(1946) which says that for any measurable $E \subseteq R^d$ there exists a point $x \in R^d$ such
that every halfspace containing x contains at least a 1/(d + 1)-fraction of E.

We end by stating a useful generalization of the Borsuk-Ulam Theorem to
$Z_p$-spaces for p > 2. First a few definitions, see Bárány et al. (1981) for
complete details. Let p be a prime and $n \geq 1$. Take p disjoint copies of the
n(p - 1)-dimensional ball and identify their boundaries. Call this space $X_{n,p}$. There
exists a mapping $\nu: S^{n(p-1)-1} \to S^{n(p-1)-1}$ of the identified boundary which makes it
into a $Z_p$-space. Extend this mapping to $X_{n,p}$ as follows. If (y, r, q) denotes the
point of $X_{n,p}$ from the qth ball with radius r and $S^{n(p-1)-1}$-coordinate y, then put
$\nu(y, r, q) = (\nu y, r, q + 1)$, where q + 1 is reduced modulo p. This mapping $\nu$ makes
$X_{n,p}$ into a $Z_p$-space. [Note that $(X_{n,2}, \nu) \cong (S^n, \alpha)$.]


Theorem 13.8 (Bárány, Shlosman and Szűcs 1981). For every continuous mapping
$f: X_{n,p} \to R^n$ there exists a point x such that $f(x) = f(\nu x) = \cdots = f(\nu^{p-1} x)$.


Some applications of Theorem 13.8 are mentioned in sections 4 and 5. 


This is a collection of examples of the use of combinatorial techniques in practical
decision situations. The emphasis is on the description of real-world problems, the 
formulation of mathematical models, and the development of algorithms for their 
solution. We survey related models and applications. 


Introduction 


The spectacular growth of combinatorics over the past few decades is to some 
extent due to the diversity and the importance of its applications. Combinatorial 
problems occur in other branches of mathematics and in computer science, in the 
natural sciences and in the humanities, and in all kinds of practical decision 
situations. In addition, the solution of these combinatorial problems has often 
yielded significant advances and benefits. There is little doubt that its successful 
use in other fields has greatly stimulated research in the area of combinatorics. 

The sample of applications of combinatorics presented in this chapter all arise 
in situations of planning and design that are usually dealt with in operations 
research. This discipline is concerned with the investigation of models and 
methods that support decision making in practice. We do not intend to give a 
complete survey. Instead, we have tried to select some typical examples with the 
hope of conveying the flavor of the subject. Our selection process has been guided 
by the following biases and principles. 

First, it has been our purpose to show how to solve real-world problems, not 
how to construct applications of combinatorial models. With one or two excep- 
tions, each of our examples finds its origin in practice. At the same time, we have 
preferred those problems that give rise to clean and elegant models, and we have 
avoided complications that in the present context would only obscure the essence. 

Further, most of the problems we discuss occurred in the Netherlands. While
this emphasis reflects the limitations of our experience, we do not feel that it has
narrowed the scope of our examples.

Finally, we will concentrate on the design and analysis of models and 
algorithms. Collecting data, writing computer codes, building user interfaces, and 
getting solutions implemented are equally important stages in the practice of 
combinatorics. They do not belong, however, to the subject matter of this 
chapter. 

Each of the sections below follows the same outline: we describe a practical 
problem, formulate one or more mathematical models, present suitable solution 
methods, and survey related models and applications. It is assumed that the 
reader is familiar with the fundamentals of combinatorics and combinatorial 
optimization. We refer, in particular, to chapter 2 on connectivity and network 
flows, chapter 3 on matchings, chapter 4 on coloring, stable sets and perfect 
graphs, chapter 28 on optimization, and chapter 30 on polyhedral combina- 
torics. 


1. Traveling salesman


1.1. X-rays and arrays 


To demonstrate the versatility of the traveling salesman problem as a model, we
will consider two very different problem situations.

The first situation involves the sequencing of X-ray measurements in crys- 
tallography. One wishes to analyze the detailed structure of a crystal. To this end, 
the crystal is mounted in a diffractometer, and the intensity of X-rays is measured 
for a large number of positions of the crystal and the reading device inside the 
apparatus. Such an experiment may require many thousands of readings. These 
readings can be made relatively quickly, but the repositioning time between 
successive measurements is substantial. The readings can be taken in any order, 
and the question is how one should sequence them so as to minimize the time to 
complete the experiment. 

Bland and Shallcross (1989) encountered problems of this type at the Cornell 
High Energy Synchrotron Source. For experiments with up to 14 464 readings, 
they computed sequences for which the total repositioning time is within 1.7% of 
a lower bound on the optimum. The standard method used for sequencing the 
measurements produced solutions that are generally between 55% and 90% 
above the optimum. 

The second situation concerns the clustering of a data array. Given are two
finite sets R and S and a nonnegative matrix $(a_{rs})_{r \in R, s \in S}$, where $a_{rs}$ measures the
strength of the relationship between elements $r \in R$ and $s \in S$. One would like to
permute the rows and columns of the matrix so as to bring its large elements
together. The resulting clustering should identify strong relationships between
subsets of R and S.

McCormick et al. (1972) argue that clustering a matrix may be useful for
problem decomposition and data reorganization. They illustrate this with three
examples. The first one arises in airport design. R (= S) is a set of 27 facilities that
should be available at the airport and that are under the control of the designer;
$a_{rs}$ is fixed at 0, 1, 2 or 3 depending on whether facilities r and s have no, a weak,
a moderate or a strong interdependence. The permuted matrix should suggest a
decomposition of the design problem into subproblems that interact not at all or
only in a limited and well-defined way. The second example involves a set R of 53
aircraft types and a set S of 37 functions they can perform; $a_{rs} = 1$ if aircraft r is
suitable for function s, and $a_{rs} = 0$ otherwise. The rearranged matrix shows which
aircraft are able to perform the same functions and which tasks can be performed
by the same aircraft. The third example also deals with an object-attribute array.
R is a set of 24 marketing techniques, S is a set of 17 marketing applications,
$a_{rs} = 1$ if technique r has been successfully used for application s, and $a_{rs} = 0$
otherwise. Lenstra and Rinnooy Kan (1975) give a fourth example. It deals with
an input-output matrix. R (= S) is a set of 50 regions on the Indonesian islands,
$a_{rs} = 1$ if at least 50 tons of rice are annually transported from region r to region s,
and $a_{rs} = 0$ otherwise.


1.2. Model formulation


Both problems can be modeled as a traveling salesman problem. This is the
problem of a salesman who, starting from his home city, has to find the shortest
tour that takes him exactly once through each of a number of other cities and then
back home. Suppose there are n cities and $c_{ij}$ is the distance between cities i and j
(i, j = 1, ..., n). The salesman is interested in a permutation $\pi$ of {1, ..., n}
that minimizes

$$\sum_{i=1}^{n-1} c_{\pi(i)\pi(i+1)} + c_{\pi(n)\pi(1)}.$$

Here, $\pi(i)$ is the ith city visited. The traveling salesman problem is symmetric if
$c_{ij} = c_{ji}$ for all i, j.

It is straightforward to cast the sequencing problem in these terms. We identify
the readings with the cities and the repositioning time between two readings with 
the distance between the corresponding cities. We then add one more city with 
equal distances to the others, in order to transform the problem of finding an 
open sequence into that of finding a closed tour. Note that the distances are 
symmetric. 

As to the clustering problem, we first have to convert it into an optimization 
problem. McCormick et al. (1972) propose to measure the effectiveness of a 
clustering by the sum of all products of horizontally or vertically adjacent 
elements. The reader can easily convince himself that higher sums of these 
products tend to correspond to better clusterings. The problem is now to permute 
the rows and columns of the matrix so as to maximize this criterion. 

Permuting the rows does not affect the horizontal adjacencies of the elements,
and permuting the columns does not affect their vertical adjacencies. The
problem therefore decomposes into two separate and similar problems, one for
the rows and one for the columns. We consider the former. The row optimization
problem is to find a permutation $\rho$ of R that maximizes

$$\sum_{r=1}^{|R|-1} \sum_{s \in S} a_{\rho(r)s}\, a_{\rho(r+1)s}.$$

Here, row $\rho(r)$ of the matrix is put in position r. This is, again, nothing but the
symmetric traveling salesman problem in disguise (Lenstra 1974). Let
R = {1, ..., |R|}, and define

$$n = |R| + 1, \qquad c_{ij} = -\sum_{s \in S} a_{is} a_{js}, \qquad c_{in} = c_{nj} = 0 \quad \text{for } i, j \in R.$$


The rows of the matrix are the cities, the additive inverses of their inner products 
are the distances, and a dummy city has been added to close the tour. 

The symmetric traveling salesman problem is conveniently formulated in terms
of undirected graphs. Consider the complete graph $K_n = (V, E)$ on n nodes,
where a weight $c_e$ is associated with each edge $e \in E$. The problem is to find a
Hamiltonian circuit or tour in $K_n$, i.e., a circuit that visits each node exactly once,
of minimum total weight. This generalizes the problem of determining whether a
given graph contains a Hamiltonian circuit, which is NP-complete.

An integer programming formulation is as follows. Let $x_e = 1$ indicate that the
salesman travels along the edge $e$. The problem is then to minimize

$$\sum_{e \in E} c_e x_e$$

subject to

$$\sum_{e \in \delta(i)} x_e = 2 \quad \text{for all } i \in V, \tag{1.1}$$

$$\sum_{e \in \delta(U)} x_e \ge 2 \quad \text{for all } U \subset V,\ U \ne \emptyset,\ U \ne V, \tag{1.2}$$

$$x_e \in \{0, 1\} \quad \text{for all } e \in E. \tag{1.3}$$

Here, $\delta(U)$ is the set of edges with exactly one end in $U$; we write $\delta(i)$ for
$\delta(\{i\})$. Without the constraints (1.2), this is the 2-matching problem; each node
has degree 2, but the selected edges do not necessarily form a single tour. The
constraints (1.2) eliminate subtours on any proper subset $U \subset V$.


1.3. Solution approaches 


Many types of approximation algorithms have been developed for the symmetric
traveling salesman problem. It is useful to distinguish between constructive
methods, which build a single tour, and iterative improvement methods, which
search the neighborhood of the current solution for a better one and continue the
search until a local optimum has been obtained. An example of a constructive
method is the nearest neighbor rule: the salesman starts in a given city and always
travels to an unvisited city that is closest to the last chosen city. One of the
earliest iterative improvement methods was proposed by Lin (1965). He defines
the $k$-exchange neighborhood of a tour as the set of all tours that can be obtained
from it by replacing any set of $k$ edges by another set of $k$ edges, and he calls a
tour that is locally optimal with respect to this neighborhood $k$-opt. The values
$k = 2$ and $k = 3$ are most often used. The champion among heuristics for the
symmetric traveling salesman problem is the variable-depth search method of Lin
and Kernighan (1973), where the value of $k$ is not specified in advance.
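To make the two classes of methods concrete, here is a minimal Python sketch of the nearest neighbor rule and of 2-exchange improvement. It is an illustration under our own assumptions, not the implementation of any of the cited authors: distances are given as a symmetric matrix `c`, cities are numbered from 0, ties are broken by city index, and an exchange is accepted as soon as it shortens the tour.

```python
def tour_length(c, tour):
    """Length of the closed tour visiting the cities in the given order."""
    return sum(c[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


def nearest_neighbor(c, start=0):
    """Constructive method: always move to the closest unvisited city."""
    tour = [start]
    unvisited = set(range(len(c))) - {start}
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: (c[last][j], j))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def two_opt(c, tour, eps=1e-12):
    """Iterative improvement: apply 2-exchanges until the tour is 2-opt."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share a node
                a, b = tour[i], tour[i + 1]
                d, e = tour[j], tour[(j + 1) % n]
                # Replace edges {a,b} and {d,e} by {a,d} and {b,e}.
                delta = c[a][d] + c[b][e] - c[a][b] - c[d][e]
                if delta < -eps:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour
```

A typical use is `two_opt(c, nearest_neighbor(c))`: the constructive method supplies the starting tour and the exchange procedure improves it to a local optimum.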

In order to enhance the computational efficiency of edge exchange procedures
on large problem instances, one usually replaces $K_n$ by a sparse subgraph. Bland
and Shallcross (1989), for example, included for each node only the ten shortest
incident edges. Their upper bounds were computed by the Lin-Kernighan
algorithm, and their lower bounds by the Held-Karp algorithm mentioned below.

Optimization algorithms for the symmetric traveling salesman problem usually
proceed by branch and bound, with lower bounds based on spanning 1-trees
combined with Lagrangean relaxation, or on fractional 2-matchings combined
with cutting planes.

A spanning 1-tree consists of a spanning tree on the node set $V \setminus \{1\}$ and two
edges incident to node 1. A minimum-weight spanning 1-tree can be found in
polynomial time. In comparison with a tour, it is still a connected graph with $n$
edges, but the requirement that all node degrees should be equal to 2 has been
relaxed. Its weight is therefore a lower bound on the length of a shortest tour.
The lower bound may be improved by calculating a penalty $d_i$ for each node $i$
(positive if the degree of $i$ is larger than 2, negative if it is smaller) and replacing
the weight $c_e$ by $c_e + d_i + d_j$ for each edge $e = \{i, j\}$. The ordering of tours
according to length is invariant under this transformation, since every tour
becomes longer by exactly $2 \sum_i d_i$, but the optimal
spanning 1-tree may change. The approach is due to Held and Karp (1970, 1971)
and signified the beginning of the use of Lagrangean relaxation in combinatorial
optimization. The penalties or Lagrangean multipliers $d_i$ are calculated by general
subgradient optimization techniques or by special multiplier adjustment schemes.
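The 1-tree bound for fixed penalties is easy to compute. The Python sketch below is our own illustration, not the Held-Karp code: it assumes a symmetric distance matrix `c` with at least three nodes, uses index 0 for the special node (node 1 in the text), finds the spanning tree on the remaining nodes by Prim's method, and subtracts twice the sum of the penalties so that the result bounds the original tour length. The choice of good penalties is not shown.

```python
def one_tree_bound(c, d=None):
    """Lower bound on the shortest tour from a minimum-weight spanning 1-tree.

    c is a symmetric distance matrix, d an optional list of node penalties.
    Index 0 plays the role of the special node.
    """
    n = len(c)
    if d is None:
        d = [0] * n

    def w(i, j):
        # Modified edge weight c_e + d_i + d_j.
        return c[i][j] + d[i] + d[j]

    # Prim's method on nodes 1..n-1, started at node 1.
    best = {j: w(1, j) for j in range(2, n)}
    total = 0
    while best:
        j = min(best, key=best.get)
        total += best.pop(j)
        for k in best:
            best[k] = min(best[k], w(j, k))

    # Add the two cheapest edges incident to the special node.
    total += sum(sorted(w(0, j) for j in range(1, n))[:2])

    # Every tour uses each node twice, so remove the penalty contribution.
    return total - 2 * sum(d)
```

With all penalties equal to zero, the function returns the plain 1-tree weight; any other penalty vector also gives a valid lower bound.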

A fractional 2-matching is a feasible solution to (1.1) and $0 \le x_e \le 1$ ($e \in E$). In
comparison with a tour, the subtour elimination constraints (1.2) and the
integrality requirements in (1.3) have been relaxed. An optimal fractional
2-matching can be obtained in polynomial time, e.g., by linear programming. The
resulting lower bound can be improved by the addition of cutting planes that
correspond to facets of the symmetric traveling salesman polytope, i.e., the
convex hull of all solutions to (1.1)-(1.3). The subtour elimination constraints
(1.2) define facets, but many more classes of facets have been identified. Given
the solution corresponding to such a lower bound, one tries to find a violated
facet and adds the corresponding constraint to the linear program. If the sequence
of linear programs does not yield a feasible solution to (1.1)-(1.3) (and hence an
optimal tour), then some form of tree search is applied. There are, in general,
two difficulties with this polyhedral approach. First, it is very unlikely that one
will ever be able to characterize all of the facets. Secondly, the so-called
separation problem of finding a facet that is violated by the solution to the linear
program is often far from trivial; usually, fast heuristics are used. However,
Padberg and Rinaldi (1987) and Grötschel and Holland (1991) have obtained
impressive computational results with this approach.

For more detailed information on branch and bound, Lagrangean relaxation, 
polyhedral techniques, and the relation between the optimization problem and 
the separation problem, see chapters 28 and 30.


1.4. Related models and applications 


As has already been observed, the problem of determining whether a given graph 
contains a Hamiltonian circuit is a special case of the traveling salesman problem. 
This, in turn, generalizes to the vehicle routing problem, which is the subject of 
section 2. 

A quite different class of routing problems emerges if one wishes to visit all of 
the edges (or arcs) of a graph rather than all of the nodes. The basic problem is
then to determine if a given graph contains an Eulerian tour, i.e., a closed walk
that traverses each edge (or arc) exactly once. This generalizes to the Chinese 
postman problem, where one has to find the shortest closed walk that traverses 
each edge (or arc) at least once. It will arise in the solution of a practical 
multi-postman problem in section 3. 

The traveling salesman problem occurs in many more practical situations than 
sequencing measurements, clustering matrices, or routing vehicles. Ratliff and 
Rosenthal (1983) describe the problem of order picking in a rectangular 
warehouse. This is a traveling salesman problem that, due to the structure of the
underlying network, can be solved in polynomial time by dynamic programming 
techniques. Other applications are discussed by Lenstra and Rinnooy Kan (1975). 

The traveling salesman problem has become the prototypical problem of 
combinatorial optimization. This is partly because its simplicity of statement and 
difficulty of solution are even more apparent than for most other problems in the 
area. In addition, many of the solution approaches that have become standard in 
combinatorial optimization were first developed and tested in the context of the 
traveling salesman problem. Our presentation of mathematical formulations, 
solution approaches and applications is only meant to be illustrative. A full 
treatment of the problem justifies a book of its own (Lawler et al. 1985).


2. Vehicle routing 


2.1. CAR 


In the period 1983-1986, the Centre for Mathematics and Computer Science 
(CWI) in Amsterdam was involved in the development of a computer system for 
vehicle routing. The resulting system is called CAR, which stands for ‘Computer 
Aided Routing’. Before we discuss the models and the algorithms that form the 
mathematical basis of CAR, we review some aspects of the practical background 
and the computer implementation in this section. The reader is referred to 
Savelsbergh (1992) for details. 

CAR has been designed for the solution of the single-depot vehicle routing 
problem with time windows. A problem situation of this type that occurred at the 
hanging garment division of a Dutch road transportation firm was our main source 
of information and motivation from practice. The situation is basically as follows. 
About fifteen vehicles are stationed at a single central depot and must serve about 
500 geographically dispersed customers. Each vehicle has a given capacity. Each
customer has a given demand and must be served within a specified time interval. 
The travel times between the locations of the depot and the customers are given. 
We have to find a collection of routes for the vehicles, each starting and finishing 
at the depot and collectively visiting all customers, while respecting the capacity 
constraints of the vehicles and the time constraints of the customers. We would 
like to minimize the total travel time. 

Problems of this type and size must be solved daily. There are several reasons
why, at the present time, it is not possible to completely automate their solution.
On the one hand, the models that arise are hard, in a well-defined sense. Better
solutions are obtained if the computer system and its user cooperate and divide
the tasks in accordance with their respective (and complementary) capabilities.
On the other hand, the problem situations are soft. Feasibility constraints can be
stretched, and optimality on one criterion will be weighed against the values of
secondary criteria. This is not the place to advocate the benefits of man-machine
interaction in complex decision situations. Suffice it to refer to Anthonisse et al.
(1988) and to mention that CAR has been designed and built as an interactive
system, which does not make decisions but only supports decision making by the
people who are in charge. The system is being used by several firms in the
Netherlands.


2.2. Model formulation 


We present an integer programming formulation of the single-depot vehicle 
routing problem. For the time being, we ignore the time windows of the 
customers. 

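The data of the problem are as follows. There are $m$ vehicles. The capacity of
vehicle $h$ is equal to $Q_h$ ($h = 1, \ldots, m$). The depot is indexed by $i = 1$ and the
customers by $i = 2, \ldots, n$; the demand of customer $i$ is equal to $q_i$ ($i = 2, \ldots, n$).
Finally, there is a matrix $(c_{ij})_{i,j=1}^{n}$ of travel times. As to the decision variables, let

$$y_{hi} = \begin{cases} 1 & \text{if vehicle } h \text{ visits customer } i, \\ 0 & \text{otherwise}, \end{cases}$$

$$x_{hij} = \begin{cases} 1 & \text{if vehicle } h \text{ visits customers } i \text{ and } j \text{ in sequence}, \\ 0 & \text{otherwise}. \end{cases}$$

The problem is now to minimize

$$\sum_{h=1}^{m} \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{hij}$$

subject to

$$\sum_{h=1}^{m} y_{hi} = \begin{cases} m & \text{for } i = 1, \\ 1 & \text{for } i = 2, \ldots, n, \end{cases} \tag{2.1}$$

$$\sum_{i=2}^{n} q_i y_{hi} \le Q_h \quad \text{for } h = 1, \ldots, m, \tag{2.2}$$

$$y_{hi} \in \{0, 1\} \quad \text{for } h = 1, \ldots, m,\ i = 1, \ldots, n, \tag{2.3}$$

$$\sum_{j=1}^{n} x_{hij} = \sum_{j=1}^{n} x_{hji} = y_{hi} \quad \text{for } h = 1, \ldots, m,\ i = 1, \ldots, n, \tag{2.4}$$

$$\sum_{i, j \in U} x_{hij} \le |U| - 1 \quad \text{for } h = 1, \ldots, m,\ U \subset \{2, \ldots, n\}, \tag{2.5}$$

$$x_{hij} \in \{0, 1\} \quad \text{for } h = 1, \ldots, m,\ i, j = 1, \ldots, n. \tag{2.6}$$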


The conditions (2.1) ensure that each customer is allocated to one vehicle and
that the depot is allocated to each vehicle. The conditions (2.2) are the vehicle
capacity constraints. The conditions (2.4) ensure that a vehicle which arrives at a
customer also leaves that customer. The conditions (2.5) are the subtour
elimination constraints, in a form that differs from (1.2).

This formulation is due to Fisher and Jaikumar (1981). They observed that it
consists of a number of interlinked subproblems, namely, a generalized assignment
problem and a collection of $m$ traveling salesman problems. The generalized
assignment problem is the problem of minimizing

$$\sum_{h=1}^{m} f_h(y_{h1}, \ldots, y_{hn})$$

subject to (2.1)-(2.3). Here, the $y_{hi}$ are the decision variables and
$f_h(y_{h1}, \ldots, y_{hn})$ is the minimum time duration of a tour through the depot and
the cluster of customers defined by $\{i \mid y_{hi} = 1\}$. These are traveling salesman
problems, i.e., $f_h(y_{h1}, \ldots, y_{hn})$ is equal to the minimum value of

$$\sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{hij}$$

subject to (2.4)-(2.6). Here, the $x_{hij}$ are the decision variables and the $y_{hi}$
prescribe the allocation of customers to vehicles.

The traveling salesman problem is NP-hard, and so is the generalized assignment
problem, even if its criterion function is linear in the $y_{hi}$. However, these
subproblems have been well studied.


2.3. Solution approaches 


Fisher and Jaikumar originally proposed to solve the single-depot vehicle routing
problem to optimality by an iterative process, which can be viewed as an
application of Benders decomposition. Replacing each $f_h(y_{h1}, \ldots, y_{hn})$ by a
lower linear support, they solve the generalized assignment problem, which
provides a lower bound on the overall solution value and a tentative clustering of
the customers into vehicles. They then solve the $m$ resulting traveling salesman
problems, which yields an upper bound on the overall solution value and a
tentative routing for each vehicle. In the second iteration, the generalized
assignment problem is solved again, with an improved lower linear support
derived from the solution obtained in the first iteration, and the process
continues. As soon as lower bound and upper bound are equal, an optimal
solution has been obtained.

This approach is notable for its conceptual value, not for its computational
efficiency. It motivated the development of an approximation algorithm. This is a
cluster first-route second approach, which essentially consists of the first iteration
of the optimization procedure. Much depends on the linearization of the functions
$f_h$ that is chosen. Fisher and Jaikumar (1981) propose to select a seed point $s(h)$
for each vehicle $h$; $s(h)$ is a customer who is centrally located in the area that is to
be covered by vehicle $h$. They compute the “extra mileage costs”
$d_{hi} = c_{1i} + c_{i s(h)} - c_{1 s(h)}$, which should approximate the routing costs incurred
if customer $i$ is served by vehicle $h$. They then replace each $f_h(y_{h1}, \ldots, y_{hn})$ by
$\sum_i d_{hi} y_{hi}$ and solve the linear generalized assignment problem. Finally, they
solve a traveling salesman problem for each collection of customers allocated to
the same vehicle.
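The extra mileage costs are simple to tabulate. In the Python sketch below, which is our own illustration, the depot has index 0 rather than 1, `c` is the travel time matrix, and `seeds[h]` is the index of the seed customer of vehicle `h`; the selection of seeds and the solution of the assignment problem are not shown.

```python
def extra_mileage(c, seeds):
    """Return d with d[h][i] = c[0][i] + c[i][s] - c[0][s], where s = seeds[h].

    This is the detour incurred by inserting location i between the depot
    (index 0) and the seed point of vehicle h.
    """
    n = len(c)
    return [[c[0][i] + c[i][s] - c[0][s] for i in range(n)] for s in seeds]
```

For locations on a line, a customer lying between the depot and a seed gets cost 0 for that vehicle, whereas a customer on the other side of the depot pays for the full detour.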

Large generalized assignment problems can be solved close to optimality 
(Fisher et al. 1986). Large traveling salesman problems are usually solved by edge 
exchange methods of the type discussed in section 1.3. 

We should draw the reader’s attention to a small but crucial problem that we 
encountered during the development of CAR. For an unconstrained traveling 
salesman problem, it takes constant time to process a single edge exchange, as 
long as the number of edges involved is bounded by a constant. In the presence of 
time windows, however, testing feasibility of the route that results from an edge 
exchange requires an amount of time that is linear in the number of cities. 
Savelsbergh (1990) developed techniques for implementing local search subject to 
time windows without an increase in overall time complexity. He extended these 
techniques to handle other side constraints such as multiple time windows per 
customer, mixed collections and deliveries, and precedence constraints. 
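The linear-time feasibility test referred to above can be sketched as follows. This is the straightforward check, not Savelsbergh's technique, and it rests on assumptions of our own: a vehicle leaves the first location at time 0, may wait if it arrives before a window opens, and spends no time at a customer.

```python
def route_is_feasible(route, travel, windows):
    """Check the time windows along a route in time linear in its length.

    route is a list of locations, travel[i][j] a travel time, and
    windows[i] = (earliest, latest) the time window of location i.
    """
    t = 0
    for prev, node in zip(route, route[1:]):
        t += travel[prev][node]
        earliest, latest = windows[node]
        if t > latest:
            return False  # arrival after the window has closed
        t = max(t, earliest)  # wait until the window opens
    return True
```

Calling this function after every candidate edge exchange is what multiplies the running time of local search by a factor proportional to the route length.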


2.4. Related models and applications 


Vehicle routing problems can be modeled and solved in many different ways. 
Surveys are given by Bodin et al. (1983) and Christofides (1985). 

Desrochers et al. (1988) review the state of the art regarding routing with time 
window constraints. Next to standard vehicle routing problems with time 
windows, which are especially relevant in the context of school bus routing, they 
discuss pickup and delivery problems with time windows, which arise in dial-a- 
ride situations. The characteristic difference between the two problem types is 
that, in the latter case, pickup and delivery of the same commodity occurs in a 
single route. The models they present are based on integer programming, 
dynamic programming, and set partitioning. 

After this discussion of node routing problems, the next section deals with arc 
routing. The salesman is replaced by the postman. 


3. Multiple postmen 


3.1. Sprinkling highways


In the winter, the highways in the Netherlands are sprinkled with salt to prevent 
them from becoming slippery. Some highways have built-in sensors, which
indicate when the road temperature drops below a certain threshold; in other
cases, the critical point in time is determined by visual inspection. Safety 
regulations require that, when the signal for preventive sprinkling is given, all 
highways in a region should be handled within a time period of 45 minutes. The 
salt sprinklers are stationed at various depots along the highways. They can carry 
an amount of salt that suffices for a period of 45 minutes. When sprinkling, they 
drive at a reduced speed. The highway system is such that a road segment may 
have to be traversed without being sprinkled. How many sprinklers are needed, 
and how should their routes be constructed? 

This is a slight simplification of a problem that was handled by ORTEC
Consultants in Gouda, in cooperation with the first author. The project resulted
in a prototype program that, on a small problem instance, reduced the number of
routes from seventeen to thirteen. Actual instances involve about 150 depots and
500 routes.


3.2. Model formulation 


Highways are one-way streets. The highway system is therefore modeled as a 
directed graph. The road junctions and the salt depots are the nodes, and the 
road segments are the arcs. The graph is strongly connected. Each arc has two 
weights, indicating the time needed to traverse the arc while sprinkling and while 
driving without sprinkling, respectively. A feasible solution is a collection of 
directed walks, one for each vehicle, such that each walk starts at one of the 
depot nodes, it is indicated for each occurrence of an arc in a walk whether it is 
sprinkled or not, each arc is sprinkled exactly once, and no walk exceeds a given 
upper bound in length. Note that a walk does not have to be closed; the time the 
vehicles need to return to their depots is irrelevant. A solution is optimal if the 
sum of the squared differences between upper bound and actual walk lengths is 
minimum. This criterion models the objective to make the walk lengths large on 
the one hand and more or less equal on the other. 

The sprinkling problem belongs to the class of arc routing problems, which was 
already mentioned in section 1.4. In designing a solution method for our problem,
we will relate it to the standard arc routing problem, the directed Chinese postman 
problem. This problem is formulated as follows: given a strongly connected 
arc-weighted directed graph, find the shortest closed directed walk that traverses 
each arc at least once. It is solvable in polynomial time, as will be indicated in 
section 3.3. 

Our problem is essentially a multi-postman problem with some complexifying
characteristics: there are several depots, to which the postmen do not have to
return, and there is a quadratic cost function. Irrespective of the objective, the
problem of deciding if two postmen stationed at one depot can do the job is
already NP-complete.


3.3. Solution approach 


We follow an approximative solution strategy. In contrast to the approach of 
section 2.3, it is a route first-cluster second algorithm. That is, we first construct a
single closed directed walk that contains all the arcs and has minimum length, and
we then break it up into smaller directed walks, one for each vehicle. The routing 
problem is nothing but the directed Chinese postman problem; the clustering 
problem will be solved by dynamic programming. 

We first discuss the routing phase. The closed walk that is to be determined
may have to traverse some of the arcs more than once. Suppose that, if it
traverses an arc $k$ times, we add $k - 1$ duplicate arcs to the graph. The closed
walk is thereby transformed into an Eulerian tour, which traverses each arc
exactly once. Recall that a directed graph is Eulerian (i.e., has an Eulerian tour) if
and only if it is strongly connected and, for each node, the indegree is equal to the
outdegree. The directed Chinese postman problem is therefore equivalent to
finding a minimum-weight collection of duplicate arcs, the addition of which
makes the indegree and outdegree of each node equal to each other.

Let $V^+$ be the set of nodes for which the indegree is larger than the outdegree;
let $d_i^+$ denote the difference for each $i \in V^+$. Let $V^-$ be the set of nodes for which
the outdegree is larger than the indegree; let $d_j^-$ denote the difference for each
$j \in V^-$. Note that $\sum_{i \in V^+} d_i^+ = \sum_{j \in V^-} d_j^-$. Further, let $c_{ij}$ be equal to the length of
the shortest path from $i \in V^+$ to $j \in V^-$. The decision variables $x_{ij}$ will indicate
the number of duplicates of the shortest path from $i \in V^+$ to $j \in V^-$ that have to
be added to the graph. The problem is to find $x_{ij}$ for which

$$\sum_{i \in V^+} \sum_{j \in V^-} c_{ij} x_{ij}$$

is minimized subject to

$$\sum_{i \in V^+} x_{ij} = d_j^- \quad \text{for all } j \in V^-,$$

$$\sum_{j \in V^-} x_{ij} = d_i^+ \quad \text{for all } i \in V^+,$$

$$x_{ij} \in \mathbb{N} \cup \{0\} \quad \text{for all } i \in V^+,\ j \in V^-.$$


This is the linear transportation problem, which can be solved in polynomial time 
(see chapter 2). 
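The reduction can be traced on a small graph. The Python sketch below is a toy illustration under our own assumptions: nodes are numbered from 0, the graph is strongly connected, shortest path lengths are computed by the Floyd-Warshall method, and the transportation problem is solved by brute force after splitting each node into unit supplies or demands. The brute force step is exponential and stands in for the polynomial-time transportation algorithms referred to in the text.

```python
from itertools import permutations


def postman_duplicates(n, arcs):
    """Minimum total length of the duplicate paths that make the graph Eulerian.

    arcs is a list of (tail, head, weight) triples on nodes 0..n-1.
    """
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    indeg, outdeg = [0] * n, [0] * n
    for u, v, w in arcs:
        dist[u][v] = min(dist[u][v], w)
        outdeg[u] += 1
        indeg[v] += 1

    # All-pairs shortest path lengths (Floyd-Warshall).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    # One copy of a node per unit of imbalance.
    surplus_in = [i for i in range(n) for _ in range(indeg[i] - outdeg[i])]
    surplus_out = [j for j in range(n) for _ in range(outdeg[j] - indeg[j])]

    # Brute-force assignment of path starts to path ends (toy sizes only).
    return min(
        sum(dist[i][j] for i, j in zip(surplus_in, perm))
        for perm in permutations(surplus_out)
    )
```

The length of the shortest postman walk is the total arc weight plus the value returned.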

In the sprinkling problem, each arc has to be sprinkled only once, so that we 
have to compute the shortest path lengths $c_{ij}$ using the arc weights that
correspond to driving without sprinkling. We solve the linear transportation 
problem and add duplicate arcs to the graph in accordance with its optimal 
solution. The resulting graph usually contains many Eulerian tours. We select one 
using a heuristic rule, which incorporates various secondary criteria that are 
beyond the scope of this discussion. 

We now turn to the clustering phase. We choose a starting point of the Eulerian
tour and list the arcs in the order in which they occur, say, $a_1, a_2, \ldots, a_m$. Each
arc $a_h$ has a weight $w_h$. If there are several occurrences of the same arc, then the
first one is assumed to be the original arc (and has the higher weight) and the
others are the duplicates (with the lower weight). We will restrict our attention to
walks that correspond to subsequences of $(a_1, a_2, \ldots, a_m)$.

Let $W$ denote the given upper bound on walk length. For each arc $a_h$
($h = 1, \ldots, m$), let $v_h$ be the shortest distance from any depot to the tail of $a_h$,
and let $I_h$ be the set of indices $i$ such that the subsequence $(a_h, \ldots, a_i)$ can be
traversed by a single vehicle, i.e., $v_h + w_h + \cdots + w_i \le W$ for all $i \in I_h$. The
minimum cost $z_h$ of an optimal partitioning of the subsequence $(a_h, \ldots, a_m)$ into
feasible walks can now be computed by a simple recursion:

$$z_{m+1} = 0,$$

$$z_h = \min_{i \in I_h} \left\{ \bigl(W - (v_h + w_h + \cdots + w_i)\bigr)^2 + z_{i+1} \right\} \quad \text{for } h = m, m-1, \ldots, 1.$$

The optimum solution has value $z_1$. Since walks starting with duplicate arcs may
be disregarded, we can restrict the computation to those indices $h$ for which $a_h$ is
an original arc, and define $z_h = z_{h+1}$ if $a_h$ is a duplicate arc.
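The recursion translates directly into a short dynamic program. The Python sketch below is an illustration with names of our own choosing: arcs are indexed from 0, `w[h]` is the weight of the arc in position `h`, `v[h]` is the distance from the nearest depot to its tail, and all weights are assumed to be nonnegative. The refinement for duplicate arcs is omitted for brevity.

```python
def partition_walks(w, v, W):
    """Split the arc sequence into feasible walks of minimum total cost.

    Returns (cost, walks), where walks is a list of (first, last) index
    pairs, or (inf, None) if no feasible partition exists.
    """
    m = len(w)
    inf = float("inf")
    z = [inf] * (m + 1)
    z[m] = 0
    last = [None] * m
    for h in range(m - 1, -1, -1):
        length = v[h]
        for i in range(h, m):
            length += w[i]
            if length > W:
                break  # longer walks starting at h are infeasible as well
            cost = (W - length) ** 2 + z[i + 1]
            if cost < z[h]:
                z[h] = cost
                last[h] = i
    if z[0] == inf:
        return inf, None

    # Recover the walks from the stored break points.
    walks, h = [], 0
    while h < m:
        walks.append((h, last[h]))
        h = last[h] + 1
    return z[0], walks
```

Each of the $m$ states examines at most $m$ break points, so the running time is quadratic in the number of arcs on the tour.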

Although the directed Chinese postman problem in the routing phase and the 
partitioning problem in the clustering phase are both solved to optimality, the 
solution obtained is only approximate. This is due to the decomposition of the 
solution process into two phases and to the heuristic choice of an Eulerian tour. 
The entire algorithm runs in polynomial time. 


3.4. Related models and applications 


The Chinese postman problem was originally formulated on undirected graphs, by 
Guan (1962). It can be solved by shortest path and matching techniques; see, 
e.g., Lawler (1976). While both the undirected and the directed case can be 
solved in polynomial time, the postman problem becomes NP-hard if it is mixed, 
windy, rural, multiple, or capacitated, and also if the postman is replaced by a 
stacker crane. Lenstra and Rinnooy Kan (1981) and Johnson and Papadimitriou 
(1985) review complexity results and approximation algorithms for these variants. 

Shortest path algorithms are discussed by Lawler (1976). They are used as 
subroutines in many other combinatorial algorithms, e.g., for the linear trans- 
portation problem and the minimum cost flow problem (see chapter 2). 

Knuth and Plass (1981) describe an interesting application of shortest paths
that arose during the development of the TeX text processing system. A
paragraph of text is to be broken into lines. Nodes correspond to feasible 
breaking points, and arcs to feasible lines. With each arc, a weight is associated 
that measures the quality of the breaks at its endpoints and of the line in between. 
The determination of these weights is not easy, but it is a typographical rather 
than a mathematical question. As the resulting directed graph is acyclic, a shortest 
path can be found by a very simple algorithm. 


4. Linear ordering 


4.1. Ranking priorities 


In 1970, a Dutch trade union was planning its policy for the future. Nine action
items were listed:
(1) increasing the retirement payments as well as the pensions for widows and
orphans;

(2) increasing the payment for loss of working hours due to frost; 

(3) increasing the holiday allowance; 

(4) increasing the pensions for widows and orphans; 

(5) introduction of capital growth sharing; 

(6) reduction of working hours; 

(7) increasing the number of days off; 

(8) education of young people at full pay; 

(9) increasing the retirement payments. 

Being a democratic organization, the trade union decided to involve its 
members. One thousand union representatives and 326 ordinary members were 
asked to rank the items in order of decreasing importance. But how does one 
aggregate 1 000 or 326 individual rankings into a single one? Anthonisse (private 
communication) proposed a simple and elegant model. 


4.2. Model formulation 


By ranking n items, an individual expresses n(n — 1)/2 preferences, one for each 
pair of items. One way to evaluate an overall ranking is by counting the total 
number of individual preferences that are consistent with it. 

Suppose $c_{ij}$ is the number of people who prefer item $i$ to $j$, for $i, j = 1, \ldots, n$.
An ordering $p$ of $\{1, \ldots, n\}$ defines a priority ranking in the sense that $i$ is
ranked higher than $j$ if $p(i) < p(j)$. The total number of preferences that are
consistent with a ranking $p$ is given by

$$\sum_{(i,j):\ p(i) < p(j)} c_{ij},$$

and the problem is to find a ranking $p$ that maximizes this number. This is the
linear ordering problem. It generalizes the feedback arc set problem (see section
4.4) and is therefore NP-hard.

A formulation in terms of 0-1 variables is easily obtained. Let $x_{ij} = 1$ indicate
that $p(i) < p(j)$. The problem is then to maximize

$$\sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij}$$

subject to

$$x_{ii} = 0 \quad \text{for } i = 1, \ldots, n, \tag{4.1}$$

$$x_{ij} + x_{ji} = 1 \quad \text{for } i, j = 1, \ldots, n,\ i < j, \tag{4.2}$$

$$x_{ij} + x_{jk} + x_{ki} \le 2 \quad \text{for } i, j, k = 1, \ldots, n,\ i < j < k \text{ or } i < k < j, \tag{4.3}$$

$$x_{ij} \in \{0, 1\} \quad \text{for } i, j = 1, \ldots, n. \tag{4.4}$$

The conditions (4.3) represent the transitivity of $p$: if we rank $i$ above $j$ and $j$
above $k$ ($x_{ij} = x_{jk} = 1$), then we rank $i$ above $k$ ($x_{ki} = 0$, so $x_{ik} = 1$).
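For a handful of items the criterion can be evaluated by complete enumeration. The Python sketch below is a toy illustration of the model, not one of the algorithms discussed in the next subsection; items are numbered from 0, `c[i][j]` is the number of people who prefer item `i` to item `j`, and a ranking is written as a tuple of items from highest to lowest priority.

```python
from itertools import permutations


def consistent_preferences(c, ranking):
    """Number of individual preferences that agree with the ranking."""
    n = len(ranking)
    return sum(
        c[ranking[a]][ranking[b]] for a in range(n) for b in range(a + 1, n)
    )


def best_ranking(c):
    """Enumerate all n! rankings and return one that maximizes the criterion."""
    return max(
        permutations(range(len(c))),
        key=lambda ranking: consistent_preferences(c, ranking),
    )
```

The enumeration is only practical for very small $n$, which is consistent with the NP-hardness of the problem.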


4.3. Solution approaches 


For the trade union’s problem, Anthonisse replaced the conditions $x_{ij} \in \{0, 1\}$ in
(4.4) by $x_{ij} \ge 0$, solved the resulting linear programming problem, and obtained an integral
solution. It has been observed more often that, for this problem type, the linear 
programming relaxation gives an optimal solution to the integer program or at 
least an excellent upper bound. 

The best available algorithm for the linear ordering problem uses polyhedral 
techniques, very much in the spirit of the polyhedral approach to the symmetric 
traveling salesman problem (see section 1.3). The constraints (4.3) define facets 
of the linear ordering polytope, but, again, more classes of facets have been 
identified. Reinelt (1985) describes this approach in detail. He also reviews earlier 
optimization and approximation algorithms for the linear ordering problem. 


4.4. Related models and applications 


Another application of the linear ordering problem is the triangulation of
input-output matrices. Here, there are m industry sectors and $c_{ij}$ denotes the supply
from sector $i$ to sector $j$. The sectors have to be ordered “from raw material to
consumer”.

The linear ordering problem is equivalent to the acyclic subgraph problem:
given a directed graph $G = (V, A)$ with weights associated with the arcs, find an
acyclic directed graph $G' = (V, A')$ with $A' \subset A$ such that the sum of the weights
of the arcs in $A'$ is maximum. If all arc weights are equal, the problem reduces to
the feedback arc set problem: given a directed graph $G = (V, A)$, find a
minimum-cardinality set of arcs that intersects each directed circuit in $G$. We leave it to the
reader to sort out the details and refer to Jünger (1985) and Reinelt (1985) for
further information on models, algorithms and applications.


5. Clique partitioning 


5.1. Distinguishing types of professions 


In 1969, the Interfaculty of Actuarial Sciences and Econometrics of the University 
of Amsterdam wished to revise the curriculum in econometrics. “Econometrics” 
is used here in a broad sense and includes mathematical economics, empirical 
econometrics, statistics, and operations research. A committee was installed to 
investigate what professional econometricians in practice would demand of a 
curriculum. 

The committee interviewed 45 econometricians employed in government, 
industry and consultancy. Each of them was given a list of 24 problem situations 
and activities and a list of 24 methods and techniques, and was asked to indicate 
which problems he had been working on during the last two years and which
techniques he had applied. The committee then first analyzed the lists of problems
to determine which types of professionals could be distinguished. Secondly, the 
lists of techniques were used to design a curriculum for each type. It should be 
noted that no one intended to directly implement the results. There are many 
reasons why an actual university curriculum could differ from what present 
practice views as desirable. 

The first problem was modeled and solved by techniques from combinatorial 
optimization; details will be given in sections 5.2 and 5.3. The result was a clear 
distinction between two groups: the “measurers” (proper econometricians) and 
the “regulators” (operations researchers). The second problem was a statistical 
exercise, which need not concern us here. A full account is given by Cramer et al. 
(1970). 


5.2. Model formulation 


Given are a set $V$ of persons and a set $P$ of problem situations. For each person
$i \in V$ there is a set $P_i \subset P$ of problems on which $i$ has worked. For each problem
$p \in P$, there is a set $V_p \subset V$ of persons who worked on $p$, with $V_p = \{i : p \in P_i\}$. We
will write $V = \{1, \ldots, n\}$, $\bar{P}_i = P \setminus P_i$ ($i \in V$), and $\bar{V}_p = V \setminus V_p$ ($p \in P$).

We wish to partition $V$ into a number of mutually disjoint groups in such a way
that two persons $i$ and $j$ are allocated to the same group if $P_i$ and $P_j$ are similar
and to different groups if $P_i$ and $P_j$ are dissimilar. In other words, the partition $X$
of $V$ we look for should be a reasonable aggregation of the given partitions
$(V_p, \bar{V}_p)$ ($p \in P$). In the case that $|P| = 1$, we could take $X = (V_p, \bar{V}_p)$, and there
would be complete agreement between input and output. In the general case, we
choose to minimize the sum, over all problems $p \in P$, of the number of
disagreements between $X$ and $(V_p, \bar{V}_p)$.

The data can be represented by numbers $y_{ijp}$ ($1 \le i < j \le n$, $p \in P$) such that 
$y_{ijp} = 1$ if i and j are in agreement regarding problem p, i.e., 
$p \in (P_i \cap P_j) \cup (\bar{P}_i \cap \bar{P}_j)$, and $y_{ijp} = 0$ otherwise. Similarly, X can be 
described by decision variables $x_{ij}$ ($1 \le i < j \le n$) such that $x_{ij} = 1$ if i and j 
are allocated to the same group and $x_{ij} = 0$ otherwise. Since $z^2 = z$ for any 
$z \in \{0, 1\}$, the total number of disagreements can now be written as 


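$$\sum_{p \in P} \sum_{1 \le i < j \le n} (x_{ij} - y_{ijp})^2
  = \sum_{p \in P} \sum_{i<j} (1 - 2y_{ijp})\, x_{ij} + \sum_{p \in P} \sum_{i<j} y_{ijp}
  = \sum_{i<j} c_{ij} x_{ij} + C,$$

where 

$$c_{ij} = \sum_{p \in P} (1 - 2y_{ijp}) = |P| - 2\,(|P_i \cap P_j| + |\bar{P}_i \cap \bar{P}_j|),$$

$$C = \sum_{p \in P} \sum_{i<j} y_{ijp} = \sum_{i<j} (|P_i \cap P_j| + |\bar{P}_i \cap \bar{P}_j|).$$

The constant term C can be dropped. The problem is then to minimize 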


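$$\sum_{1 \le i < j \le n} c_{ij} x_{ij}$$

subject to the condition that the $x_{ij}$ define a partition of V: 

$$\begin{aligned}
 x_{ij} + x_{jk} - x_{ik} &\le 1 && \text{for } i, j, k = 1, \ldots, n,\ i < j < k, \\
 x_{ij} - x_{jk} + x_{ik} &\le 1 && \text{for } i, j, k = 1, \ldots, n,\ i < j < k, \\
-x_{ij} + x_{jk} + x_{ik} &\le 1 && \text{for } i, j, k = 1, \ldots, n,\ i < j < k, \\
 x_{ij} &\in \{0, 1\} && \text{for } i, j = 1, \ldots, n,\ i < j.
\end{aligned}$$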


Note that the problem has a trivial solution if all $c_{ij}$ have the same sign. 
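
To make the formulation concrete, the following Python sketch computes the weights 
$c_{ij}$ from the sets $P_i$ and finds a best partition by brute-force enumeration. It is 
only an illustration for tiny instances, not the method used in the study; the 
function names and the 0-based numbering of persons are assumptions made here. 

```python
from itertools import combinations


def agreement_weights(problem_sets, all_problems):
    """c[i, j] = |P| - 2 * (|P_i & P_j| + |complement(P_i) & complement(P_j)|), i < j."""
    c = {}
    for i, j in combinations(range(len(problem_sets)), 2):
        both = len(problem_sets[i] & problem_sets[j])
        neither = len(all_problems - (problem_sets[i] | problem_sets[j]))
        c[i, j] = len(all_problems) - 2 * (both + neither)
    return c


def set_partitions(items):
    """Yield every partition of the list `items` as a list of groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn ...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ... or into a new group of its own.
        yield [[first]] + partition


def partition_cost(partition, c):
    """Sum of c[i, j] over all pairs allocated to the same group."""
    return sum(c[min(i, j), max(i, j)]
               for group in partition
               for i, j in combinations(group, 2))


def best_partition(problem_sets, all_problems):
    """Exhaustive search; only feasible for a handful of persons."""
    c = agreement_weights(problem_sets, all_problems)
    persons = list(range(len(problem_sets)))
    return min(set_partitions(persons), key=lambda part: partition_cost(part, c))
```

For four persons with problem sets {1, 2}, {1, 2}, {3, 4}, {3, 4} out of 
P = {1, 2, 3, 4}, the weights within each like-minded pair are negative and all 
others positive, so the search groups persons 0 and 1 together and persons 2 and 3 
together. 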

The problem can also be formulated in terms of graphs. Consider the complete 
graph $K_n = (V, E)$ on n nodes with a weight $c_e \in \mathbb{Z}$ for each edge $e \in E$. The 
problem is to partition V into non-overlapping subsets so as to minimize the sum 
of the weights of the edges whose ends are in the same subset. This is called the 
clique partitioning problem, and it is known to be NP-hard. 

The above discussion follows Grötschel and Wakabayashi (1989). Note that the 
number of groups in the partition is not specified in advance but computed as part 
of the solution. 

Let us briefly consider the case in which the number of groups is not free but 
fixed to, say, m. This clique m-partitioning problem can be stated as follows: given 
the complete graph $K_n = (V, E)$ on n nodes with a weight $c_e \in \mathbb{Z}$ for each edge 
$e \in E$, color each node with one of m colors so as to minimize the sum of the 
weights of the edges whose ends receive the same color. This formulation 
generalizes the graph coloring problem, where all $c_e$ are equal to 0 or 1 and we 
are interested in the existence of an m-coloring with value 0. The graph coloring 
problem is solvable in polynomial time for m = 2 and NP-complete for any $m \ge 3$. 
The clique 2-partitioning problem is also known as the max cut problem, which is 
already NP-hard in itself. 

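Carlson and Nemhauser (1966) were the first to consider the clique 
m-partitioning problem. They gave a quadratic programming formulation. Let 
$x_{hi} = 1$ indicate that node i receives color h. The problem is then to minimize 

$$\sum_{h=1}^{m} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} c_{ij}\, x_{hi}\, x_{hj}$$

subject to 

$$\begin{aligned}
\sum_{h=1}^{m} x_{hi} &= 1 && \text{for } i = 1, \ldots, n, \\
x_{hi} &\in \{0, 1\} && \text{for } h = 1, \ldots, m,\ i = 1, \ldots, n.
\end{aligned}$$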


It is not hard to see that, if $x_{hi} \in \{0, 1\}$ is replaced by $x_{hi} \ge 0$, then there exists an 
integral optimal solution. It can also be proved that an integral feasible solution 
satisfies the Karush-Kuhn-Tucker conditions if and only if it cannot be improved 
by giving any single node another color. 


5.3. Solution approaches 


Grötschel and Wakabayashi (1989) developed a cutting plane algorithm for the 
clique partitioning problem, along the same lines as the polyhedral approaches 
mentioned in sections 1.3 and 4.3. They were able to solve problem instances with 
up to 158 nodes, without ever having to resort to tree search. 

In 1969, when the problem at the University of Amsterdam occurred, a more 
heuristic approach was taken. First, for various values of m, the number of groups 
was fixed at m and a good partitioning into m groups was computed. Secondly, 
the results were analyzed and an appropriate value of m was determined. The 
weights were defined by 


$$c_{ij} = \frac{|P| - |P_i \cap P_j| - 0.9\,|\bar{P}_i \cap \bar{P}_j|}{|P_i \cap P_j| + 1}.$$


Note that a positive agreement between i and j counts more heavily than a 
negative agreement. This choice was made after some experiments with small 
problem instances. The results obtained were, however, quite robust with respect 
to small perturbations of the weights. 

None of the optimization algorithms for the clique m-partitioning problem that 
were available in 1969 could handle instances with n = 45. An iterative 
improvement procedure was developed, with k ($1 \le k \le n$) as an input parameter. At each 
step, the existing partitioning is considered and the k persons are determined 
whose individual transition to another group would result in the largest decrease 
of the criterion value. These k persons are then optimally reallocated by a branch 
and bound algorithm. If no further improvements are possible, a local optimum 
has been obtained, which could be called k-opt. As mentioned before, a solution 
is 1-opt if and only if it satisfies the Karush-Kuhn-Tucker conditions of the 
Carlson-Nemhauser formulation. For m = 2, this procedure always produced the 
same local optimum, for any starting solution and for any value of k between 1 
and 20. For m = 3, several local optima were obtained, but their criterion values 
were close together. The same was true for m = 4. 
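
The simplest case, k = 1, of such an improvement procedure can be sketched in 
Python as follows. This is a simplified illustration and not the 1969 
implementation: it moves one node at a time and omits the branch and bound 
reallocation used for larger k. The symmetric weight matrix `c`, the 0-based 
colors and the function names are assumptions made here. 

```python
def same_color_weight(colors, c):
    """Criterion value: total weight of edges whose ends share a color."""
    n = len(colors)
    return sum(c[i][j]
               for i in range(n) for j in range(i + 1, n)
               if colors[i] == colors[j])


def one_opt(colors, c, m):
    """Recolor single nodes until no single move lowers the criterion value.

    `c` is a symmetric n x n matrix (diagonal ignored), `colors[i]` lies in
    range(m). Every accepted move strictly decreases the criterion value,
    so the loop terminates.
    """
    colors = list(colors)
    n = len(colors)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            # load[h] = weight node i would contribute if it had color h
            load = [0] * m
            for j in range(n):
                if j != i:
                    load[colors[j]] += c[i][j]
            best = min(range(m), key=lambda h: load[h])
            if load[best] < load[colors[i]]:
                colors[i] = best
                improved = True
    return colors
```

The result is a 1-opt solution in the sense described above; different starting 
colorings may lead to different local optima. 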

The question is now how to determine the best value of the number m of 
groups. It is obvious that, if more groups are added, the optimal criterion value 
will decrease. Cramer et al. (1970) used a simulation experiment to estimate the 
decreases that can be solely ascribed to enlarging the number of groups. They 
generated a number of random surveys among 45 “colorless” people and 
computed locally optimal solutions for these into two, three and four groups. If 
one goes from one to two groups, then an average of 42 percent of the total 
weight remains, with a very small variance, as compared to only 30 percent in 
case of the actual survey data. If more groups are added, the decreases for the 
simulation and for the survey are about the same. It was concluded that there is a 
clear distinction between two, but no more than two, groups. 


5.4. Related models and applications 


The relation of the clique m-partitioning problem to the graph coloring problem 
and the max cut problem has already been pointed out. 

A variant of the model occurs if upper bounds on the group sizes are specified. 
Arnold et al. (1973) describe an application that arises when one is organizing a 
scientific meeting: nodes correspond to sessions, weights to conflicts between 
sessions, and colors to time slots. They used a combination of random sampling 
and local search. 

Kernighan and Lin (1970) consider the problem of partitioning V into two 
equal-size subsets so as to minimize the total weight of the edges whose ends are 
in different subsets. They propose a variable-depth search method, which is a 
precursor of their algorithm for the traveling salesman problem (see section 1.3). 

The clique partitioning problem serves as a model for a variety of clustering 
problems. Grötschel and Wakabayashi (1989) give several examples, such as the 
classification of animals with respect to morphological and behavioral characteris- 
tics, and the classification of member countries of the United Nations on the basis 
of voting behavior. All these problem situations can be captured under the 
general heading of the aggregation of binary relations, which is a central topic of 
interest in the area of qualitative data analysis. 


6. Test cover 


6.1. Recognizing diseases 


In 1979, the CWI received a request from the Department of Mathematics at the 
Agricultural University in Wageningen: “Enclosed you will find a 0-1 matrix B 
with 63 rows and 28 columns. We define a 0-1 matrix A with 63 rows and 378 
columns. Each column of A corresponds to a pair of columns of B (note that 
$378 = 28 \cdot 27/2$) and is obtained by adding those columns modulo 2. We would be 
interested in a solution to the set covering problem on $A^{\mathrm{T}}$.” 

The set covering problem is here the problem of finding a minimum number of 
rows of A such that, in each column, at least one of these rows has a 1. The 
difficulty of solving large set covering problems as well as our professional 
curiosity motivated us to transform backwards. We identified the rows of B with 
tests, the columns with items, and each entry (i, j) with the result of test i applied 
to item j. The problem is then to find a minimum number of tests such that, for 
each pair of items, at least one of these tests distinguishes the two items. This is 
the test cover problem. 

Before discussing these models and their relation in more detail, let us clarify 
the practical background. It turned out that tests and items were plant varieties 
and plant diseases, respectively. A minimum number of varieties that 
discriminates between all diseases should provide an efficient and economical setup for 
recognizing diseases. The problem arose as part of a project carried out at the 
Research Institute of Plant Protection in Wageningen in cooperation with the 
International Maize and Wheat Improvement Center (CIMMYT). We provided a 
program that incorporates the algorithm described in section 6.3. It computed an 
optimal solution of seven tests for the above instance (the best known solution 
used eight) and has been used successfully on many other instances. 


6.2. Model formulation 


Given is a finite set N and a family $\mathcal{T}$ of subsets of N; N contains the items and $\mathcal{T}$ 
is the collection of tests. A test cover is a subfamily $\mathcal{T}' \subseteq \mathcal{T}$ such that, for each 
pair $\{j, k\} \subseteq N$, there is a test $T \in \mathcal{T}'$ such that T distinguishes between j and k, 
i.e., $|T \cap \{j, k\}| = 1$. The problem is to find a test cover of minimum cardinality. 

A more familiar problem is the set covering problem: given a finite set M and a 
family $\mathcal{S}$ of subsets of M, determine a minimum-size subfamily $\mathcal{S}' \subseteq \mathcal{S}$ for which 
$\bigcup_{S \in \mathcal{S}'} S = M$. 

As suggested above, we can formulate each instance of the test cover problem 
as a set covering problem. We define an element in M for each pair of items in N, 
and we create a subset $S \in \mathcal{S}$ for each test $T \in \mathcal{T}$; the elements in S are precisely 
the pairs of items in N that are distinguished by T. It is immediate that a 
subfamily $\mathcal{S}' \subseteq \mathcal{S}$ covers M if and only if the corresponding subfamily $\mathcal{T}' \subseteq \mathcal{T}$ is a 
test cover. 
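
The transformation can be sketched in a few lines of Python. This is a minimal 
illustration of the construction just described; representing tests as Python 
sets and the function names used are assumptions made here. 

```python
from itertools import combinations


def distinguished_pairs(test, items):
    """Pairs {j, k} of items with exactly one member in `test`."""
    return {frozenset((j, k))
            for j, k in combinations(items, 2)
            if (j in test) != (k in test)}


def to_set_cover(items, tests):
    """Return (M, S): one element per item pair, one subset per test."""
    M = {frozenset(pair) for pair in combinations(items, 2)}
    S = [distinguished_pairs(T, items) for T in tests]
    return M, S


def is_test_cover(items, tests, chosen):
    """`chosen` lists indices into `tests`; check that the subsets cover M."""
    M, S = to_set_cover(items, tests)
    covered = set().union(*(S[t] for t in chosen))
    return covered == M
```

Note the quadratic growth: n items give n(n - 1)/2 elements in M, which is 
how 28 items lead to 378 pairs in the instance of section 6.1. 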

Through a relation with an optimization problem on graphs, which we will 
briefly discuss in section 6.4, it turns out that the test cover problem is already 
NP-hard if $|T| \le 2$ for all $T \in \mathcal{T}$. 


6.3. Solution approaches 


Any algorithm for set covering can be used for the test cover problem. However, 
the quadratic blowup is not encouraging, and it even appears that the resulting set 
covering instances are particularly hard ones. Although all NP-hard combinatorial 
optimization problems are polynomially equivalent, it generally pays off to 
develop an algorithm that is specific to the problem at hand. 

The problem instance described in the introduction was solved to optimality by 
a straightforward combination of approximation and branch and bound. In the 
first phase, a greedy algorithm constructs a reasonable solution and a local search 
algorithm tries to improve on it. The greedy algorithm selects, at each step, the 
test that distinguishes the greatest number of pairs that are not yet distinguished 
by the tests selected so far. Iterative improvement then produces a good solution, 
the value of which serves as an initial upper bound for the branch and bound 
process. In this second phase, the tests $T \in \mathcal{T}$ are put in non-increasing order of 
“distinguishing power” $\min\{|T|, |N - T|\}$ and the subfamilies of $\mathcal{T}$ are 
enumerated in lexicographic order. Subfamilies are eliminated by a lower bound which is 
based on the simple observation that, for distinguishing n items, at least $\lceil \log_2 n \rceil$ 
tests are needed. 

One of the first papers on the test cover problem is by Moret and Shapiro 
(1985). They investigate the worst-case behavior of approximation algorithms and 
the derivation of lower bounds. A negative result is that the greedy algorithm can 
perform as badly as for the general set covering problem and produce solutions 
that are off by a factor of $\Theta(\log |N|)$. The above project has inspired further work 
on approximation and optimization algorithms for the test cover problem. 
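
The greedy step of the first phase and the logarithmic lower bound can be 
sketched as follows. This is an illustrative Python version under stated 
assumptions, not the program delivered in the project: tests are given as 
Python sets, ties are broken in favor of the lowest index, and the local search 
and branch and bound phases are omitted. 

```python
from itertools import combinations


def greedy_test_cover(items, tests):
    """Repeatedly pick the test that distinguishes most still-open pairs."""
    undistinguished = {frozenset(pair) for pair in combinations(items, 2)}
    chosen = []
    while undistinguished:
        def gain(t):
            return sum(1 for pair in undistinguished if len(pair & tests[t]) == 1)

        best = max(range(len(tests)), key=gain)
        if gain(best) == 0:
            raise ValueError("the given tests cannot distinguish all pairs")
        chosen.append(best)
        undistinguished = {pair for pair in undistinguished
                           if len(pair & tests[best]) != 1}
    return chosen


def log_lower_bound(num_items):
    """ceil(log2(num_items)) for num_items >= 1, computed with integers."""
    return (num_items - 1).bit_length()
```

For the 28 items of section 6.1 the bound evaluates to 5, against the optimal 
value of seven tests reported there. 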


6.4. Related models and applications 


An interesting special case of the test cover problem occurs if all tests have 
cardinality 2. In terms of a graph G with node set N and edge set $\mathcal{T}$, a test cover 
is an edge subset $\mathcal{T}' \subseteq \mathcal{T}$ such that no two nodes in N have the same incidence 
relations with respect to $\mathcal{T}'$. This turns out to be equivalent to the requirement 
that the subgraph $G' = (N, \mathcal{T}')$ has no isolated edges and at most one isolated 
node. One can establish a strong relation between this problem and the problem 
of finding a maximum number of node-disjoint paths of length 2 in a graph. This 
implies NP-hardness of the restricted test cover problem and has interesting 
consequences for the worst-case behavior of approximation algorithms. 

Moret and Shapiro (1985) mention applications of the test cover problem in 
fault testing and diagnosis, pattern recognition, and biological identification. 


7. Bottleneck extrema 


7.1. Locating obnoxious facilities 


In a regional development plan, a number of sites have been identified at which 
residential quarters as well as industrial areas will be located. The industrial areas 
will accommodate obnoxious facilities. The problem is how to select sites for the 
industrial areas so as to minimize the inconvenience caused to the residential 
quarters. 

Our presentation follows Hsu and Nemhauser (1979). We did not encounter the 
problem in practice, but we decided to include it here because it nicely illustrates 
the elegant theory of bottleneck extrema. 


7.2. Model formulation 


Let us assume that there are n sites, k of which will be industrial areas. We 
construct a graph G = (V, E) with $V = \{1, \ldots, n\}$ and $\{i, j\} \in E$ if and only if 
sites i and j influence each other, i.e., if an industrial area at one of the sites 
would be a nuisance to a residential quarter at the other site. With each edge 
$e = \{i, j\} \in E$, a weight $d_e$ is associated, representing the distance between sites i 
and j. 

Recall that, for all $U \subset V$ with $U \ne \emptyset$ and $U \ne V$, $\delta(U)$ is the set of edges with 
exactly one end in U. The set $\delta(U)$ is called a cut; its removal from G disconnects 
U and $V \setminus U$. If |U| = k, then $\delta(U)$ will be called a k-cut. 


One way of minimizing annoyance is to maximize the minimum distance 
between any residential quarter and its closest industrial area. The problem is 
then to find a k-cut whose minimum edge weight is maximum: 


$$\max_{U \subset V,\ |U| = k}\ \min_{e \in \delta(U)} d_e. \qquad (7.1)$$


We present a polynomial-time algorithm for this problem in the next section. 


7.3. Solution approach 


The solution of (7.1) is based on the theory of bottleneck extrema developed by 
Edmonds and Fulkerson (1970). We outline some of their results below. 

Let S be a finite set. A clutter $\mathcal{C}$ on S is a collection of subsets of S such that no 
set in $\mathcal{C}$ includes any other set in $\mathcal{C}$. The blocker $\mathcal{B}$ of $\mathcal{C}$ is the collection of 
subsets of S that intersect all sets in $\mathcal{C}$ and are minimal under inclusion. Note 
that $\mathcal{B}$ is again a clutter. 

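Edmonds and Fulkerson proved two duality results. First, $\mathcal{C}$ is the blocker of 
its blocker $\mathcal{B}$. Secondly, for any real-valued function f on S, 


$$\max_{C \in \mathcal{C}}\ \min_{e \in C} f_e = \min_{B \in \mathcal{B}}\ \max_{e \in B} f_e. \qquad (7.2)$$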


They also presented a simple threshold algorithm for the max-min problem and 
an analogous dual threshold method for the min-max problem. The choice 
between these two algorithms depends on the relative efficiency of recognizing 
members of $\mathcal{C}$ and $\mathcal{B}$. We will describe the dual method. 

Suppose that the elements of S are indexed so that $f_1 \le f_2 \le \cdots \le f_{|S|}$. Suppose 
further that $e^* \in S$ is such that $\{1, \ldots, e^* - 1\}$ does not include a member of $\mathcal{B}$ 
but that $\{1, \ldots, e^*\} \supseteq B^*$ for some $B^* \in \mathcal{B}$. Then $B^*$ solves the min-max 
problem. By bisection search over S, $e^*$ can be found in $O(\log |S|)$ iterations, 
where each iteration tests for inclusion of a member of $\mathcal{B}$. 

We now return to problem (7.1). It suffices to consider only minimal k-cuts: if 
$E' \subseteq E''$, then $\min_{e \in E'} d_e \ge \min_{e \in E''} d_e$. We let S correspond to the edge set E, f 
to the weight function d, and $\mathcal{C}$ to the collection of minimal k-cuts; note that $\mathcal{C}$ is 
a clutter. Problem (7.1) can now be stated as $\max_{C \in \mathcal{C}} \min_{e \in C} f_e$. 

Rather than solving the recognition problem for members of $\mathcal{C}$, we will 
characterize its blocker $\mathcal{B}$ and give an $O(n^2)$ membership test for $\mathcal{B}$. We thus 
obtain an $O(n^2 \log n)$ algorithm for computing $\min_{B \in \mathcal{B}} \max_{e \in B} f_e$ and, by (7.2), 
for solving (7.1). 

The crucial observation to make is that a subset $B \subseteq E$ intersects all k-cuts if 
and only if its complement $E \setminus B$ contains no k-cut. It immediately follows that $\mathcal{B}$ 
contains the minimal $B \subseteq E$ such that no collection of connected components of 
the graph $G_B = (V, B)$ contains exactly k nodes. 

As to the membership test, consider some $E' \subseteq E$. Suppose $G_{E'} = (V, E')$ has m 
components and let component h have $a_h$ nodes, for h = 1, ..., m. Then $E' \supseteq B$ 
for some $B \in \mathcal{B}$ if and only if 


$$\sum_{h=1}^{m} a_h x_h = k, \qquad x_h \in \{0, 1\} \ \text{ for } h = 1, \ldots, m$$


has no solution. This can be tested by dynamic programming in $O(km) = O(n^2)$ 
time. 
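
The pieces of this section (sorting the edges, bisection, component sizes and the 
subset-sum test) can be combined in a short Python sketch. It is an illustration 
under assumptions made here: nodes are numbered from 0, edges are given as 
triples `(u, v, d)`, the graph is connected, and $1 \le k < n$. The subset-sum test 
uses an integer as a bit set instead of an explicit dynamic programming table. 

```python
def component_sizes(n, edges):
    """Sizes of the connected components of (V, edges), via union-find."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v, _ in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return list(sizes.values())


def has_subset_sum(sizes, k):
    """True if some sub-collection of `sizes` adds up to exactly k."""
    reachable = 1  # bit s is set when the sum s is attainable
    mask = (1 << (k + 1)) - 1
    for a in sizes:
        reachable |= (reachable << a) & mask
    return bool((reachable >> k) & 1)


def max_min_k_cut(n, weighted_edges, k):
    """Value of (7.1), computed through the dual threshold method."""
    edges = sorted(weighted_edges, key=lambda e: e[2])

    def contains_blocker_member(t):
        # Do the t lightest edges include a member of the blocker?
        return not has_subset_sum(component_sizes(n, edges[:t]), k)

    if not contains_blocker_member(len(edges)):
        raise ValueError("some k-cut is empty; the graph must be connected")
    lo, hi = 1, len(edges)  # smallest t for which the test succeeds
    while lo < hi:
        mid = (lo + hi) // 2
        if contains_blocker_member(mid):
            hi = mid
        else:
            lo = mid + 1
    return edges[lo - 1][2]
```

On the path 0-1-2-3 with distances 5, 1 and 3 and k = 2, the 2-cuts 
$\delta(\{0, 3\})$ and $\delta(\{1, 2\})$ both consist of the edges of weight 5 and 3, and every other 
2-cut contains the edge of weight 1, so the optimal value is 3. 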


7.4. Related models 


Discrete location theory is one of the cornerstones of combinatorial operations 
research. Its basic problem types are the following. In the uncapacitated plant 
location problem, one has to find a set of locations such that the sum of the setup 
costs for facilities and the transportation costs between customers and facilities is 
minimized. In the k-median problem, one has to locate k facilities so as to 
minimize the sum of the distances between each customer and its closest facility, 
while in the k-center problem, the maximum of these distances is to be minimized. 

Mirchandani and Francis (1990) collected a number of survey articles on 
discrete location theory. 


8. Minimum cost flow 


8.1. 2-dimensional proportional representation 


Gooi en Vechtstreek is a region in the Netherlands, just east of Amsterdam. There 
is a regional council, the members of which are appointed by and from the 
participating local councils. The composition of the regional council should be a 
fair representation of the local interests as well as of the political views in the 
region. 

Each local council has an odd number of members. It is prescribed that a 
quarter of them, rounded to the nearest integer, is appointed to the regional 
council. This should take care of a proportional representation of local interests. 
However, if the allocation of seats to political parties is left completely to the 
local councils, then there is an obvious danger that overall disproportionalities in 
the representation of political views will occur. Coordination is required. 

It has been agreed that the chairman of the regional cooperation uses the 
outcome of the local elections to determine the number of seats to be allocated to 
each party from each local council. The result of his reflections has the status of 
advice to the local councils. In current practice, he applies the method 
proposed by Anthonisse (1984) in arriving at his advice. This method is described 
below. 


8.2. Model formulation 


We first briefly consider the question of 1-dimensional proportional representa- 
tion. This problem has attracted a lot of interest, in politics as well as in 
mathematics. It can be formulated as follows. 

Suppose there are P parties, V votes, and S seats. Party p has received $v_p$ votes, 
with $\sum_{p=1}^{P} v_p = V$. Ideally, party p should receive $s_p = S v_p / V$ seats. However, the 
$s_p$ are generally non-integral and have to be rounded. Let a bivariate function f 
be given, where $f(s_p, x_p)$ measures the distance between the ideal number of seats 
$s_p$ and the actual allotment $x_p$. The problem is then to find $x_1, \ldots, x_P$ such that 


$$\sum_{p=1}^{P} x_p = S, \qquad (8.1)$$


$$x_p \in \mathbb{N} \cup \{0\} \quad \text{for } p = 1, \ldots, P.$$


Various methods for solving this problem have been proposed. Hamilton’s 
method of the greatest remainders corresponds to $f(s_p, x_p) = |x_p - s_p|$, Jefferson’s 
method of the greatest divisors to $f(s_p, x_p) = (x_p - (s_p - \tfrac{1}{2}))^2 / s_p$, and Webster’s 
method to $f(s_p, x_p) = (x_p - s_p)^2 / s_p$. For a historical and mathematical overview of 
1-dimensional proportional representation, we refer to Balinski and Young 
(1982). They list a number of properties a perfect method of apportionment 
should satisfy and show that no such method exists. They argue convincingly that 
Webster’s method comes closest towards “meeting the ideal of one man, one 
vote”. 
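
As a small numerical illustration, the Python sketch below implements Hamilton’s 
greatest-remainders rule directly and, for comparison, a brute-force search over 
all allotments. The brute-force search assumes that the distances are aggregated 
as the sum of $f(s_p, x_p)$ over all parties; this aggregation, the tie-breaking by 
party index and the function names are assumptions made here. Exact fractions 
avoid rounding problems. 

```python
from fractions import Fraction


def quotas(votes, seats):
    """Ideal seat numbers s_p = S * v_p / V as exact fractions."""
    total = sum(votes)
    return [Fraction(seats * v, total) for v in votes]


def hamilton(votes, seats):
    """Round the quotas down, then hand out the rest by greatest remainder."""
    s = quotas(votes, seats)
    x = [seats * v // sum(votes) for v in votes]
    by_remainder = sorted(range(len(votes)), key=lambda p: s[p] - x[p], reverse=True)
    for p in by_remainder[:seats - sum(x)]:
        x[p] += 1
    return x


def compositions(total, parts):
    """All lists of `parts` nonnegative integers that sum to `total`."""
    if parts == 1:
        yield [total]
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield [first] + rest


def best_allotment(votes, seats, f):
    """Exhaustive minimization of sum_p f(s_p, x_p); tiny instances only."""
    s = quotas(votes, seats)
    return min(compositions(seats, len(votes)),
               key=lambda x: sum(f(sp, xp) for sp, xp in zip(s, x)))
```

With votes (53, 24, 23) and 7 seats, the quotas are 3.71, 1.68 and 1.61. 
Hamilton’s rule gives (4, 2, 1), which is also what the search returns for 
$f = |x_p - s_p|$, whereas the search with Webster’s distance returns (3, 2, 2). 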

As to the problem of 2-dimensional proportional representation, suppose there 
are P parties and M municipalities. In the local council of municipality m, party 
p has $v_{mp}$ seats. The size of council m is $V_{m\cdot} = \sum_{p=1}^{P} v_{mp}$, the regional strength of 
party p is $V_{\cdot p} = \sum_{m=1}^{M} v_{mp}$, and $V = \sum_{m=1}^{M} \sum_{p=1}^{P} v_{mp}$. Let $w_m = \lfloor (V_{m\cdot} + 1)/4 \rfloor$ 
denote the number of members of council m that are to be appointed in the 
regional council, and let $S = \sum_{m=1}^{M} w_m$ be the total number of seats in the regional 
council. Ideally, party p should receive $s_p = S V_{\cdot p}/V$ regional seats, $t_{mp} = S v_{mp}/V$ of 
which should come from municipality m. The decision variables are $x_p$, the actual 
allotment to party p, and $y_{mp}$, the number of members of party p to be appointed 
from the council of municipality m. In the formulations below, f can be any 
convex bivariate distance function. 
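
The derived quantities are easy to compute from the matrix of local seats. The 
Python sketch below does so with exact fractions; the nested-list layout 
`v[m][p]`, the 0-based indices and the function name are assumptions made here. 

```python
from fractions import Fraction


def regional_quotas(v):
    """v[m][p] = seats of party p in the local council of municipality m.

    Returns (w, S, s, t): delegates per council, regional council size,
    ideal seats per party, and ideal seats per (municipality, party) pair.
    """
    M, P = len(v), len(v[0])
    council_size = [sum(row) for row in v]
    party_strength = [sum(v[m][p] for m in range(M)) for p in range(P)]
    V = sum(council_size)
    # A quarter of an odd council size, rounded to the nearest integer.
    w = [(size + 1) // 4 for size in council_size]
    S = sum(w)
    s = [Fraction(S * strength, V) for strength in party_strength]
    t = [[Fraction(S * v[m][p], V) for p in range(P)] for m in range(M)]
    return w, S, s, t
```

For two councils with 5 and 9 members, split as (3, 2) and (5, 4) over two 
parties, this gives w = (1, 2), S = 3, and ideal regional seats 12/7 and 9/7. 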

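The problem is solved in two stages. First, we find the $x_p$ such that 


$$\sum_{p=1}^{P} f(s_p, x_p)$$

is minimized subject to 


$$\begin{aligned}
\sum_{p=1}^{P} y_{mp} &= w_m && \text{for } m = 1, \ldots, M, && (8.2) \\
\sum_{m=1}^{M} y_{mp} &= x_p && \text{for } p = 1, \ldots, P, && (8.3) \\
y_{mp} &\le v_{mp} && \text{for } m = 1, \ldots, M,\ p = 1, \ldots, P, && (8.4) \\
y_{mp} &\in \mathbb{N} \cup \{0\} && \text{for } m = 1, \ldots, M,\ p = 1, \ldots, P, \\
x_p &\in \mathbb{N} \cup \{0\} && \text{for } p = 1, \ldots, P. && (8.5)
\end{aligned}$$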


Secondly, given the $x_p$, we find the $y_{mp}$ such that 


$$\sum_{m=1}^{M} \sum_{p=1}^{P} f(t_{mp}, y_{mp})$$


is minimized subject to (8.2)-(8.5). 

The first problem obviously has a feasible solution: any sample of $w_m$ from the 
$V_{m\cdot}$ members satisfies (8.2), (8.4) and (8.5), and (8.3) then defines the $x_p$. Given 
these $x_p$, the second problem is also feasible, as the constraints remain the same. 
The reader may wonder why we have not simplified the first stage by relaxing 
(8.2)-(8.5) into (8.1). The reason is that this may yield an infeasible second 
stage. 

Both problems can be modeled in terms of minimum cost flows (see chapter 2) 
and as such be solved in polynomial time. At the first stage, we define a network 
with a source $\alpha$, a node m for each municipality, a node p for each party, and a 
sink $\omega$. There are arcs $(\alpha, m)$ with given flow values $w_m$, arcs (m, p) with 
capacities $v_{mp}$, and unconstrained arcs $(p, \omega)$; $f(s_p, x_p)$ denotes the cost of 
sending $x_p$ units of flow through the arc $(p, \omega)$. A feasible solution $((x_p), (y_{mp}))$ 
now corresponds to a flow of value S from $\alpha$ to $\omega$, with flow values $y_{mp}$ through 
the arcs (m, p) and $x_p$ through $(p, \omega)$. The conditions (8.2) and (8.3) are flow 
conservation constraints at the nodes m and p; the conditions (8.4) represent the 
capacity constraints of the arcs (m, p). We have to find a feasible flow that 
minimizes the total costs of the flows $x_p$ through the arcs $(p, \omega)$. At the second 
stage, we use the same network, except that the arcs (m, p) have costs $f(t_{mp}, y_{mp})$ 
and the arcs $(p, \omega)$ have given flow values $x_p$. We now have to find feasible flows 
$y_{mp}$ through the arcs (m, p) of minimum total costs. 


8.3. Solution approaches 


Minimum cost flow problems are treated in depth in chapter 2. Minoux (1986) 
gives a polynomial-time algorithm for finding minimum cost integral flows with 
separable convex cost functions. 


8.4. Related models 


Following Anthonisse’s work, Balinski and Demange (1989) pursued an 
axiomatic approach to two problems, one of which generalizes the problem 
considered above. Given are a nonnegative matrix $V = (v_{mp})$ and a positive integer 
S; one may give $v_{mp}$ and S the same interpretation as before. In addition, 
nonnegative integers $w_m^-$, $w_m^+$, $x_p^-$, $x_p^+$ are given. An allocation is defined as a 
matrix $Y = (y_{mp})$ of the same dimensions as V with $w_m^- \le \sum_p y_{mp} \le w_m^+$, 
$x_p^- \le \sum_m y_{mp} \le x_p^+$, and $\sum_m \sum_p y_{mp} = S$. What should it mean to say that an allocation 
Y is proportional to V? And what should this mean for an apportionment, i.e., an 
integer allocation? 


9. Interval scheduling 


9.1. Dealing with nasty clients 


A Dutch firm, primarily engaged in the retail trade, had decided to diversify and 
had acquired a large number of summer cottages. A client can make a reservation 
at any one of the firm’s branches and is immediately told whether a cottage is still 
available for the period (s)he is applying for. Only at a later stage is it determined 
in which cottage each accepted client will spend the holidays. This procedure gave 
rise to a number of questions. 

Does there exist a simple rule that indicates whether a client can be accepted? 
Yes, there does, as we will clarify below: cottages can be assigned to clients in 
their desired periods if and only if, at any time, the number of clients is no greater 
than the number of cottages. How about a method that assigns the accepted 
clients to a minimum number of cottages? This exists as well: assign the clients to 
cottages in order of their starting times, giving priority to cottages used before. 

Both questions could be answered during the first contact with the firm’s 
employee who sought the advice of the CWI. All seemed well until, while leaving, 
a trivial complication crossed his mind: a client can reserve a specific cottage by 
paying Hfl. 25 upon application and is then preassigned. This appears to have a 
dramatic effect on the problem’s computational complexity. The above necessary 
and sufficient condition for acceptance remains valid only under the assumption 
that the clients would be willing to move into another cottage now and then. But 
under the more realistic assumption that these people do not want to move when 
they are on holiday, the problem turns out to be NP-complete. 

We never heard from our client again. The complications caused by the nasty 
clients are probably trivial indeed and do not prohibit the application of the 
existing methods. 

The above account follows Anthonisse and Lenstra (1984). We will deal with 
the technical details below. 


9.2. Model formulation 


We rephrase the problem in scheduling terminology. Cottages will be represented 
by machines and clients by jobs. There are m identical parallel machines, each of 
which can handle at most one job at a time. There are n independent jobs j, 
which need processing on one of the machines during the time interval $(s_j, t_j)$ 
(j = 1, ..., n). First of all, we are interested in the minimum number of machines 
needed to process all jobs. The solution of this problem goes back to Dantzig and 
Fulkerson (1954). 

Let us define a partial order $\to$ on the set of jobs. We say that $j \to k$ whenever 
$t_j \le s_k$, i.e., when job j is completed before job k starts. A chain in the job set is a 
subset $\{j_1, j_2, \ldots, j_r\}$ with $j_1 \to j_2 \to \cdots \to j_r$. The jobs in a chain can be 
consecutively scheduled on one machine, and, conversely, any schedule on one 
machine corresponds to a chain in the job set. The minimum number of machines 
needed to process all jobs is therefore equal to the minimum number of chains 
into which the job set can be partitioned. 

Jobs j and k are unrelated if neither $j \to k$ nor $k \to j$. An antichain is a set of 
pairwise unrelated jobs. Any two jobs in an antichain overlap in time; by a property 
of intervals, this is equivalent to saying that all jobs in an antichain overlap at a 
certain time. 

We now invoke Dilworth’s chain decomposition theorem: for every partially 
ordered set, the minimum number of chains needed to cover all elements is equal 
to the maximum number of elements in an antichain. Or, the minimum number of 
machines needed to process all jobs is equal to the maximum number of jobs that 
require simultaneous processing. For fast algorithms that actually assign the jobs 
to a minimum number of machines, we refer to section 9.3. 
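
The right-hand side of this equality, and with it the acceptance rule of section 
9.1, can be evaluated by a single sweep over the sorted endpoints. The Python 
sketch below is an illustration; the representation of jobs as `(s, t)` pairs 
with s < t and the function names are assumptions made here. 

```python
def max_overlap(jobs):
    """Largest number of jobs (s, t) that are in process at the same time.

    Intervals are treated as open, so a job ending at time x does not
    overlap a job starting at x: finish events sort before start events.
    """
    events = []
    for s, t in jobs:
        events.append((s, 1))  # 1 marks a start
        events.append((t, 0))  # 0 marks a finish and sorts first on ties
    running = best = 0
    for _, is_start in sorted(events):
        running += 1 if is_start else -1
        best = max(best, running)
    return best


def can_accept(jobs, num_machines):
    """Acceptance rule: never more jobs in process than machines."""
    return max_overlap(jobs) <= num_machines
```

A reservation system could call `can_accept` on the accepted clients plus the 
new request, with `num_machines` set to the number of cottages. 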

Another way of modeling the interval scheduling problem is in terms of interval 
graphs. Associated with the job set, we define the interval graph 
$G = (\{1, \ldots, n\}, E)$ where $\{j, k\} \in E$ if and only if jobs j and k overlap in time. We 
recall that a clique in a graph is a subset of pairwise adjacent nodes, a stable set is 
a subset of pairwise non-adjacent nodes, and the chromatic number is the smallest 
k for which the node set can be partitioned into k stable sets. Clearly, a clique of 
G corresponds to an antichain in the partially ordered set, a stable set of G 
corresponds to a chain, and the chromatic number of G is equal to the minimum 
number of machines that can accommodate all jobs. 

For interval graphs, it is true in general that the chromatic number is equal to 
the maximum clique size. This result parallels Dilworth’s decomposition theorem 
for partially ordered sets. A minimum coloring and a maximum clique in an 
interval graph can be found in polynomial time. We refer to chapter 4 for details. 

We now turn to the situation in which some clients have been preassigned to 
specific cottages during certain periods. Machines will now correspond to maximal 
idle periods of the cottages and jobs to assigned clients. Machine i is available 
during the interval $(a_i, b_i)$ (i = 1, ..., m); job j requires processing during the 
interval $(s_j, t_j)$ (j = 1, ..., n). Note that, in contrast to the previous problem, the 
machines are not identical anymore. The question whether a client can be 
accepted boils down to the following problem: is it possible to pack the intervals 
$(s_j, t_j)$ into the intervals $(a_i, b_i)$? 

We may try to generalize the partial order model to this situation. We introduce 
dummy jobs n + i requiring processing in $(-\infty, a_i)$ and n + m + i requiring 
processing in $(b_i, \infty)$, for i = 1, ..., m. Again, we write $j \to k$ if and only if job j is 
completed before job k starts. A feasible schedule corresponds to a 
decomposition of the job set into m chains, where the ith chain starts with job n + i and ends 
with job n + m + i (i = 1, ..., m). Conversely, because $\{n + 1, \ldots, n + m\}$ and 
$\{n + m + 1, \ldots, n + 2m\}$ are antichains, any chain in a decomposition of the job 
set into m chains must start at some n + i ($1 \le i \le m$) and end at some n + m + i' 
($1 \le i' \le m$). Unfortunately, there is nothing to guarantee that i = i', and 
therefore a chain does not necessarily correspond to a schedule on one machine. It is 
not hard to see, however, that from such a chain decomposition a preemptive 
schedule can be constructed, in which the processing of a job may be interrupted 
on one machine and continued on another machine. 

The nonpreemptive problem appears to be much harder. Kolen et al. (1991) 
give a polynomial-time algorithm for the case of fixed m. They also prove that the 
general case is NP-complete, by relating the problem to a generalization of 
interval graph coloring. 


9.3. Solution approaches 


We restrict ourselves here to algorithms for the interval scheduling problem on 
identical machines. 

Ford and Fulkerson (1962) give a simple $O(n^2)$ algorithm for decomposing a 
partially ordered set into chains. This so-called staircase rule finds a minimum 
number of chains in the case that the elements can be numbered so that j <k 
implies that all predecessors of j are included in those of k. This condition holds 
for the partial order defined on the job set. The rule works as follows. Find the 
smallest element, in terms of the numbering. Repeatedly, find the smallest 
successor of the last element found, until no successor exists. Delete the chain 
that has been found, and repeat the process. 
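
For the job set, the rule can be sketched in Python as follows. The sketch numbers 
the jobs by nondecreasing starting time, which is one way (assumed here) to 
obtain a numbering with the required property; jobs are `(s, t)` pairs with 
s < t, and the function name is hypothetical. 

```python
def staircase_chains(jobs):
    """Decompose jobs (s, t) into chains; each chain fits on one machine."""
    # If s_j <= s_k, every predecessor of j is also a predecessor of k.
    remaining = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    chains = []
    while remaining:
        chain = [remaining[0]]  # smallest remaining element
        for j in remaining[1:]:
            # The first job in the numbering that starts no earlier than
            # the finish of the chain's last job is its smallest successor.
            if jobs[chain[-1]][1] <= jobs[j][0]:
                chain.append(j)
        in_chain = set(chain)
        remaining = [j for j in remaining if j not in in_chain]
        chains.append(chain)
    return chains
```

Each pass scans the remaining jobs once, so the total work is quadratic in the 
number of jobs, in line with the bound quoted above. 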

Gupta et al. (1979) give an O(n log n) algorithm, which builds the chains in
parallel rather than in series. This rule, which was informally stated in section 9.1,
is as follows. Put all of the machines on a stack S of idle machines. Order the s_j
and t_j (j = 1,...,n) in nondecreasing order, where a t_j precedes an s_k in case of a
tie; this yields a nondecreasing sequence u_1, u_2,...,u_{2n}. Then, for k =
1,...,2n, do the following: if u_k corresponds to s_j, then assign job j to the
machine on top of S, and remove this machine from S; if u_k corresponds to t_j,
then put the machine to which job j was assigned on top of S.
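
The stack rule just described might be coded as follows. This is a sketch, not the authors' implementation: it assumes that s_j and t_j are the start and finish times of job j with s_j < t_j, and, instead of starting from a fixed stack of machines, it opens a new machine whenever the stack is empty so that the number of machines used can be read off at the end. The name `assign_machines` is hypothetical.

```python
def assign_machines(jobs):
    """Assign interval jobs to machines with a stack of idle machines.

    jobs: list of (s_j, t_j) pairs with s_j < t_j (assumption).
    Returns (machine_of, machines_used), where machine_of maps job -> machine.
    """
    events = []
    for j, (s, t) in enumerate(jobs):
        events.append((t, 0, j))   # finish event; 0 sorts before 1 on ties
        events.append((s, 1, j))   # start event
    events.sort()                  # the sequence u_1, ..., u_2n

    idle = []                      # the stack S of idle machines
    machines_used = 0
    machine_of = {}
    for _, kind, j in events:
        if kind == 1:              # u_k corresponds to s_j
            if idle:
                machine = idle.pop()
            else:                  # assumption: open a new machine on demand
                machine = machines_used
                machines_used += 1
            machine_of[j] = machine
        else:                      # u_k corresponds to t_j
            idle.append(machine_of[j])
    return machine_of, machines_used


if __name__ == "__main__":
    print(assign_machines([(0, 3), (1, 4), (3, 5), (4, 7), (5, 8)]))
```

The sort dominates the running time, giving O(n log n); every later step is a constant-time stack operation.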


9.4. Related models 


Several generalizations of the interval scheduling problem on identical parallel 
machines have been investigated. 

Arkin and Silverberg (1987) analyze the case in which there is a weight
associated with each job and a maximum-weight subset of jobs that can be
scheduled on m machines is to be found. They develop an O(n^2 log n) algorithm.

Fischetti et al. (1987, 1989) consider two problem types. In the first one, each
machine is available for a period of length b, which starts at the starting time of
the first job assigned to it. In the second problem type, each machine can perform
no more than b time units of processing. In both cases, the number of machines is
to be minimized. They show that both problems are NP-hard and develop
branch and bound algorithms for their solution.

Another extension involves hierarchies of machines and jobs. There are m 
classes of machines and m classes of jobs. All machines are available during the 
same time interval. A job in class i requires processing during a given interval and 
can only be assigned to machines in classes 1,...,i. Does there exist a feasible
schedule? Kolen et al. (1991) give a polynomial-time algorithm for the case that 
m=2, using network flow techniques, and show that the case m=3 is NP- 
complete. Subsequent work concerns optimization versions of this problem, 
where costs are associated with the machines. 


10. Job shop scheduling 


10.1. Production planning 


Combinatorial optimization problems that arise in production planning tend to be 
both difficult to formulate and difficult to solve. That is, the problem is often 
characterized by constraints that are very specific to the situation at hand, and it is 
usually an easy matter to find many independent reasons for its NP-hardness. 
These observations may explain why the development and application of general 
software in the area of production planning is not nearly at the stage at which it is 
in vehicle routing. 

In order to avoid complicating details, we have chosen to consider a standard 
problem type, the job shop scheduling problem, which is at the core of many 
practical production planning situations. It is described as follows. Given are a set 
of jobs and a set of machines. Each machine can handle at most one job at a time. 
Each job consists of a chain of operations, each of which needs to be processed 
during an uninterrupted time period of a given length on a given machine. The 
purpose is to find a schedule, i.e., an allocation of the operations to time intervals 
on the machines, that has minimum length. 

This problem allows a number of relatively straightforward mathematical 
formulations. In addition, it is extremely difficult to solve to optimality. This is 
witnessed by the fact that a problem instance with only ten jobs, ten machines and 
one hundred operations, published in 1963, remained unresolved until 1986. 


10.2. Model formulation 


Given are a set 𝒥 of jobs, a set ℳ of machines, and a set 𝒪 of operations. For
each operation i ∈ 𝒪, there is a job J_i ∈ 𝒥 to which it belongs, a machine M_i ∈ ℳ
on which it requires processing, and a processing time p_i ∈ ℕ. There is a binary
relation → on 𝒪 that decomposes 𝒪 into chains corresponding to the jobs; more
specifically, if i → j, then J_i = J_j and there is no k ∉ {i, j} with i → k or k → j. The
problem is to find a starting time S_i for each operation i ∈ 𝒪 such that

\[ \max_{i \in \mathcal{O}} \; S_i + p_i \tag{10.1} \]

is minimized subject to

\[ S_i \ge 0 \quad \text{for } i \in \mathcal{O}, \tag{10.2} \]
\[ S_j - S_i \ge p_i \quad \text{whenever } i \to j,\ i, j \in \mathcal{O}, \tag{10.3} \]
\[ S_i - S_j \ge p_j \ \vee\ S_j - S_i \ge p_i \quad \text{whenever } M_i = M_j,\ i, j \in \mathcal{O}. \tag{10.4} \]


The objective function (10.1) represents the schedule length, in view of (10.2). 
The conditions (10.3) are the job precedence constraints. The conditions (10.4)
represent the machine capacity constraints, which make the problem NP-hard.

To obtain an integer programming formulation, we choose an upper bound T
on the optimum and introduce a 0-1 variable y_ij for each ordered pair (i, j) with
M_i = M_j, where y_ij = 0 (y_ij = 1) corresponds to S_j − S_i ≥ p_i (S_i − S_j ≥ p_j). We now
replace (10.4) by

\[
\left.
\begin{aligned}
& y_{ij} \in \{0, 1\}, \\
& y_{ij} + y_{ji} = 1, \\
& S_i + p_i - S_j - T y_{ij} \le 0,
\end{aligned}
\right\}
\quad \text{whenever } M_i = M_j,\ i, j \in \mathcal{O}. \tag{10.4'}
\]

This formulation is closely related to the disjunctive graph. The disjunctive
graph G = (𝒪, A, E) has a node set 𝒪, an arc set A = {(i, j) | i → j}, and an edge
set E = {{i, j} | M_i = M_j}; note that the arcs are directed and the edges are
undirected. A weight p_i is associated with each node i. There is an obvious
one-to-one correspondence between feasible values of the y_ij in
(10.1), (10.2), (10.3), (10.4') and orientations of the edges in E for which the
resulting digraph is acyclic. Given any such orientation, we can determine feasible
starting times by setting each S_i equal to the weight of a maximum-weight path in
the digraph finishing at i minus p_i; the objective value is equal to the maximum
path weight in the digraph. The problem is now to find an orientation of the edges
in E that minimizes the maximum path weight.
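
To make the correspondence between orientations and schedules concrete, here is a small sketch that computes starting times from a given orientation by a longest-path pass in topological order. The operation names, the dictionary-based input format and the function name `start_times` are illustrative assumptions, not part of the formulation above.

```python
from collections import defaultdict, deque


def start_times(p, arcs):
    """Starting times and schedule length for an oriented disjunctive graph.

    p:    dict mapping each operation to its processing time (node weight).
    arcs: pairs (i, j) meaning i precedes j; these are the arcs of A together
          with the chosen orientation of the edges of E.
    Raises ValueError if the digraph contains a cycle.
    """
    successors = defaultdict(list)
    indegree = {i: 0 for i in p}
    for i, j in arcs:
        successors[i].append(j)
        indegree[j] += 1

    S = {i: 0 for i in p}
    queue = deque(i for i in p if indegree[i] == 0)
    processed = 0
    while queue:
        i = queue.popleft()
        processed += 1
        for j in successors[i]:
            # S_j is the weight of a heaviest path ending at j, minus p_j.
            S[j] = max(S[j], S[i] + p[i])
            indegree[j] -= 1
            if indegree[j] == 0:
                queue.append(j)
    if processed != len(p):
        raise ValueError("orientation is not acyclic")
    length = max(S[i] + p[i] for i in p)
    return S, length


if __name__ == "__main__":
    # Two jobs a1 -> a2 and b1 -> b2; a1, b2 share a machine, as do a2, b1.
    p = {"a1": 3, "a2": 2, "b1": 2, "b2": 4}
    arcs = [("a1", "a2"), ("b1", "b2"), ("a1", "b2"), ("b1", "a2")]
    print(start_times(p, arcs))
```

In the toy instance the orientation a1 before b2 and b1 before a2 is acyclic; reversing both disjunctive arcs would close a cycle and is rejected.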


10.3. Solution approaches 


Optimization algorithms for job shop scheduling proceed by branch and bound. A 
node in the search tree is usually characterized by an orientation of each edge in a 
certain subset E' ⊆ E. The question is then how to compute a lower bound on the
value of all completions of this partial solution.

A trivial lower bound is obtained by simply disregarding E\E' and computing
the maximum path weight in the digraph (𝒪, A ∪ E'). A more sophisticated
bound is based on the relaxation of the capacity constraints of all machines except
one: a machine M' ∈ ℳ is selected, and the job shop problem is solved on the
disjunctive graph (𝒪, A ∪ E', {{i, j} | M_i = M_j = M'}). This reduces to a single-
machine problem, where the arcs in A ∪ E' define release and delivery times for
the operations that are to be scheduled on M'. Although it is an NP-hard
problem, there exist fairly efficient algorithms for its solution. The single-machine 
bound generalizes all previously proposed bounds (Lageweg et al. 1977). More 
recent (and more complicated) bounds use surrogate duality relaxation and 
polyhedral techniques. 

A variety of branching schemes to generate the search tree and elimination 
rules to truncate it is available. For this and for more information on the lower 
bounds, we refer to Lawler et al. (1993). 

Most approximation algorithms for job shop scheduling use a dispatch rule, 
which schedules the operations according to some priority function. Adams et al. 
(1988) developed a sliding bottleneck heuristic, which employs an ingenious 
combination of schedule construction and iterative improvement, guided by 
solutions to single-machine problems of the type described above. They also 
embedded this method in a second heuristic that proceeds by partial enumeration 
of the solution space. 

Van Laarhoven et al. (1992) applied the principle of simulated annealing to the 
job shop scheduling problem. This is a randomized variant of iterative improve- 
ment. It is based on local search, but accepts deteriorations with a small and 
decreasing probability in the hope of avoiding bad optima and getting settled in a 
global optimum. In the present case, the neighborhood of a schedule contains all 
schedules that can be obtained by interchanging two operations i and j for which 
M_i = M_j and the arc (i, j) is on a longest path.

The computational merits of all these algorithms are accurately reflected by 
their performance on the notorious 10-job 10-machine problem instance dating 
back to 1963. 

The single-machine bound, maximized over all machines, has a value of 808. In 
1975, McMahon and Florian used the single-machine bound and a branching 
scheme that constructs all left-justified schedules to arrive at a schedule of length 
972, without proving optimality. In 1983, Fisher, Lageweg, Lenstra and Rinnooy 
Kan applied surrogate duality relaxation to find a lower bound of 813; the time 
requirements involved did not encourage them to carry on the search beyond the 
root of the tree. In 1984, Lageweg developed an improved implementation of the 
McMahon-Florian algorithm, with an adaptive search strategy, and found a 
schedule of length 930; he also computed a number of multi-machine bounds, 
ranging from a three-machine bound of 874 to a six-machine bound of 907. Two 
years later, Carlier and Pinson (1989) proved optimality of the value 930; they 
used a relaxation of the single-machine bound, a drastically different branching 
scheme, and many elimination rules. The main drawback of all these enumerative 
methods, aside from the limited problem sizes that can be handled, is their
sensitivity to particular problem instances and even to the initial value of the 
upper bound. 

The computational experience with polyhedral techniques that has been 
reported in recent years is slightly disappointing in view of what has been 
achieved for the traveling salesman problem and the linear ordering problem. 
However, the investigations in this direction are still at an initial stage. 

Dispatch rules show an erratic behavior. The rule proposed by Lageweg et al.
(1977) constructs a schedule of length 1082, and most other priority functions do 
worse. Adams et al. (1988) report that their sliding bottleneck heuristic obtains a 
schedule of length 1015 in ten CPU seconds, solving 249 single-machine problems 
on the way. Their partial enumeration procedure succeeds in finding the 
optimum, after 851 seconds and 270 runs of the first heuristic.

Five runs of the simulated annealing algorithm with a standard setting of the 
cooling parameters take 6000 seconds on average and produce an average
schedule length of 942.4, with a minimum of 937. If 6000 seconds are spent on 
deterministic neighborhood search, which accepts only true improvements, then 
more than 9000 local optima are found, the best one of which has a value of 1006. 
Five runs with a much slower cooling schedule take about 16 hours each and 
produce solution values of 930 (twice), 934, 935 and 938. In comparison to other 
approaches, simulated annealing requires unusual computation times, but it yields 
consistently good solutions with a modest amount of human implementation 
effort and relatively little insight into the combinatorial structure of the problem 
type under consideration. 


10.4. Related models 


The theory of scheduling is concerned with the optimal allocation of scarce 
resources to activities over time. It has been the subject of extensive research over 
the past decades. The emphasis has been on the investigation of deterministic 
machine scheduling problems, in which each activity requires at most one 
resource at a time, each resource can perform at most one activity at a time, and 
all problem data are known in advance. This problem class is surveyed extensively 
by Lawler et al. (1993). The results for these problems have reached the level of 
detail that a computer program is being used to maintain a record of the 
complexity status of thousands of problem types. 

We mention two natural extensions of this class that are of obvious practical 
importance. In resource-constrained project scheduling, an activity may require 
several resources to be performed and a resource may be able to handle several
activities simultaneously. In stochastic scheduling, some problem parameters are 
random variables. Either class has generated an impressive literature of its own; 
see Lawler et al. (1993) for references. 

In this chapter we present three engineering problems where combinatorial
methods are needed for the solution. In order to emphasize the methods we (i)
proceed in the solutions of the three problems simultaneously, and (ii) do not
intend to present the engineering problems in their most general form.

Previous surveys of similar character are Iri (1983), Iri and Fujishige (1981), 
Murota (1987), Recski (1984), Roth (1981) and Sugihara (1986). These and other 
ideas in a more detailed form can be found in a monograph of Recski (1989). See 
also the forthcoming monograph of Crapo and Whiteley (1994). 

The presented applications refer to electric network theory, to statics and to 
pattern recognition. Further applications of similar combinatorial tools are also 
known in control and system theory, see Iri and Fujishige (1981) and in geodesy 
(Spriggs and Snay 1982). 


1. The problems 


An electric network is the interconnection of the following types of devices: 
(i) voltage sources whose voltage u is a given function of time and whose 
current i is arbitrary, 
(ii) current sources with i given and u arbitrary, 
(iii) linear n-ports (or multiports) with a given number n of node-pairs (called
ports) so that the voltages u_1, u_2,...,u_n and the currents i_1, i_2,...,i_n of the
ports are related by

\[
A \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}
+ B \begin{pmatrix} i_1 \\ \vdots \\ i_n \end{pmatrix} = 0, \tag{$*$}
\]

where A, B are real matrices of size n × n with r(A|B) = n.

For example, an Ohmic resistor is a 1-port with u − Ri = 0, and an ideal
transformer is a 2-port with u_1 − k u_2 = 0, k i_1 + i_2 = 0.

The interconnection of devices is described by a graph whose edges correspond 
to voltage or current sources, or to ports of the multiports. Figure 1.1 shows a 
fictitious network and the corresponding network graph. The symbols 1, 2 and 3 
correspond to a voltage source, to a current source and to a resistor, respectively 
while (4), (5) and (6), (7), (8) are the ports of a 2-port and a 3-port, respectively. 

The signed sum of voltages along a circuit of the network graph is zero (for 
example, u, = u, on fig. 1.1) by Kirchhoff’s voltage law. Similarly the signed sum 
of currents along a cut set is zero (Kirchhoff’s current law). A network is uniquely 
solvable if all the voltages and currents can uniquely be expressed as the functions 
of the voltages of the voltage sources and the currents of the current sources, 
using Kirchhoff’s voltage and current laws and the multiport equations of 
form (*). 
Figure 1.1.


Problem 1.1. Decide whether an electric network N, composed of voltage and
current sources and multiports, has a unique solution. 


A framework is a set of rigid bars interconnected by rotatable joints. 
Intuitively, call a framework rigid if every motion of it is a congruence. (The 
exact definition of rigidity is postponed until the next section.) For example, the 
first system of fig. 1.2 is rigid, the second one is not, while the third one is rigid in 
the 2-dimensional space and nonrigid in the 3-dimensional one. 

The nonrigid examples in fig. 1.2 are “mechanisms”: Fixing appropriate joints 
(to avoid congruent motions of the whole system) some other joints can have a 
continuous motion. However, the framework of fig. 1.3 will also be considered as 
nonrigid since the solid joint has an “infinitesimal” motion even if all the other 
joints are fixed. 


Problem 1.2. Decide whether a framework F is rigid. 


How can one reconstruct a convex polyhedron if only its projection (say, from 
above) is known as a 2-dimensional drawing with straight line segments? The
drawing is defined as the projected picture itself plus the sets of vertices, edges
and faces, with given lists of their incidences.


Figure 1.3.


Combinatorics in electrical engineering and statics 1915


Problem 1.3. Decide whether a drawing D in the xy-plane arises as the 2- 
dimensional projection of a 3-dimensional convex polyhedron. 


2. Are these problems linear? 


In case of Problem 1.1 the answer is clearly in the affirmative. Kirchhoff’s 
equations and the multiport equations of form (*) can be collected into a single 
system Wt = 0 where t = (u_1, u_2,..., i_1, i_2,...). Hence the network N is
uniquely solvable if and only if the above coefficient matrix W is nonsingular.

In case of Problem 1.2 the linearity is not obvious. Let us restrict ourselves to
the 2-dimensional case and let (x_i, y_i) be the coordinates of joint i. A rod between
joints i and j means that

\[ \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} = c_{ij} \quad \text{(a constant)}, \tag{2.1} \]

hence in case of m rods among n joints we obtain m quadratic equations among
2n unknowns. However, the time derivative of the square of eq. (2.1) is

\[ (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) = 0 \tag{2.2} \]

and thus the collection of these equations can be written in the form Wt = 0,
where t^T = (ẋ_1, ẋ_2,..., ẏ_1, ẏ_2,...) and the entries of W depend on the
coordinates x_1, x_2,..., y_1, y_2,... only, not on their time derivatives.

Congruent motions of the whole plane (translations, rotations) are nontrivial
solutions of this system and form a subspace of dimension 3. Hence the m × 2n
matrix W cannot have more than 2n − 3 linearly independent columns. The
framework is defined to be rigid if r(W) = 2n − 3.
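
The rank condition can be tested mechanically. The sketch below builds the matrix W of eq. (2.2) for a planar framework and computes its rank by exact rational elimination, so that no round-off enters the decision. The input format (a list of coordinate pairs and a list of index pairs for the rods), the column order (all x-velocities, then all y-velocities) and the function names are assumptions made for this illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            if f:
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def rigidity_matrix(points, rods):
    """One row per rod (i, j); columns x_1..x_n followed by y_1..y_n."""
    n = len(points)
    W = []
    for i, j in rods:
        (xi, yi), (xj, yj) = points[i], points[j]
        row = [0] * (2 * n)
        row[i], row[j] = xi - xj, xj - xi
        row[n + i], row[n + j] = yi - yj, yj - yi
        W.append(row)
    return W


def is_rigid_2d(points, rods):
    """True when r(W) = 2n - 3 for a planar framework with n >= 2 joints."""
    return rank(rigidity_matrix(points, rods)) == 2 * len(points) - 3


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    sides = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(is_rigid_2d(square, sides))             # four rods only
    print(is_rigid_2d(square, sides + [(0, 2)]))  # with one diagonal
```

Exact arithmetic is used here only because the toy coordinates are integers; for measured coordinates the genericity discussion later in the chapter is the relevant remedy.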

In case of 3-dimensional frameworks, eq. (2.2) is replaced by

\[ (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j) = 0, \tag{2.3} \]

and rigidity means that the matrix W, now with size m × 3n, has rank 3n − 6.
The reader may verify that the framework of fig. 1.3 as well as the “mechanisms”
of fig. 1.2 are nonrigid by this definition, justifying our previous remark on
the “infinitesimal” motions. For the same reason, some authors call this concept
infinitesimal rigidity; for brevity we use rigidity.
In case of Problem 1.3 recall that a vertex (x_i, y_i, z_i) of a polyhedron P is
incident to a face F_j of P if and only if

\[ a_j x_i + b_j y_i + z_i + d_j = 0, \tag{2.4} \]

where the plane of F_j is given by α_j x + β_j y + γ_j z + δ_j = 0 (with γ_j ≠ 0 since we must
suppose that the plane of no face is perpendicular to the xy-plane) and a_j = α_j/γ_j,
b_j = β_j/γ_j and d_j = δ_j/γ_j.

The coordinates (x_i, y_i) of the projected pictures of the vertices are known and
the values z_i must be prescribed so that the quantities a_j, b_j, d_j be uniquely
determined.


3. Examples 


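Example 3.1. The describing system of equations for the network of fig. 3.1 is

\[
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 \\
0 & 0 & -1 & 1 & 0 \\
0 & 0 & -1 & 0 & 1 \\
-1 & 0 & 0 & R_2 & 0 \\
0 & -1 & 0 & 0 & R_3
\end{pmatrix}
\begin{pmatrix} u_2 \\ u_3 \\ i_1 \\ i_2 \\ i_3 \end{pmatrix}
=
\begin{pmatrix} u_1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\]

and the network is uniquely solvable if and only if det W = R_2 + R_3 ≠ 0.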


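Example 3.2. The describing system of equations for the first framework of fig.
1.2 is

\[
\begin{pmatrix}
x_1 - x_2 & x_2 - x_1 & 0 & y_1 - y_2 & y_2 - y_1 & 0 \\
x_1 - x_3 & 0 & x_3 - x_1 & y_1 - y_3 & 0 & y_3 - y_1 \\
0 & x_2 - x_3 & x_3 - x_2 & 0 & y_2 - y_3 & y_3 - y_2
\end{pmatrix}
\begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{y}_1 \\ \dot{y}_2 \\ \dot{y}_3 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]

and the framework is rigid if and only if r(W) = 3. One can readily verify that this
holds if and only if the three joints are noncollinear (recall that collinearity means

\[ \frac{x_i - x_j}{x_i - x_k} - \frac{y_i - y_j}{y_i - y_k} = 0 \quad \text{for } i, j, k \in \{1, 2, 3\},\ i \ne j \ne k \ne i). \]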

Example 3.3. Consider the drawing of fig. 3.2. If the coordinates z_1, z_2, z_3, z_4
and z_5 are prescribed, they determine a polyhedron (a pyramid) with the given
projection if and only if

\[
\begin{vmatrix}
x_1 & y_1 & z_1 & 1 \\
x_2 & y_2 & z_2 & 1 \\
x_3 & y_3 & z_3 & 1 \\
x_4 & y_4 & z_4 & 1
\end{vmatrix}
= 0 \ne
\begin{vmatrix}
x_1 & y_1 & z_1 & 1 \\
x_2 & y_2 & z_2 & 1 \\
x_3 & y_3 & z_3 & 1 \\
x_5 & y_5 & z_5 & 1
\end{vmatrix}
\]

since the vertices 1, 2, 3 and 4 must be coplanar and vertex 5 must not be on this
plane.


Figure 3.1.


Figure 3.2.


4. Algebraic or combinatorial solutions? 


The examples of the previous section illustrate that, at least theoretically, the 
problems have purely algebraic solutions. However, in the case of large scale 
systems, the size of the matrices will be large, hence round-off errors can arise 
during the numerical calculation. Such a “small” numerical mistake can mean that
a determinant does not vanish. 

Since our problems are qualitative (N is solvable or not, F is rigid or not, D is 
reconstructible or not), such errors are unacceptable. Hence our aim is to give 
solutions which are essentially combinatorial (with 0-1 arithmetic, free of
round-off errors) and use as little numerical technique as possible. 

In order to do this, we must distinguish between those singularities (unsolvabili- 
ty, nonrigidity, nonreconstructability) which are caused by the structure of the 
systems alone, and those which are caused together by the structure and the 
numerical parameters of the systems. 

These distinctions will now be illustrated by a number of examples. 


Example 4.1. The network of fig. 4.1 is unsolvable. This is caused by its structure 
only, no matter what the numerical values of R,, R,, R, and @ are. 


Example 4.2. The network of fig. 4.2 is unsolvable if and only if 8 =1+R,-R,' 
holds. 
Figure 4.2.


Example 4.3. The planar framework of fig. 4.3 is nonrigid, independently of the 
coordinates of the six joints. 


Example 4.4. The planar framework of fig. 4.4 is nonrigid if and only if the six
joints are on a conic section. 


Example 4.5. The drawing of fig. 4.5 cannot arise as the projection of a 
polyhedron. 


Figure 4.3.


Figure 4.4.


Figure 4.5.


Figure 4.6.


Example 4.6. The drawing of fig. 4.6 is realizable if and only if the lines aa’, bb’ 
and cc’ intersect in a single common point. 


These examples illustrate that a singularity may be purely structural (Examples
4.1, 4.3 and 4.5) or may depend on some additional algebraic relations among the
numerical values of the parameters (Examples 4.2, 4.4 and 4.6). If we postulate
that no such algebraic relationship exists (the so-called genericity assumption)
then these problems become purely combinatorial.

This assumption may be justified in many practical situations. For example, 
parameters of electric devices depend on technological processes (e.g., two 
different resistors cannot be exactly equal or their parameters cannot satisfy any 
other a priori given algebraic relation either). The coordinates of a vertex in a 
drawing can also be considered as approximate (they are given, say, by a light pen 
on a graphical display in a computer aided design). However, one might have 
doubts about this approach as well. For example, there are electric devices (like 
transformers, gyrators) where the parameters may be considered as approximate 
but these approximate values appear in several equations and can cancel each 
other out. An additional problem is that the definition should possibly refer to the 
devices and not to one of the possible equivalent numerical descriptions of them. 
Hence the interested reader is referred for further comments to Murota and Iri 
(1985), Recski and Iri (1980), Recski (1989) and Sugihara (1986). 


5. Solutions of the problems under the genericity assumption 


Consider the column space matroid A of the coefficient matrix of the multiport 
equations (i.e., the vectorial matroid (see chapter 9) determined by the columns
of the matrix); and the direct sum G of the cycle and the cocycle matroids of the 
network graph (see chapter 9). 


Theorem 5.1. The network is uniquely solvable under the genericity assumption if 
and only if the sum A ∨ G of these matroids is the free matroid.


Proof (sketch). If A_1, A_2 and A_3 are the column space matroids of the matrices
A_1, A_2 and $A_3 = \binom{A_1}{A_2}$, respectively, then every independent subset of A_3 is
independent in A_1 ∨ A_2 as well (see Edmonds 1967). Hence our matroidal
condition is always necessary. The reverse statement of Edmonds’ theorem is
usually not true. For example put A_1 = (1 2), A_2 = (2 4) and $A_3 = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$.
Then A_1 ∨ A_2 is the free matroid but A_3 is not, due to the cancellation
1·4 − 2·2 = 0. Such cancellations, however, are excluded by the genericity
assumption. □


Obviously, one can check this property by the matroid partition algorithm, see 
chapter 11. 
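
The small example in the proof sketch can be checked by brute force. The code below is only an illustration and not the matroid partition algorithm: it decides independence in the sum of two column space matroids by trying every split of the column set, which is exponential but adequate for tiny matrices. Matrices are given as lists of rows; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            if f:
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def columns_independent(A, cols):
    """Are the listed columns of A linearly independent?"""
    cols = list(cols)
    if not cols:
        return True
    return rank([[row[c] for c in cols] for row in A]) == len(cols)


def independent_in_sum(A1, A2, cols):
    """Can cols be split into a part independent in A1 and one in A2?"""
    cols = list(cols)
    for size in range(len(cols) + 1):
        for part in combinations(cols, size):
            rest = [c for c in cols if c not in part]
            if columns_independent(A1, part) and columns_independent(A2, rest):
                return True
    return False


if __name__ == "__main__":
    A1, A2 = [[1, 2]], [[2, 4]]
    A3 = A1 + A2                                  # A1 stacked on top of A2
    print(independent_in_sum(A1, A2, [0, 1]))     # the sum is free
    print(columns_independent(A3, [0, 1]))        # A3 itself is singular
```

The two printed answers differ precisely because of the numerical cancellation mentioned in the proof.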

Consider the graph G of the framework F, with n nodes and m edges. The
definition of rigidity clearly implies m ≥ 2n − 3 for planar frameworks. For
brevity, we characterize minimally rigid planar frameworks only (where m = 2n −
3). The general characterization can be found in Lovász and Yemini (1982).


Theorem 5.2. The planar framework is minimally rigid under the genericity
assumption if and only if m = 2n − 3 and, after doubling any edge of G, the resulting
graph can be covered by two spanning forests.


Proof (sketch). The framework of fig. 4.3 shows that m = 2n − 3 cannot be
sufficient: a condition m' ≤ 2n' − 3 for every subgraph G' of G with n' nodes and
m' edges is also necessary, to avoid local overbracing. A theorem of Laman
(1970) states that m = 2n − 3 and this additional condition for every subgraph
imply planar rigidity under the genericity assumption. This gives m' ≤ 2n' − 2
for the graph G_e (obtained by doubling any edge e of G), which is just the
condition of Theorem 10.5 of chapter 9 for the cycle matroid of G_e. □
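
The counting condition used in the proof sketch can be verified directly on small graphs. The following brute-force check enumerates all vertex subsets and is therefore exponential; it is meant only to illustrate the condition, whereas the polynomial method is the matroid partition approach mentioned in the text. Vertices are assumed to be numbered 0,...,n − 1, the graph is assumed simple, and the function name is ours.

```python
from itertools import combinations


def satisfies_laman_counts(n, edges):
    """Check m = 2n - 3 and m' <= 2n' - 3 for every induced subgraph.

    n:     number of vertices, labelled 0..n-1 (assumption).
    edges: iterable of vertex pairs of a simple graph.
    """
    edge_set = {frozenset(e) for e in edges}
    if len(edge_set) != 2 * n - 3:
        return False
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            vertices = set(subset)
            spanned = sum(1 for e in edge_set if e <= vertices)
            if spanned > 2 * size - 3:   # local overbracing
                return False
    return True


if __name__ == "__main__":
    # K4 on vertices 0..3 plus two loosely attached vertices: m = 9 = 2*6 - 3,
    # but the K4 part is overbraced.
    k4 = [(a, b) for a, b in combinations(range(4), 2)]
    print(satisfies_laman_counts(6, k4 + [(4, 0), (4, 1), (5, 4)]))
    # The complete bipartite graph K_{3,3} passes the count.
    k33 = [(a, b) for a in range(3) for b in range(3, 6)]
    print(satisfies_laman_counts(6, k33))
```

It suffices to test induced subgraphs, since an arbitrary subgraph on the same vertices has no more edges.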


Let F denote the set of faces of a drawing D. For every subset X ⊆ F denote by
i(X) the number of incidences of the X-faces (i.e., 3 for triangles, 4 for
quadrangles etc.) and by v(X) the number of vertices incident to these X-faces.


Theorem 5.3. The drawing is a projection of a convex polyhedron under the
genericity assumption if and only if i(X) ≤ 3|X| + v(X) − 4 for every X ⊆ F,
|X| ≥ 2.


This last theorem (see Sugihara 1984 for a proof) essentially asks for the 
minimization of the submodular function 3|X|+ v(X) — i(X). Due to the special 
nature of the problem it can be reduced to the constructions of matchings in 
bipartite graphs, hence its polynomial time-complexity is straightforward, without 
any reference to Grötschel et al. (1981). As we indicated, the other two theorems
lead to matroid partition algorithms (Edmonds 1968), hence their time-complexi- 
ty is also polynomial. 


6. Various remarks 


In this last section we wish to give a few hints about the history of the problems, 
about more general results and open problems. 


The idea of characterizing unique solvability of electric networks by a
combinatorial condition is essentially due to Kirchhoff (1847): If a network consists of
voltage and current sources and Ohmic resistors (i.e., 1-ports) only then the
condition of Theorem 5.1 reduces to the existence of a spanning forest of the 
network graph which contains all the voltage source edges and none of the current 
source edges. Kirchhoff proved that this condition is sufficient even if the 
genericity assumption is not postulated but all resistors are positive. 

Combinatorial methods for multiport networks have been used since Coates (1958)
and Mayeda (1958) while minimax relations and essentially matroidal tools date
back to Iri (1968), Kishi and Kajitani (1968), Ohtsuki et al. (1968), Ozawa
(1975). The idea of using the matroid sum is independently due to Iri and
Tomizawa (1975), Narayanan (1974) and Recski (1973).

Our definition of multiports in section 1 was very restrictive. Network theorists 
would call our multiports linear homogeneous and memoryless. However, many
combinatorial results can be extended for wider classes of networks as well. 

The application of combinatorics in statics dates back to at least Maxwell
(1864a), Bow (1873) and Cremona (1879); the graphical method they devised to 
determine the stresses in the bars (the so-called Cremona diagrams) is essentially 
an application of the duality of planar graphs. The conditions m = 2n − 3 and
m' ≤ 2n' − 3 for every subgraph have been known to be necessary (and not
sufficient) for minimal planar rigidity for more than 100 years but Laman’s (1970)
theorem seems to be the first good characterization for minimal planar rigidity
under the genericity assumption. Theorem 5.2, which gives a polynomial algorithm
as well, is due to Lovász and Yemini (1982).

The analogous necessary condition for the 3-dimensional space is m = 3n − 6
and m' ≤ 3n' − 6 for the subgraphs. However, this is not sufficient under the
genericity assumption, see fig. 6.1 (Asimow and Roth 1979). A good
characterization for minimal 3-dimensional rigidity is an outstanding open problem of
“structural topology”.

Theorem 5.3 is due to Sugihara (1986) and the reader is referred to his book 
for many related results. However, one should point out that there is a deeper 
geometric connection between Problems 1.2 and 1.3, see Maxwell (1864b).


Figure 6.1. 
Figure 6.2.


Theorem 6.1. A planar framework with n joints and m = 2n — 3 rods is rigid if and 
only if no part of it is the projection of a 3-dimensional polyhedron.


For example, the second framework of fig. 6.2 is rigid, the first one is not. This 
can be seen by denoting the intersection of lines BC and EF by H and that of CD 
and FG by I and checking whether A, H and / are collinear (see fig. 6.3). 

These and related models have actually been used by engineering communities. 
For example, a network analysis program based on Theorem 5.1 has been 
developed at the Technical University of Denmark, Lyngby (see Petersen 1979); a 
program to reconstruct polyhedra from projected images based on Theorem 5.3, 
at the Electrotechnical Laboratory in Tsukuba, Japan (see Sugihara 1984); while a
dynamic process simulation system, using strongly related mathematical tools, is
actually used in some ten chemical companies in Japan (see chapter 2 in Murota
1987).


Figure 6.3.


1. Introduction


Combinatorics and physics interact in various ways. It is impossible to survey here 
all the connections. We concentrate in this chapter on statistical physics since 
several of the most basic problems in this area have a combinatorial flavour. 

Sections 2 and 3 of this chapter are concerned with two of the most 
fundamental areas, namely the study of the Ising model and the theory of
percolation processes. Both of these areas of research are huge, but they share
the common property that some of the most primitive and easily stated problems 
are, after more than thirty years of research, still largely unanswered. 

In section 4 we present, among other models, some of the classical enumeration 
problems of statistical physics; again there are many open questions and very few 
exact results. 

Sections 5 and 6 are concerned with two of the (relatively few) “techniques” 
which have been developed to deal with the sort of problems we are discussing. 
Transfer matrices and subadditive function theory are basic tools in this area of 
mathematical physics. This is illustrated by a simplified version of the dimer
problem, which amounts to counting the number of ways of placing dominoes on a
rectangular chessboard.

Finally we illustrate in Section 7 the application of ideas from combinatorial 
optimization to statistical physics by showing how the problem of finding the 
ground states of a spin glass model may be reduced to a very basic, though
difficult (NP-hard) problem in discrete optimization. 


2. The Ising model 


The density of water varies as a function of temperature, and generally as a 
continuous function. Of course the variation is not continuous in the neigh- 
bourhood of the boiling point, nor at the freezing point. Although we are 
accustomed to such behaviour, it is paradoxical. The forces acting between the 
individual molecules vary continuously as the temperature varies. Why then 
should there be a change of state at certain temperatures? Statistical physics is 
devoted to the attempt to understand this behaviour. 

As is customary in science and mathematics, the study begins by setting up a 
grossly simplified model. We assume that the system consists of a finite number of 
particles, and that the system is at any instant in one of a number of states. The 
behaviour of the system is governed by its Hamiltonian H, which is a function of 
the state. Its value $H(\sigma)$ is equal to the energy of the system in state $\sigma$. Examples 
of Hamiltonians will be given later in this section and also in section 7. The 
partition function of the system is defined to be 


$$Z = Z(T) = \sum_{\sigma} \exp[-H(\sigma)/(kT)]. \tag{2.1}$$


Here T is the temperature of the system and k is Boltzmann’s constant. If the 
system consists of N particles, we sometimes write $Z_N$ in place of Z. It is taken as 
an axiom that all large-scale properties of the system are determined by Z. 
(Sometimes an attempt is made to disguise the fact that this is an axiom, and not 
a theorem. Also a system can have more than one partition function; the one we 
have just defined is the canonical partition function.) 

In stochastic versions of the problem it is assumed that in the stationary state 
the probability that the system is in the state $\sigma$ is given by 

$$Z^{-1} \exp[-H(\sigma)/(kT)]$$

and that the free energy of the system is 

$$F = -kT \log Z.$$


The latter is a particularly important parameter of the system, and explains the 
fact that one deals with log Z as often as with Z. We are interested in the 
behaviour of Z for systems with a large number of particles, as the temperature T 
ranges over the positive reals. The value of Z depends on the number of particles 
N in the system as well as on the temperature. In all cases of interest to us, log Z 
is a linear function of N when all other parameters are fixed. The number of 
particles in any realistic physical system is, for all mathematical purposes, infinite. 
Hence we are led to study 
$$\lim_{N \to \infty} \frac{1}{N} \log Z_N(T)$$
as a function of T. Following Baxter (1982, p. 14), we say that a model has been 
solved if its free energy is known. The phase transitions of the model correspond 
to the points, called critical points, at which the free energy is not an analytic 
function. 

We now consider a typical and important system, the Ising model. We are given 
a graph G = (V, E) embedded in $\mathbb{R}^2$ or $\mathbb{R}^3$. There is an atom placed at each 
vertex. Each atom has a spin associated with it, which takes only two values. The 
energy of the system is understood to be the sum of the energies due to the 
interaction of each pair of atoms. The contribution due to a pair of atoms will be 
assumed to depend only on whether they are adjacent in G or not. The 
interaction is completely determined by whether the given pair have the same 
spin, or not. 

The state of the system can be represented by a function $\sigma$ from V(G) into the 
set {-1, 1} and the Hamiltonian $H(\sigma)$ will be a sum over the edge set E of G. 


Writing $\sigma_i$ for the state of atom i, we find that the partition function of this system 
at temperature T is 

$$Z(G) = \sum_{\sigma} \exp\Bigl( \sum_{ij \in E} \beta \sigma_i \sigma_j \Bigr). \tag{2.2}$$


The constant $\beta = J/(kT)$ will vary inversely with the temperature and is proportional 
to the interaction J. 

The graphs in which physicists are interested are usually infinite. However, they 
are usually the limit, in a natural sense, of a sequence of finite graphs. This will be 
made clearer by the examples which follow. The most important cases of the Ising 
model are when G is either the 2-dimensional square lattice, or the 3-dimensional 
cubic lattice. The solution of the 2-dimensional Ising problem on the square 
lattice was a major achievement of Onsager in 1944. (For an account of this, and 
any other historical remarks in this section, see Thompson 1972.) The 3-dimen- 
sional model is still unsolved. 

There are some important extensions of the Ising model. We assumed implicitly 
that each of the two states available to an individual atom was equally likely. 
However, if there is an external magnetic field acting then one of the two states 
becomes more probable. The 2-dimensional Ising model has only been solved 
under the assumption that there is no external field. (The presence of an external 
field in any model is a major complication.) Another possibility is that the 
interactions between a pair of adjacent atoms may not be independent of the 
edge. (This is certainly a physically reasonable possibility.) Thus on the square 
lattice, the interactions arising from the horizontal edges may differ from the 
interactions on the vertical edges. Allowing for this does not usually cause 
problems; on the contrary it can even be useful, as we will see. 

A question which may well have arisen by now is, what does all this have to do 
with combinatorics? To explain this, we study the basic Ising model on the square 
lattice. Let $G_n$ denote the Cartesian product $P_n \times P_n$ of two paths with n vertices. 
Thus $G_n$ has $n^2$ vertices and, for large n, may be viewed as an approximation to 
the infinite square lattice. By expanding the exponential in (2.2) and since $\sigma_i \sigma_j$ 
takes only the values $+1$ and $-1$, we have 


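$$\exp(\beta \sigma_i \sigma_j) = \cosh(\beta) + \sigma_i \sigma_j \sinh(\beta) = \cosh(\beta)\,(1 + \sigma_i \sigma_j \tanh(\beta)),$$

whence the partition function for $G_n$ at temperature T can be expressed as 

$$Z(G_n) = \sum_{\sigma} (\cosh \beta)^{|E(G_n)|} \prod_{ij \in E(G_n)} (1 + \sigma_i \sigma_j \tanh \beta).$$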


With some ingenuity (see, e.g., section 6.1 of Thompson 1972 or Biggs 1977, p. 
22) this may be rewritten as 

$$Z(G_n) = 2^{|V(G_n)|} (\cosh \beta)^{|E(G_n)|} \sum_{l} N(l)\, (\tanh \beta)^{l}, \tag{2.3}$$


where N(l) is the number of spanning subgraphs of $G_n$ with l edges and all 
vertices having even valency. (These are called the Eulerian subgraphs of $G_n$.) 
This shows that determining the partition function for the Ising model is 
equivalent to the purely combinatorial problem of enumerating the Eulerian 
subgraphs of $G_n$. 

It should be noted that (2.3) is valid with any graph G in place of $G_n$. In 
particular if we replace $G_n$ by $P_n$, then we obtain 

$$Z(P_n) = 2^{n} (\cosh \beta)^{n-1}.$$

From this we can deduce that 

$$\lim_{n \to \infty} (Z(P_n))^{1/n} = 2 \cosh(\beta).$$


Since $\cosh(\beta)$ is an analytic function, it follows that the Ising model on the 
infinite path does not have a critical point. As we noted earlier in this section, 
Onsager showed that the Ising model on the infinite square lattice does have a 
phase transition. (See chapter 5 of Thompson 1972.) 
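
The expansion (2.3) and the closed form for the path can be checked numerically on small graphs. The following Python sketch is an illustration added here, not part of the original argument; the function names are assumptions. It evaluates (2.2) by brute force over all spin states and compares the result with the sum over Eulerian subgraphs.

```python
from itertools import combinations, product
from math import cosh, exp, isclose, tanh

def ising_partition(n_vertices, edges, beta):
    """Brute-force evaluation of (2.2): sum over all spin states."""
    total = 0.0
    for spins in product((-1, 1), repeat=n_vertices):
        total += exp(beta * sum(spins[i] * spins[j] for i, j in edges))
    return total

def eulerian_counts(n_vertices, edges):
    """N(l): number of edge subsets of size l with every vertex of even valency."""
    counts = [0] * (len(edges) + 1)
    for l in range(len(edges) + 1):
        for subset in combinations(edges, l):
            degree = [0] * n_vertices
            for i, j in subset:
                degree[i] += 1
                degree[j] += 1
            if all(d % 2 == 0 for d in degree):
                counts[l] += 1
    return counts

def ising_via_eulerian(n_vertices, edges, beta):
    """Right-hand side of (2.3)."""
    counts = eulerian_counts(n_vertices, edges)
    series = sum(c * tanh(beta) ** l for l, c in enumerate(counts))
    return 2 ** n_vertices * cosh(beta) ** len(edges) * series

def grid_edges(n):
    """Edges of P_n x P_n, vertices numbered row by row."""
    edges = []
    for r in range(n):
        for c in range(n):
            v = r * n + c
            if c + 1 < n:
                edges.append((v, v + 1))
            if r + 1 < n:
                edges.append((v, v + n))
    return edges

def path_edges(n):
    """Edges of the path P_n."""
    return [(i, i + 1) for i in range(n - 1)]

def expansions_agree(n_vertices, edges, beta):
    """True if the two evaluations coincide up to rounding."""
    return isclose(ising_partition(n_vertices, edges, beta),
                   ising_via_eulerian(n_vertices, edges, beta), rel_tol=1e-9)
```

For the 3 × 3 grid this involves only 512 spin states and 4096 edge subsets; both enumerations grow exponentially, so the sketch is useful only as a sanity check on very small graphs.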

A short introduction to the Ising problem will be found in Cipra (1987). 


Partition functions and rank polynomials 


We now show how Fortuin and Kasteleyn (1972) demonstrated that the Ising and 
other physical problems could be related to the Whitney rank polynomial or Tutte 
polynomial (see chapter 9 by Welsh). 

Let G be a graph, which now may have loops and multiple edges. Any subset S 
of E(G) forms a spanning subgraph of G, with the same vertex set as G, and edge 
set S. The rank of S is defined to be |V(G)|, less the number of connected 
components in the subgraph formed by S. We will denote it by r(S). The rank 
polynomial of a graph G is defined to be 


$$R(G; x, y) = \sum_{S \subseteq E(G)} x^{r(E) - r(S)}\, y^{|S| - r(S)}.$$


The rank polynomial has some interesting properties. If G is the disjoint union of 
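graphs $G_1$ and $G_2$ then 

$$R(G; x, y) = R(G_1; x, y)\, R(G_2; x, y). \tag{2.4}$$

If $e \in E(G)$, let $G \setminus e$ be the graph obtained by deleting e from G, and let $G/e$ be 
the graph obtained by contracting e (i.e., by deleting e and then identifying its 
end points). Then, if e is not a cut-edge or a loop, one can show that 

$$\begin{aligned} R(G; x, y) &= \sum_{S \subseteq E(G),\, e \in S} x^{r(E) - r(S)} y^{|S| - r(S)} + \sum_{S \subseteq E(G),\, e \notin S} x^{r(E) - r(S)} y^{|S| - r(S)} \\ &= R(G/e; x, y) + R(G \setminus e; x, y). \end{aligned} \tag{2.5}$$

In the remaining cases we have 

$$R(G; x, y) = \begin{cases} (1 + x)\, R(G/e; x, y), & \text{if } e \text{ is a cut-edge}, \\ (1 + y)\, R(G \setminus e; x, y), & \text{if } e \text{ is a loop}. \end{cases}$$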

Now consider the partition function for the Ising model on a graph G, which can 
be written in the form 


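$$Z(G) = \sum_{\sigma} \prod_{ij \in E(G)} \lambda^{\sigma_i \sigma_j},$$

where $\lambda = \exp \beta$. The product $\sigma_i \sigma_j$ is either 1 or $-1$. Define $E^{+}$ to be the set of 
edges ij of G such that $\sigma_i \sigma_j = 1$ and let $E^{-}$ be the remaining edges of G. Let 
$m = |E(G)|$. Then we have 

$$Z(G) = \sum_{\sigma} \lambda^{|E^{+}| - |E^{-}|} = \sum_{\sigma} \lambda^{m - 2|E^{-}|}.$$

If e = ij is a fixed edge in E(G), not a loop or a cut-edge, it follows that 

$$\begin{aligned} Z(G) &= \sum_{\sigma:\, \sigma_i = \sigma_j} \lambda^{m - 2|E^{-}|} + \sum_{\sigma:\, \sigma_i \neq \sigma_j} \lambda^{m - 2|E^{-}|} \\ &= \lambda Z(G/e) + \lambda^{-1} \bigl( Z(G \setminus e) - Z(G/e) \bigr) \\ &= (\lambda - \lambda^{-1})\, Z(G/e) + \lambda^{-1} Z(G \setminus e). \end{aligned} \tag{2.6}$$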


We can now use the following theorem of Oxley and Welsh (1979). 


Theorem 2.7. Let f be a real-valued function defined on graphs which satisfies the 
recursion 

$$f(G) = a f(G/e) + b f(G \setminus e)$$

when e is an edge of G and not a loop or cut-edge, and 

$$f(G) = \begin{cases} (1 + x)\, f(G/e), & \text{if } e \text{ is a cut-edge}, \\ (1 + y)\, f(G \setminus e), & \text{if } e \text{ is a loop}, \end{cases}$$

where 1 + x and 1 + y are the values taken by f on a cut-edge and loop respectively. 
Then if G has n vertices, m edges and c components, we have 

$$f(G) = a^{n-c}\, b^{m-n+c}\, R\Bigl(G; \frac{1+x}{a} - 1, \frac{1+y}{b} - 1\Bigr).$$


It follows that the partition function for the Ising model on a graph G is 
determined by its rank polynomial. From (2.6) we see that we can apply the 
previous theorem with Z(G) in place of f(G). Then 

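$$a = \lambda - \lambda^{-1}, \qquad b = \lambda^{-1}$$

and 

$$1 + x = \lambda + \lambda^{-1}, \qquad 1 + y = \lambda.$$

Theorem 2.7 takes f to be 1 on a single vertex, whereas Z takes the value 2 there. 
Applying the theorem to $Z(G)/2^{c}$, which satisfies the same recursions, now yields that 

$$Z(G) = 2^{c}\, \lambda^{-m} (\lambda^{2} - 1)^{n-c}\, R\Bigl(G; \frac{2}{\lambda^{2} - 1}, \lambda^{2} - 1\Bigr).$$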
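
The relation between the Ising partition function and the rank polynomial can be tested on small graphs. The Python sketch below is an added illustration with assumed function names. It takes the rank polynomial in the Whitney form, with exponents r(E) - r(S) and |S| - r(S), which is the form consistent with (2.5), and it writes the normalisation Z = 2 on a single vertex as an explicit factor 2^c. Both choices are readings made for this sketch rather than statements quoted from the text.

```python
from itertools import product
from math import exp, isclose

def components(n_vertices, edges):
    """Number of connected components of the spanning subgraph with these edges."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n_vertices
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            count -= 1
    return count

def rank_polynomial(n_vertices, edges, x, y):
    """R(G; x, y) as a sum over all edge subsets S (assumed Whitney form)."""
    r_full = n_vertices - components(n_vertices, edges)
    total = 0.0
    for mask in range(1 << len(edges)):
        subset = [e for k, e in enumerate(edges) if (mask >> k) & 1]
        r = n_vertices - components(n_vertices, subset)
        total += x ** (r_full - r) * y ** (len(subset) - r)
    return total

def ising_partition(n_vertices, edges, beta):
    """Brute-force sum over spin states of lambda^(sum of s_i * s_j), lambda = e^beta."""
    total = 0.0
    for spins in product((-1, 1), repeat=n_vertices):
        total += exp(beta * sum(spins[i] * spins[j] for i, j in edges))
    return total

def ising_from_rank(n_vertices, edges, beta):
    """Evaluate Z(G) through the rank polynomial, with the factor 2^c explicit."""
    lam = exp(beta)
    n, m, c = n_vertices, len(edges), components(n_vertices, edges)
    u = lam ** 2 - 1  # requires beta != 0
    return 2 ** c * lam ** (-m) * u ** (n - c) * rank_polynomial(n, edges, 2 / u, u)

def evaluations_agree(n_vertices, edges, beta):
    """True if the two evaluations coincide up to rounding."""
    return isclose(ising_partition(n_vertices, edges, beta),
                   ising_from_rank(n_vertices, edges, beta), rel_tol=1e-9)
```

The edge list may contain repeated pairs (multiple edges) and pairs of the form (v, v) (loops), since the rank polynomial is defined for such graphs.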


A natural extension of the Ising model is to allow the spins to take more than two 
values. More precisely, if we allow the spin at each vertex to take values from the 
set {1, 2, ..., q} and then define the partition function Z by 
$Z = \sum_{\sigma} \exp[\sum_{ij} \beta\, \delta(\sigma_i, \sigma_j)]$, where $\delta$ is the usual delta function, we have what is 
known as the q-state Potts model. 

Using a similar argument to that just given it is easy to see that again Z satisfies 
a contraction-deletion recurrence formula. Hence for any graph G, Z is an 
evaluation of the rank polynomial of G; though along a different curve in the 
xy-plane, namely xy = q. For a proof of this and for details of the way in which 
the percolation and ice models to be discussed below can be represented in terms 
of the rank polynomial see Welsh (1990) or the original paper of Fortuin and 
Kasteleyn (1972). 

For excellent rigorous mathematical treatments of these topics we refer to the 
monographs of Ruelle (1969) or Thompson (1972). 


3. Percolation processes 


As its name suggests, percolation theory is concerned with flow in random media. 
Its origin in the work of Broadbent and Hammersley (1957) was as a model for 
molecules penetrating a porous solid, electrons migrating over an atomic lattice, a 
solute diffusing through a solvent, or a disease infecting a community. 

As an example of percolation in the wider sense consider the following problem 
in communication theory. 


Example (Random graphs and reliability). Let N be the network shown in fig. 
3.1(a). Suppose each directed edge has probability p of being reliable, that is, 
allowing a message to pass. Suppose further that the reliability of each edge is 
independent of the reliability of any other edge. What is the probability that there 
is a path from A to B consisting only of reliable edges? 

Denoting this event by A~B, simple calculation shows that it is just the 
probability that not all the routes from A to B are unreliable. Since the routes 
have no edge in common we are dealing with independent random variables and 
we have 


Pr[A ~ B] = 1 - (1 - p’). 


However, if we try the same problem for the network N’, of fig. 3.1(b) the 
problem becomes much more complicated. This is due solely to the dependence in 
N’ of the events “the route ACDB is reliable” and “the route ACB is reliable”. 


This problem illustrates the intrinsic difficulty of percolation problems: stochastic 
dependence occurs in all but the most trivial cases and makes computation 
very difficult. Indeed, even with the speed of modern computing machines it is 
still impractical to determine the reliability of moderate-sized networks. In the 
language of computational complexity the problem is #P-hard (see chapter 29 by 
Shmoys and Tardos). 

Figure 3.1. (a) The network N; (b) the network N’. 

In classical percolation theory we are concerned with the probability of infinite 
clusters in a “regular crystal lattice”. The definition of what exactly is meant by a 
“regular crystal lattice” is rather difficult to formulate precisely — it varies from 
author to author. For the purposes of this chapter it can be regarded as typified by 
the regular lattices shown in fig. 3.2, though of course the physically most 
interesting cases are when the lattice is 3-dimensional. 


Bond percolation 
Suppose now that we fix attention on the 2-dimensional square lattice, and 
suppose that there is a supply of fluid at the origin and that each edge of the lattice allows 
fluid to pass along it with probability p, independently for each edge. Let $P_N(p)$ 
be the probability that fluid spreads to at least N vertices. Thus 

$$P_1(p) = 1, \qquad P_2(p) = 1 - (1 - p)^{4},$$

and in theory $P_N(p)$ can be calculated for any integer N. However, the reader will 
rapidly find it prohibitively time consuming. The case N = 7 is a fair piece of 
work! Obviously, 

$$P_N(p) \ge P_{N+1}(p)$$

and hence the limit 

$$P(p) = \lim_{N \to \infty} P_N(p)$$


exists and represents the probability that fluid spreads an infinite distance from 
the origin. 
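
To make the cost of the calculation concrete, the Python sketch below (an added illustration; the function names are assumptions) evaluates the probability that fluid reaches at least N vertices by exhaustive enumeration. It uses the observation that a cluster of N vertices containing the origin lies within L1-distance N - 1 of the origin, so only the edges inside that ball matter. The number of such edges is 4 for N = 2 and 16 for N = 3, and the enumeration is exponential in this number.

```python
from itertools import product

def ball_edges(radius):
    """Square-lattice edges with both endpoints within L1-distance `radius` of the origin."""
    verts = [(x, y) for x in range(-radius, radius + 1)
             for y in range(-radius, radius + 1) if abs(x) + abs(y) <= radius]
    vset = set(verts)
    edges = []
    for (x, y) in verts:
        for nb in ((x + 1, y), (x, y + 1)):
            if nb in vset:
                edges.append(((x, y), nb))
    return edges

def spread_probability(N, p):
    """Probability that the open cluster of the origin has at least N vertices."""
    edges = ball_edges(N - 1)
    total = 0.0
    for state in product((0, 1), repeat=len(edges)):
        open_edges = [e for e, s in zip(edges, state) if s]
        adj = {}
        for a, b in open_edges:
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
        # Grow the open cluster containing the origin.
        cluster = {(0, 0)}
        frontier = [(0, 0)]
        while frontier:
            v = frontier.pop()
            for w in adj.get(v, ()):
                if w not in cluster:
                    cluster.add(w)
                    frontier.append(w)
        if len(cluster) >= N:
            k = len(open_edges)
            total += p ** k * (1 - p) ** (len(edges) - k)
    return total
```

For N = 3 the cluster has exactly two vertices when one edge at the origin is open and the six other edges meeting that pair are closed, which gives 1 - (1 - p)^4 - 4p(1 - p)^6 as an independent check on the enumeration.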

Very little has been rigorously proved about P(p). For example, even though 
$P_N(p)$ is a polynomial in p and hence we would expect P(p) to be a continuous 
function of p, this has not yet been proved. It is clear that there exists a critical 
probability $p_c$ such that 

$$p < p_c \Rightarrow P(p) = 0 \quad \text{and} \quad p > p_c \Rightarrow P(p) > 0.$$

Figure 3.2. Square lattice, hexagonal lattice, triangular lattice. 


However, determining the value of this critical probability is, as we will see, a 
very difficult problem. Monte Carlo simulations suggest that for all well-known 
lattices the behaviour of P(p) has roughly the same S-shaped form as shown in 
fig. 3.3. 


Atom or site percolation 


In atom percolation, instead of each edge being randomly blocked with probability 
1 - p or open with probability p, each vertex is blocked independently with 
probability 1 - p or open with probability p. Again we are interested in the 
probability of fluid spreading locally or an infinite distance. 

Exactly analogous results hold for atom percolation as for bond percolation, 
though of course the numerical values of the critical probabilities $p_c$ and 
percolation probabilities P(p) differ. In one sense atom percolation is the more 
important since any bond percolation problem on a lattice $\mathcal{L}$ can be turned into 
an atom percolation problem on a related lattice, namely the line graph of $\mathcal{L}$. 

One of the few relatively easy results which has been proved is the following 
due to Fisher (1961) and Hammersley (1961a). For any regular lattice, if 
$P^{A}(p)$, $P^{B}(p)$ represent respectively the atom and bond percolation probabilities 
on the lattice then 

$$P^{A}(p) \le P^{B}(p), \qquad 0 < p < 1.$$


Clearly this implies that for any lattice the critical probability for atom 
percolation is at least as big as the critical probability for bond percolation. 


The cluster problem 


An alternative approach to percolation theory is the study of the distribution of 
white and black clusters when the edges (or vertices) of a graph are independently 
painted white with probability p and black with probability q = 1 - p. 


Figure 3.3. 

Again we shall concentrate on the edge problem for the square lattice. A white 
cluster is a maximal connected subset of white edges of the lattice. The two main 
quantities of physical interest are: 

(a) the average number of white clusters; 

(b) the average number of vertices in a white cluster. 

To be more precise let $\mathcal{L}_m$ denote a square section of the square lattice 
containing $m^2$ vertices and hence $2m(m-1)$ edges. If $\omega$ denotes a particular 
black/white painting of $\mathcal{L}_m$, then let $c_m(\omega)$ denote the number of white clusters 
and let its average value over all paintings $\omega$ be denoted by $K_m(p)$. 

Similarly if we let the distinct clusters under $\omega$ be labelled $A_1, \ldots, A_{c_m(\omega)}$, we 
define 

$$S_m(p) = \Bigl\langle \frac{|V(A_1)| + \cdots + |V(A_{c_m(\omega)})|}{c_m(\omega)} \Bigr\rangle$$

where $|V(A_i)|$ denotes the number of vertices in $A_i$, and $\langle \cdot \rangle$ denotes the 
expectation over all black and white paintings. Thus $S_m(p)$ is the average number 
of vertices in a white cluster. 

Note that if isolated points are not counted as clusters then the expected 
number of clusters in this sense is given by $K_m(p) - m^2 q^4$ where q = 1 - p. This is 
because the probability that a particular vertex forms an isolated cluster is just the 
probability that the four edges incident with it are painted black, that is $q^4$. Thus 
the average number of isolated points amongst the white clusters is $m^2 q^4$. 

The average number of black clusters is obviously $K_m(1 - p)$ and the average 
number of vertices in a black cluster is obviously $S_m(1 - p)$. Little more is known 
theoretically about either of these functions, other than that 

$$K_m(p) \sim m^2 \lambda(p) \quad \text{as } m \to \infty,$$

where $\lambda$ is an undetermined function of p. 


Roughly speaking the quantities $K_m(p)$ and $S_m(p)$ are reciprocal, though 
theoretically all that has been proved is that 

$$S_m(p) \ge m^2 / K_m(p).$$
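
For very small sections both averages can be computed exactly, which gives a direct check of the inequality between them. The Python sketch below is an added illustration with assumed function names. It enumerates all paintings of the edges of an m × m grid, counts the white clusters (isolated vertices included) and averages the cluster count and the mean cluster size. Since the clusters of a painting partition the m^2 vertices, the mean cluster size of a painting is m^2 divided by its number of clusters.

```python
def grid_edges(m):
    """Edges of the m x m section of the square lattice, vertices numbered row by row."""
    edges = []
    for r in range(m):
        for c in range(m):
            v = r * m + c
            if c + 1 < m:
                edges.append((v, v + 1))
            if r + 1 < m:
                edges.append((v, v + m))
    return edges

def find(parent, v):
    """Union-find root lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def cluster_averages(m, p):
    """Return (K, S): expected number of white clusters and expected mean cluster size."""
    n = m * m
    edges = grid_edges(m)
    K = 0.0
    S = 0.0
    for mask in range(1 << len(edges)):
        parent = list(range(n))
        clusters = n
        white = 0
        for k, (i, j) in enumerate(edges):
            if (mask >> k) & 1:          # edge k is painted white
                white += 1
                ri, rj = find(parent, i), find(parent, j)
                if ri != rj:
                    parent[ri] = rj
                    clusters -= 1
        weight = p ** white * (1 - p) ** (len(edges) - white)
        K += weight * clusters
        S += weight * n / clusters
    return K, S
```

The 3 × 3 section has 12 edges and hence 4096 paintings; the enumeration doubles with every additional edge, so this is a check on small cases only.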


For p greater than the critical probability $p_c$ we have a positive probability of an 
infinite white cluster in the lattice. Hence, a fortiori, as $p \to p_c$ the average size of a 
cluster tends to $\infty$. Numerical evidence suggests that, as p approaches $p_c$ from 
below, there exist constants C and $\gamma$ such that as $m \to \infty$, $S_m(p) \to S(p)$ where 

$$S(p) \sim C\, (p_c - p)^{-\gamma},$$

where, moreover, $\gamma$ is an invariant depending only on the dimensionality of the 
lattice. 

One of the most interesting theoretical results on the cluster problem is the 
following theorem of Harris (1960). 

Theorem 3.1. For the cluster problem on the infinite square lattice, if p is strictly 
greater than the critical probability then, with probability one, the set of white edges 
contains only one infinite component. 


Extensions of this to higher dimensions can be found in Grimmett (1989). 


Determining the critical probability 


The problem of finding critical probabilities for particular lattices, first posed in 
1957, has received great attention, but is still proving to be a remarkably difficult 
problem. A vast amount of numerical estimation (based on Monte Carlo 
methods, Padé approximations and the like) has been carried out, so good 
numerical estimates exist for most of the 2- and 3-dimensional lattices. 

Theoretically much less is known. A landmark in the study of critical 
probabilities was the paper by Sykes and Essam (1964) which, though unrigorous, 
gave convincing arguments for believing that for bond percolation on a planar 
lattice $\mathcal{L}$ the critical probabilities were related by 

$$p_c(\mathcal{L}) + p_c(\mathcal{L}^{*}) = 1, \tag{3.2}$$

where $\mathcal{L}^{*}$ is the planar dual of $\mathcal{L}$. 
An obvious consequence of this is the following. 


Theorem 3.3. For bond percolation on the square lattice, the critical probability is 
1/2. 


This result was finally proved by Kesten (1980) by a series of ingenious 
arguments which have led to a rigorous proof by Wierman (1981) of the following 
result, again first shown unrigorously by Sykes and Essam (1964). 


Theorem 3.4. For bond percolation on the 2-dimensional triangular lattice (T) and 
the hexagonal lattice (H), 


$$p_c(T) = 2 \sin(\pi/18), \qquad p_c(H) = 1 - 2 \sin(\pi/18).$$


However, all of these arguments are very much restricted to 2 dimensions. A 
fundamental and very difficult problem is the following. 


Problem 3.5. Determine the critical probability of bond or site percolation on the 
3-dimensional cubic lattice. 

Even for 2-dimensional planar lattices there are many open problems. For 
example, the following is known from Toth (1985) and Zuev (1987). 

Theorem 3.6. The critical probability of site percolation on the square lattice is 
between 0.5095 and 0.68189. 


However, this is a very wide range and we pose the following. 


Problem 3.7. What is the critical probability for site percolation on the square 
lattice? 


First passage percolation 


This originated in the paper of Hammersley and Welsh (1965) as a model for a 
“time dependent” percolation process. It contains ordinary percolation as a 
special case and in its most general sense can be regarded as a randomized version 
of the shortest route problem in graphs. 

Consider the square lattice in which each edge is, independently, assigned a 
non-negative random length drawn from a known probability distribution F. 

Let $t_n$ be the random first passage (shortest) path length from the origin to 
(n, 0) in this lattice and let $\tau(n)$ be its expected value over all possible states (that 
is, distributions of lengths). The fundamental observations are that for $m, n \in \mathbb{Z}$ 

$$\tau(m + n) \le \tau(m) + \tau(n) \tag{3.8}$$

so that by the theory of subadditive functions 

$$\mu = \lim_{n \to \infty} \frac{\tau(n)}{n}$$

exists. 

The time constant $\mu$ depends only on the distribution F and is, like the critical 
probability of ordinary percolation, a not very well-understood lattice invariant. 
For example when the lengths are uniformly distributed in [0, 1] it is known from 
Monte Carlo studies that $\mu \approx 0.323$, but its exact evaluation seems a hopelessly 
intractable problem. 
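
The subadditivity in (3.8) comes from a pathwise triangle inequality: in any one assignment of lengths, the shortest route from the origin to (m + n, 0) is no longer than a shortest route to (m, 0) followed by a shortest route from (m, 0) to (m + n, 0). The Python sketch below is an added illustration, not the method of the Monte Carlo studies cited. It samples uniform lengths on a finite box, which is an assumption standing in for the infinite lattice, and computes first passage lengths with Dijkstra's algorithm; all names are hypothetical.

```python
import heapq
import random

def sample_lengths(width, height, rng):
    """Independent uniform [0, 1) lengths on the edges of a width x height box."""
    lengths = {}
    for x in range(width):
        for y in range(height):
            if x + 1 < width:
                lengths[((x, y), (x + 1, y))] = rng.random()
            if y + 1 < height:
                lengths[((x, y), (x, y + 1))] = rng.random()
    return lengths

def first_passage(lengths, source):
    """Shortest path lengths from `source` to every vertex (Dijkstra)."""
    adj = {}
    for (a, b), w in lengths.items():
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                      # stale heap entry
        for w, length in adj[v]:
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

def estimate_tau(n, height, samples, seed=0):
    """Monte Carlo estimate of the expected passage length across n steps in the box."""
    rng = random.Random(seed)
    mid = height // 2
    total = 0.0
    for _ in range(samples):
        lengths = sample_lengths(n + 1, height, rng)
        total += first_passage(lengths, (0, mid))[(n, mid)]
    return total / samples
```

Restricting paths to a box can only lengthen them, so an estimate obtained this way is biased upwards relative to the infinite lattice; no particular numerical value is claimed here.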

Apart from its intrinsic interest, first passage percolation led Hammersley and 
Welsh (1965) to set up a theory of subadditive stochastic processes which are now 
a fundamental tool in probability and probabilistic combinatorics, see for example 
Kingman (1973). 


Correlated percolation 


In an ideal world one would like to be able to remove the restriction that the 
random component associated with an edge in each of the above models was 
independent of all other edges. This is the subject of correlated percolation which 
is now a topic of considerable interest in the physical literature but for which 
(understandably) there are very few theoretical results. 

For comprehensive rigorous accounts of what is now a huge area of research in 
statistical physics we refer to the monographs of Smythe and Wierman (1978), 
Kesten (1982) and Grimmett (1989). 


4. Enumeration and related problems 


Several fundamental problems in statistical physics and related areas of the 
natural sciences reduce to enumerating structures of different types. In this 
section we discuss a few of the most studied and basic problems of this nature. 


Animals or polyominoes 


Consider the 2-dimensional square lattice $\mathcal{L}$ with origin 0. An animal or 
polyomino of n cells is a connected subgraph of $\mathcal{L}$ containing 0 and having n 
vertices. Let a(n) denote the number of distinct animals having n cells. Then 
clearly a(1) = 1, a(2) = 4 and counting the 3-celled animal types illustrated in fig. 
4.1, we see a(3) = 18. 
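
The first few values of a(n) are easy to reproduce by machine. The Python sketch below is an added illustration with an assumed function name. It identifies an animal with its set of cells, which is an assumption about the intended definition that agrees with the values a(1) = 1, a(2) = 4 and a(3) = 18 quoted above. Animals are grown one cell at a time from the origin, and duplicates are removed by storing each animal as a frozen set.

```python
def count_animals(n_max):
    """Return [a(1), ..., a(n_max)] for cell sets of the square lattice containing the origin."""
    current = {frozenset([(0, 0)])}
    counts = [len(current)]
    for _ in range(n_max - 1):
        grown = set()
        for animal in current:
            for (x, y) in animal:
                for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if nb not in animal:
                        # Adding a neighbouring cell keeps the animal connected.
                        grown.add(animal | {nb})
        current = grown
        counts.append(len(current))
    return counts
```

Every connected set of n + 1 cells containing the origin arises in this way, because a spanning tree of it has a leaf other than the origin whose removal leaves a connected set of n cells. The memory needed grows exponentially with n, which is one concrete face of the difficulty described in the text.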

The fundamental problem which is now at least 30 years old is to determine the 
form of a(n) for the different lattices. However, as with percolation theory, 
rigorous exact results about animals are pretty scarce. 

First we will point out the connection between animals and percolation theory. 
Suppose that we could determine a(n, b), the number of animals with n cells and 
b boundary cells. (As the name suggests a cell is a boundary cell of a specific 
animal A if it is a vertex of $\mathcal{L}$ which is not in A but is adjacent to a vertex of A.) 
Then from a(n, b) it is not difficult to compute the average cluster size in a 
percolation model. From this we get good bounds for the critical probability. 


Other applications of animals are to growth problems and as models of 
branched polymers with excluded volume. 


Now let us turn to some basic results about a(n) for the square lattice. A 
straightforward counting argument gives 


$$2^{n} < a(n) < (6.25)^{n}. \tag{4.1}$$

It is also easy to prove that, for any positive integers m, n, 

$$a(m + n) \ge a(m)\, a(n). \tag{4.2}$$


Proof: Each animal has a top right corner and a bottom left corner. By “sticking” 
the bottom left corner of an n-celled animal to the top right corner of an m-celled 
animal we obtain an (m + n)-celled animal. 

Figure 4.1. The 3-celled animal types. 

By the basic property of subadditive functions, (4.1) and (4.2) give the 
fundamental result which holds (by analogous argument) for any regular lattice. 


Theorem 4.3. For any lattice $\mathcal{L}$ there exists a constant $a(\mathcal{L})$ such that if a(n) 
denotes the number of n-celled animals of $\mathcal{L}$ then 

$$\lim_{n \to \infty} a(n)^{1/n} = \sup_{n} a(n)^{1/n} = a(\mathcal{L}).$$

Determining the limiting constant $a(\mathcal{L})$ exactly seems to be very difficult and 
even the best known bounds are not very tight. For example, in the most studied 
case of the square lattice, concatenation methods coupled with computer counts 
give the best known lower bound of 3.79 for $a(\mathcal{L})$ while the best upper bound 
gives $a(\mathcal{L}) < 4.65$. There are reasons for believing that $a(\mathcal{L})$ is just above 4 in the 
case of this lattice. 

For more details on these methods, the corresponding results for other lattices 
and a discussion of related problems we refer to a recent excellent review of 
Whittington and Soteros (1990). 


Self-avoiding walks and polygons 


Another counting problem closely connected with percolation theory and similar 
in spirit to the animal problem of the previous section is the following. A 
self-avoiding walk of length n on a lattice $\mathcal{L}$ is a path of n edges which has one 
endpoint at the origin. If f(n) denotes the number of such self-avoiding walks on 
the square lattice then clearly f(1) = 4, f(2) = 12 and in general it is easy to show 
that 

$$2^{n} \le f(n) \le 4 \cdot 3^{n-1}. \tag{4.4}$$

Using the submultiplicative property 

$$f(m + n) \le f(m)\, f(n)$$

and a similar bound to (4.4) for a general lattice, leads to the fundamental result. 

Theorem 4.5. For any lattice $\mathcal{L}$, there exists a constant $\mu = \mu(\mathcal{L})$ (known as the 
connective constant) such that if $f_{\mathcal{L}}(n)$ denotes the number of self-avoiding walks of 
length n on $\mathcal{L}$ then 

$$\lim_{n \to \infty} f_{\mathcal{L}}(n)^{1/n} = \inf_{n} f_{\mathcal{L}}(n)^{1/n} = \mu(\mathcal{L}).$$

Determining $\mu$ exactly for any lattice except the regular tree has been a much 
studied problem since it was first posed in 1957. Even good bounds seem to be 
difficult to obtain. For example, for the 2-dimensional square lattice, the best 
bounds so far known are $2.57 \le \mu \le 2.73$. 
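
The counts f(n) for the square lattice can be generated by a straightforward backtracking search, which also lets one check the bounds (4.4) and the submultiplicative property on small cases. The Python sketch below is an added illustration; the function name is an assumption.

```python
def count_saws(n):
    """Number of n-step self-avoiding walks on the square lattice starting at the origin."""
    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        x, y = pos
        total = 0
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb not in visited:
                visited.add(nb)           # step to an unvisited neighbour
                total += extend(nb, visited, remaining - 1)
                visited.remove(nb)        # backtrack
        return total

    return extend((0, 0), {(0, 0)}, n)
```

The running time is proportional to the number of walks, so it grows roughly like the n-th power of the connective constant; the n-th root of f(n) approaches that constant only slowly from above.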

A closely related quantity is $g_{\mathcal{L}}(n)$ which counts the number of self-avoiding 
polygons of n steps. Clearly $g_{\mathcal{L}}(n) \le f_{\mathcal{L}}(n)$ but Hammersley (1961b) proved that 
for any lattice with connective constant $\mu(\mathcal{L})$, 

$$\lim_{n \to \infty} g_{\mathcal{L}}(n)^{1/n} = \mu(\mathcal{L}).$$


There are physical reasons (based on renormalisation arguments) and some 
numerical evidence which support the intriguing conjecture that there exist 
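constants $\alpha$ and $\beta$ such that 

$$f_{\mathcal{L}}(n) \sim n^{\alpha} \mu^{n}, \qquad g_{\mathcal{L}}(n) \sim n^{\beta} \mu^{n}$$

and that $\alpha$, $\beta$ are dimensional invariants, in other words they only depend on the 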
dimension of the lattice. Since Kesten (1963), there has been a great deal of 
rigorous mathematical progress notably by Hara and Slade (1992). For more 
details see Madras and Slade (1993). 


The ice problem 


As its name suggests the “ice problem” originates in the statistical physics 
associated with models used to calculate the residual entropy of “square ice”. In 
its most general form an ice model specifies a set of allowable configurations at 
each vertex. All allowable configurations are of equal thermodynamic weight and 
the problem reduces to calculating the partition function, that is, enumerating the 
number of allowable configurations. 

Probably the most studied ice problem is the following. Given any 4-regular 
graph G count the number of orientations $\omega$ of G which have the property that at 
each vertex there are exactly two inward and two outward pointing edges. 

Another way of looking at this enumeration problem is as follows. Fix an 
orientation $\omega_0$ of G. To each directed edge of G assign either a +1 or a -1 in 
such a way that the net flow into each vertex of G is zero. In other words, the ice 
problem on G is exactly the problem of counting nowhere-zero flows mod 3 in G, 
discussed in chapters 4 (Appendix) and 9. But this is exactly the evaluation of the 
Tutte polynomial of G at the point (0, -2), or the rank polynomial of G at 
(-1, -3). 

Equivalently, by using the fact that when G is a planar graph, and $G^{*}$ is its 
planar dual, $T(G; x, y)$ equals $T(G^{*}; y, x)$ and from the relation between Tutte 
polynomials and chromatic polynomials we see the following. 


Proposition 4.6. The ice problem on a 4-regular planar graph G is equivalent to 
counting 3-colourings of the dual graph. 
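
A small case makes the proposition concrete. The octahedron is a 4-regular planar graph whose planar dual is the cube, and for a connected planar graph the number of nowhere-zero flows mod 3 is the number of proper 3-colourings of the dual divided by 3. The Python sketch below is an added illustration with assumed function names. It counts ice orientations of the octahedron and 3-colourings of the cube by brute force, so the two counts can be compared.

```python
from itertools import product

def octahedron_edges():
    """Vertices 0..5; vertex v is antipodal to v + 3 and adjacent to the other four."""
    return [(i, j) for i in range(6) for j in range(i + 1, 6) if j - i != 3]

def cube_edges():
    """Vertices 0..7 as 3-bit strings; adjacent when they differ in exactly one bit."""
    return [(v, v ^ (1 << b)) for v in range(8) for b in range(3) if v < v ^ (1 << b)]

def count_ice_orientations(n_vertices, edges):
    """Orientations of a 4-regular graph with two inward edges at every vertex."""
    count = 0
    for flips in product((0, 1), repeat=len(edges)):
        indegree = [0] * n_vertices
        for (i, j), flipped in zip(edges, flips):
            # flipped == 0 directs the edge i -> j, flipped == 1 directs it j -> i
            indegree[i if flipped else j] += 1
        if all(d == 2 for d in indegree):
            count += 1
    return count

def count_colourings(n_vertices, edges, q):
    """Number of proper colourings of the graph with q colours."""
    return sum(
        all(colour[i] != colour[j] for i, j in edges)
        for colour in product(range(q), repeat=n_vertices)
    )
```

With 12 edges and 8 dual vertices the two enumerations cover 4096 orientations and 6561 colourings respectively, so the comparison runs instantly.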


A remarkable result of Lieb (1967) is that if Z(m, n) denotes the ice partition 
function (that is the number of ice orientations) on the m × n portion of the 
square lattice, then 

$$\lim_{m, n \to \infty} Z(m, n)^{1/(mn)} = \left(\tfrac{4}{3}\right)^{3/2}. \tag{4.7}$$

Percus (1971) gives a very complete and clear account of the different approaches 
to the ice problem culminating in a proof of (4.7) by the transfer matrix method 
to be described in section 5. 

As far as statistical physics is concerned, the problems of most interest are 
when G is a 3-dimensional lattice. As far as mathematical solution is concerned, 
only a few 2-dimensional ice models have been solved; a comprehensive account 
of these is given in Baxter (1982). 


The monomer—dimer problem 


Let p(G,k) denote the number of k-matchings in the graph G, with the 
understanding that p(G, 0) = 1. Define the polynomial Q(G, z) by 


$$Q(G, z) = \sum_{k} p(G, k)\, z^{n - 2k},$$

where $n = |V(G)|$. 


This is a modified form of the matchings polynomial, which is discussed in section 
5 of chapter 31 (Godsil). It can also be viewed as the partition function of a 
physical system. 

Consider a collection of sites on the surfaces of a metallic crystal. The surface is 
exposed to a gas consisting of monomers and dimers, e.g., hydrogen at a high 
temperature. Each site on the surface is occupied, either by a monomer or by one 
of the two ends of a dimer. Of course a pair of sites can be occupied by a dimer 
only if they are neighbours. The state of the system can be represented by a 
matching in a graph G. This has the crystal sites as its vertices, with two sites 
adjacent if and only if they can be occupied by the same dimer. Those pairs of 
sites occupied by dimers determine a matching. Hence the system is completely 
described by the graph G, the matching and the temperature. (The latter 
determines the energy gained by filling the crystal sites with monomers and 
dimers.) 

The physical question is whether this system will undergo a phase transition as 
the temperature varies. In fact it does not, except possibly when there are no 
monomers. This was proved by Heilmann and Lieb (1972). They showed that all 
zeros of Q(G, z) have zero real part, and their absolute value is bounded above 
by $2\sqrt{\Delta - 1}$, where $\Delta$ is the maximum valency of a vertex in G. From these facts 
they eventually deduce the absence of a phase transition. The matchings 
polynomial has a number of interesting combinatorial properties; see chapter 3 
(Pulleyblank). 


Hard hexagons 


We work on the triangular lattice. Consider a system where some of the vertices 
of this lattice are covered by hexagons, with each hexagon covering a central 
vertex and its six neighbours. Two adjacent vertices cannot be both at the centre 
of a hexagon. We can describe the state $\sigma$ of the system by assigning a weight 1 to 
each vertex at the centre of a hexagon, and 0 to the remaining vertices. Thus we 
may view $\sigma$ as a 01-vector indexed by the vertices of the lattice. The partition 
function is 

$$Z = \sum_{\sigma} z^{\sum_{i} \sigma_i} \prod_{ij \in E} (1 - \sigma_i \sigma_j),$$

where the product is over all edges of the lattice, and the exponent of z is just the 
number of hexagons. Baxter establishes an invariance property of this partition 
function using the star-triangle relation. From this he then deduces the free 
energy. One surprise is an intimate connection with the Rogers-Ramanujan 
identities. We direct the reader to Andrews (1982) and Andrews et al. (1984) for 
more information about this relationship. 


5. Transfer matrices 


Many of the combinatorial problems arising in statistical physics can be reduced
to enumeration problems, and these in turn can sometimes be solved by the 
method of transfer matrices, which we now discuss. 

We begin with the problem of determining the number of ways an $m \times n$
chessboard can be covered with dominoes. Suppose that our board has been
covered with dominoes. The given covering can be encoded by assigning one of
four states {U,D,L,R} to each square of the board. The state of a square
determines where the other half of the domino covering the square lies. Thus, if 
the other half of the domino covering a square is above it, then the square has 
state U. If it is below we use D, and if it is to the left or the right we use L or R 
respectively. It should be clear that many assignments of states to squares do not 
correspond to coverings, but every covering gives rise to a unique assignment of 
states to squares.

Now we take our coding a step further. View our chessboard as a sequence of 
columns. Once a covering is given, the state of the vertices in a given column can 
be represented by a vector, with states as entries. (Of course, the set of possible 
vectors for the first and last columns will be a subset of the possible vectors for an 
interior column.) Thus our covering can now be encoded as a sequence 


$$\sigma_1, \ldots, \sigma_n,$$


where $\sigma_i$ is the state vector for the ith column.

Let $\Sigma$ be the set of all possible state vectors for a column. Construct a graph
$G = G(\Sigma)$ with vertex set $\Sigma$, and with two vertices $\sigma$ and $\sigma'$ adjacent if there is a
covering such that there are consecutive columns with states $\sigma$ and $\sigma'$. The states
that can be taken by the first column form a subset, S say, of V(G) and the states
available to the last column form a subset F, say. The number of possible
coverings of our $m \times n$ chessboard can now be shown to equal the number of
walks of length n in G which start at a vertex in S and finish at a vertex in F. (Our
terminology here follows that used in section 5 of chapter 31 by Godsil.)

If A is the adjacency matrix of G and we denote the characteristic vectors of
the sets S and F by $\chi(S)$ and $\chi(F)$ respectively then the required number of walks
is


$$\chi(S)^T A^n \chi(F).$$


Using the theory of the spectral decomposition of a symmetric matrix we may
write


$$A^n = \sum_{\theta} \theta^n Z_\theta,$$


where $\theta$ ranges over the distinct eigenvalues of A and $Z_\theta$ is the matrix
representing orthogonal projection onto the eigenspace associated to $\theta$. Denote
the largest eigenvalue of A by $\theta_1$. Then by the Perron-Frobenius theory we know
that if G is connected then $\theta_1$ is simple, and for any other eigenvalue $\theta$, we have
$|\theta| < \theta_1$. It follows that


$$\frac{\chi(S)^T A^n \chi(F)}{\theta_1^n} \to \chi(S)^T Z_{\theta_1} \chi(F)$$


and hence that


$$\bigl(\chi(S)^T A^n \chi(F)\bigr)^{1/n} \to \theta_1 \qquad (5.1)$$


as n tends to infinity. 
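
The column-by-column counting argument can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the encoding described above: instead of recording a U/D/L/R state for every square, each column is summarised by a bitmask "profile" marking the squares already covered by a domino protruding from the previous column. The names `column_transitions` and `count_domino_coverings` are ours. The dictionary `trans` plays the role of the adjacency matrix A, and the sets S and F both consist of the empty profile.

```python
def column_transitions(m):
    """Adjacency structure of the profile graph for a board with m rows.

    trans[mask][nxt] = 1 when a column whose pre-filled squares are given by
    the bitmask `mask` can be completed so that exactly the squares in `nxt`
    protrude into the following column.
    """
    trans = {}
    for mask in range(1 << m):
        counts = {}

        def fill(row, nxt):
            if row == m:                      # column completely covered
                counts[nxt] = counts.get(nxt, 0) + 1
                return
            if mask & (1 << row):             # square already covered
                fill(row + 1, nxt)
                return
            # horizontal domino: covers this square and one in the next column
            fill(row + 1, nxt | (1 << row))
            # vertical domino: needs the square below to be free as well
            if row + 1 < m and not mask & (1 << (row + 1)):
                fill(row + 2, nxt)

        fill(0, 0)
        trans[mask] = counts
    return trans


def count_domino_coverings(m, n):
    """Number of domino coverings of an m x n board (n columns)."""
    trans = column_transitions(m)
    vec = {0: 1}                              # characteristic vector of S = {0}
    for _ in range(n):                        # one matrix-vector product per column
        new = {}
        for mask, ways in vec.items():
            for nxt, c in trans[mask].items():
                new[nxt] = new.get(nxt, 0) + ways * c
        vec = new
    return vec.get(0, 0)                      # pair with F = {0}: nothing protrudes
```

For instance, `count_domino_coverings(2, n)` runs through the Fibonacci numbers 1, 2, 3, 5, 8 for n = 1, ..., 5, and `count_domino_coverings(8, 8)` evaluates to 12988816. The bitmask profile keeps the state space at $2^m$ vertices, which is why the method is practical only when one side of the board is short.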

The number of domino coverings of our chessboard can be expressed as the 
number of perfect matchings in a graph, H say. The vertices of H are the squares
of the chessboard, and two squares are adjacent in H if and only if they are
adjacent on the board. Any domino covering gives a perfect matching in the
graph. A generalisation of the original problem can now be obtained as follows.
Assign a weight to each edge of H and define the weight of a matching to be the
product of the weights of the edges it uses. Instead of simply computing the
number of perfect matchings in H, we may determine the sum of the weights of
all perfect matchings. The weights we use may be variables, in which case the sum
will be a polynomial. (For example we might assign a weight $\alpha$ to each edge of H
joining two squares in the same column of the board, and a weight $\beta$ to the
remaining edges. The sum will then be a homogeneous polynomial in $\alpha$ and $\beta$.)

In particular, the partition function for the Ising problem itself can be expressed 
in terms of the number of perfect matchings in an edge-weighted planar graph. 
(See Appendix E in Thompson 1972.) If we then seek to determine this partition 
function by a transfer matrix argument, we will find it expressed in the form


$$Z = u^T A^n v$$


for a suitable matrix A and vectors u and v. A statistical physicist would then be
concerned with properties of the limit


$$(u^T A^n v)^{1/n}$$


as n tends to infinity. From the discussion above of the domino problem, we may
see that this quantity may be expressed as the largest eigenvalue of a symmetric
matrix. Alternatively, we could use the generating function


$$\sum_{n=0}^{\infty} x^n \, u^T A^n v;$$


the largest eigenvalue of A is, in general, the reciprocal of the radius of
convergence of this power series.

To close this section, we remark that a solution to the chessboard problem will
be found in section 4 of Lovász (1979). (However, it uses Pfaffians rather than
transfer matrices. Pfaffians are discussed briefly in section 5 of chapter 31 by
Godsil.) A more leisurely introduction to the method of transfer matrices may be
found in Percus (1971) and Stanley (1986). A number of applications of this
method can be found in Baxter (1982). 


6. Duality, stars and triangles 


In this section we shall illustrate by example a technique which has been 
frequently used to resolve (combinatorial) problems of physics. The partition 
function for the Ising model has two interesting invariance properties. First, if G 
is a plane graph with dual G* then their rank polynomials are related by 


R(G; x, y) = R(G*; y, x). 
A proof of this will be found in chapter 9 by Welsh. If 


P 
+] 1/2 
a= (EO) 
lio | 


then 


RG: -1)=R(G; | z ) R(o*; is 2 1) 
bay Ween 2 wd Pg taaapre e 


Recalling the relation of the rank polynomial to the partition function described 
in section 2, this leads to a relation between the partition function for the Ising
model on the graph G, expressed in terms of $\lambda$, and the partition function of G*,
expressed in terms of $[(\lambda^2 + 1)/(\lambda^2 - 1)]^{1/2}$. If G is the infinite square lattice then
G* is isomorphic to G and the duality relation becomes an invariance condition.
Physically, this can be viewed as a relation between the values of the partition
function at high temperatures and low temperatures. For more details, see ...
or Baxter (1982).

To describe the second invariance property, we need to consider a generalisation of the
partition function


$$Z(G) = \sum_{\sigma} \exp\Bigl( \sum_{ij \in E(G)} \beta_{ij} \sigma_i \sigma_j \Bigr).$$


Recall from section 2 that in the case of uniform interactions $\beta = J/kT$; here we
allow the interactions J to vary, so that $\beta$ also varies.

By way of example, if G is the square lattice we might have $\beta_{ij} = K$ for all
horizontal edges and $\beta_{ij} = L$ for all vertical edges. If e = ij and $\lambda_e = \exp \beta_{ij}$, then
we find in place of (2.6) that


$$Z(G) = (\lambda_e - \lambda_e^{-1}) Z(G/e) + \lambda_e^{-1} Z(G \setminus e).$$


Consider the star S and triangle T, with weights as indicated in fig. 6.1.

Suppose that in both cases the vertices 1, 2 and 3 are assigned states $\sigma_1$, $\sigma_2$ and
$\sigma_3$. Suppose that the vertices 1, 2 and 3 in S are part of some larger graph G.
Then Z(G) is a sum over all possible state assignments of its vertices. For a given
state, the corresponding term in the sum is the product of a contribution from the
edges of S, and one from the edges not in S. The contribution from S is


$$\exp[\sigma_0 (L_1 \sigma_1 + L_2 \sigma_2 + L_3 \sigma_3)].$$


We divide the possible states into pairs, where members of the same pair differ
only in the value of $\sigma_0$. Thus we may write Z(G) in the form


$$\sum_{\sigma} 2\cosh(L_1 \sigma_1 + L_2 \sigma_2 + L_3 \sigma_3) f(\sigma), \qquad (6.1)$$


where $f(\sigma)$ is the contribution of the edges not in S, given the values of $\sigma$ on the
vertices of $G \setminus 0$.


Now suppose that we alter G by deleting the vertex 0 and the three edges of S,
replacing them with the three edges of T. The new graph, which we denote by G',
thus has one less vertex than G. Its partition function can be written in the form


$$\sum_{\sigma} \exp(K_1 \sigma_2 \sigma_3 + K_2 \sigma_3 \sigma_1 + K_3 \sigma_1 \sigma_2) f(\sigma). \qquad (6.2)$$


Then the surprise is that, given $L_1$, $L_2$ and $L_3$, it is possible to choose $K_1$, $K_2$ and
$K_3$ so that


$$2\cosh(L_1 \sigma_1 + L_2 \sigma_2 + L_3 \sigma_3) = R \exp(K_1 \sigma_2 \sigma_3 + K_2 \sigma_3 \sigma_1 + K_3 \sigma_1 \sigma_2)$$


Figure 6.1. 




and then Z(G) = RZ(G'). To achieve this we need


$$2\cosh(L_1 + L_2 + L_3) = R \exp(K_1 + K_2 + K_3),$$
$$2\cosh(-L_1 + L_2 + L_3) = R \exp(K_1 - K_2 - K_3),$$
$$2\cosh(L_1 - L_2 + L_3) = R \exp(-K_1 + K_2 - K_3),$$
$$2\cosh(L_1 + L_2 - L_3) = R \exp(-K_1 - K_2 + K_3).$$


Denote the four terms on the left by $c$, $c_1$, $c_2$ and $c_3$ respectively. Then
multiplying these four equations together yields that


$$R^4 = c \, c_1 c_2 c_3$$
is a necessary condition. Further manipulations yield 

$$\sinh(2K_1)\sinh(2L_1) = \sinh(2K_2)\sinh(2L_2) = \sinh(2K_3)\sinh(2L_3) = k^{-1},$$

where


$$k^{-1} = \frac{2 \sinh(2L_1) \sinh(2L_2) \sinh(2L_3)}{(c \, c_1 c_2 c_3)^{1/2}},$$


as a second necessary condition. If the values of R and $K_i$ are as given by the last
three equations then Z(G) = RZ(G’). (For help with the missing details, see 
chapter 6 of Baxter 1982.) 
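
The star-triangle relation is easy to check numerically. The sketch below is our own illustration (the function names are hypothetical): it solves the four equations for R and the $K_i$ by taking logarithms, which gives $R^4 = c\,c_1c_2c_3$ and $e^{4K_1} = c\,c_1/(c_2c_3)$ (and similarly for $K_2$, $K_3$), and then compares the two sides of the identity for all eight assignments of the spins on vertices 1, 2 and 3. Real values of the $L_i$ are assumed.

```python
import math
from itertools import product


def star_to_triangle(L1, L2, L3):
    """Return (R, (K1, K2, K3)) solving the four star-triangle equations."""
    c = 2 * math.cosh(L1 + L2 + L3)
    c1 = 2 * math.cosh(-L1 + L2 + L3)
    c2 = 2 * math.cosh(L1 - L2 + L3)
    c3 = 2 * math.cosh(L1 + L2 - L3)
    R = (c * c1 * c2 * c3) ** 0.25
    # Dividing pairs of equations isolates exp(4 K_i).
    K1 = 0.25 * math.log(c * c1 / (c2 * c3))
    K2 = 0.25 * math.log(c * c2 / (c1 * c3))
    K3 = 0.25 * math.log(c * c3 / (c1 * c2))
    return R, (K1, K2, K3)


def max_star_triangle_error(L, R, K):
    """Largest gap between the star weight and R times the triangle weight."""
    worst = 0.0
    for s1, s2, s3 in product((-1, 1), repeat=3):
        star = 2 * math.cosh(L[0] * s1 + L[1] * s2 + L[2] * s3)
        tri = R * math.exp(K[0] * s2 * s3 + K[1] * s3 * s1 + K[2] * s1 * s2)
        worst = max(worst, abs(star - tri))
    return worst


L = (0.3, 0.5, 0.7)
R, K = star_to_triangle(*L)
error = max_star_triangle_error(L, R, K)
# The three products sinh(2 K_i) sinh(2 L_i) should share one common value.
products = [math.sinh(2 * K[i]) * math.sinh(2 * L[i]) for i in range(3)]
```

With these definitions `error` is at the level of floating-point rounding, and the three entries of `products` agree with one another and with $2\sinh(2L_1)\sinh(2L_2)\sinh(2L_3)/R^2$.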

If G is the hexagonal lattice then it can be transformed into a triangular lattice 
by repeatedly replacing stars by triangles. (The hexagonal lattice is bipartite; 
replace all the stars centered on vertices in one of the two colour classes.) This 
gives us a relation between an Ising model on the hexagonal lattice and one on 
the triangular lattice. We obtain a second, independent, relation by recognising 
that the hexagonal lattice is the planar dual of the triangular lattice. Composing 
these relations yields an expression for the partition function of an Ising model at 
low temperature in terms of a partition function at high temperature, for both the 
triangular and hexagonal lattices. This is an analogue of the relation obtained for 
the square lattice above. (Again, see Baxter 1982 for more detail.)

Other applications of this star-triangle transformation, which is really a special
instance of planar duality theory, have been in percolation theory, to the Potts
model and to the six and eight vertex ice models. More details may be found in
Temperley (1981).


7. Ground states of spin glasses 


We will now turn to a different application of combinatorics in statistical physics 
and outline how some questions about spin glasses can be answered by employing 
the mathematical machinery of combinatorial optimization. We concentrate on 
showing that the problem of determining ground states of spin glasses can be 
viewed as a so-called maximum cut problem in graphs.

The study of order-disorder phenomena is a flourishing branch in today's
physics. One of the most successful attempts to understand disorderly systems has
been the study of spin glasses. They occupy a central position in this area. The
composition of a spin glass is unremarkable (perhaps a few iron atoms scattered
in a lattice of copper atoms), but its magnetic properties are confoundedly
complicated and sometimes tantalizingly unpredictable. "Spin" is the
quantum-mechanical spin from which magnetic effects arise; "glass" refers to the disorder
in the orientations and interactions of the spins. For an introduction to the
general theory of spin glass models see Mézard et al. (1987).

Physicists have developed a number of theories to model spin glasses and 
explain their behaviour. Some of these theories predict contradicting phenomena. 
These phenomena occur in situations which are hard to realize experimentally. In 
order to test the theories and guide the design of experiments, researchers have 
designed computer models to simulate the behaviour of spin glasses and then 
observe which phenomena occur. Some aspects studied in these models lead to 
optimization problems. The papers by Toulouse (1977), Bieche et al. (1980), and 
Barahona et al. (1982) have pioneered the study of spin glasses from an 
optimization point of view and pointed out the close links of the ground state 
problem to interesting models in combinatorial optimization. 

A spin glass is an alloy of magnetic impurities diluted in a non-magnetic 
material. Alloys that show spin glass behaviour are, for instance, CuMn; the
metallic crystal AuFe; the insulator EuSrS; the amorphous metal GdAl. One 
characteristic of spin glasses is a peak in magnetic susceptibility at a certain 
temperature. This peak indicates a phase transition. Another phase transition 
may take place at very low temperature. However, it is an open question at 
present whether, or under what conditions on the spin glass, such a phase 
transition occurs, and what order phenomena show up at low temperature. 

We will now present a mathematical model of spin glasses. We assume a given 
spin glass that contains n magnetic impurities (atoms). Each magnetic atom i has 
a magnetic orientation (spin) which is represented by a 3-dimensional unit-length
vector $S_i$. Between each pair i, j of magnetic atoms there is an interaction $J_{ij}$ that
depends on the non-magnetic material and on the distance $r_{ij}$ between the atoms.
Several proposals in the literature model this interaction. One common feature of 
many of these models is that the absolute value of the interaction decreases 
rapidly with distance and that small changes of distance may result in a change of 
the sign of the interaction. One example of such an interaction function which is 
used frequently is 


$$J_{ij} = J(r_{ij}) = A \, \frac{\cos(D r_{ij})}{B^3 r_{ij}^3},$$


where A, B, and D are material-dependent constants. In another model some
number J is chosen and the interactions have to satisfy


$$J_{ij} \in \{0, +J, -J\}.$$

If atoms i and j have spins $S_i$ and $S_j$, the energy interaction between i and j is
given by


$$H_{ij} = J_{ij} \, S_i \cdot S_j,$$

where $S_i \cdot S_j$ denotes the Euclidean inner product. Given a spin configuration or
state, $\omega$, the energy of the whole system is given by the Hamiltonian


$$H(\omega) = -\sum_{i=1}^{n} \sum_{j=i+1}^{n} J_{ij} \, S_i \cdot S_j - h \sum_{i=1}^{n} S_i \cdot F,$$

where a unit length vector $F \in \mathbb{R}^3$ represents the orientation of an exterior
magnetic field and h represents the strength of this field.

The study of this Hamiltonian is a major issue in statistical physics. Its difficulty 
has led to considering various simplifications. One such simplification is to replace 
the 3-dimensional vectors $S_i$ and the magnetic field F by 1-dimensional vectors $\sigma_i$,
respectively f, with values +1 or −1 (called "Ising spins"), meaning magnetic
north pole "up" and magnetic north pole "down". Such a representation is called
the Ising model of spin glasses; see section 2 of this chapter for a general
introduction to this model and some of its properties. There are, in fact,
substances which show an up/down behaviour and for which the Ising model
seems to be the "correct" model and not just a simplification.

Models that consider interactions between all pairs of impurities were intro- 
duced by Sherrington and Kirkpatrick and are called long range models. A 
number of models consider only interactions between ‘‘close” impurities (so- 
called nearest neighbour interactions), and set to zero the interactions between 
impurities that are far apart. These models, introduced by Edwards and Ander- 
son, are called short range models. Many physicists consider short range models 
more realistic (see Young 1984). Moreover, a number of substances show short 
range interactions only: next-neighbour and second-next-neighbour, say. 

It is customary to make further simplifications and to consider the spins 
regularly distributed, say on a 2- or 3-dimensional grid (that is square or cubic 
lattice). In a typical short range model of such a given structure, interactions are 
non-zero only along edges of the grid, so, for instance, in two-space, an impurity 
interacts only with (at most) four other impurities, its neighbours in the grid 
graph. Two grid models of this type have been studied intensively: the Gaussian 
model, where the interactions are chosen from a Gaussian distribution, and the 
±J model, where the interactions between impurities take only the values +J or
−J, J a fixed positive number, according to some probability distribution. In a
real spin glass (an alloy), the magnetic impurities are randomly distributed. Note 
that in the models just introduced, the spins are regularly distributed in a grid, but 
the interaction values are considered random. 

The partition function of the Ising model has been introduced in section 2. For
our purposes, it is useful to write it in the following way. Let $\Omega$ be the set of all
possible configurations of Ising spins on a grid, say. So $|\Omega| = 2^n$, if there are n
spins. Then the behaviour of the system at temperature T is believed to be
described by the magnetic partition function


$$Z(T) = \sum_{\omega \in \Omega} \exp\bigl(-H(\omega)/kT\bigr),$$


where k is the Boltzmann constant. As mentioned in section 2, analytic 
expressions of the partition function, in general, are not known. 

At 0 K, the spin glass system attains a minimum energy configuration. Such a
configuration is called a ground state. A ground state can be found by minimizing 
the Hamiltonian associated with the system. We will now present the reduction of 
the problem of finding a minimum energy spin configuration in the Ising model to 
a max-cut problem in graphs. 

Suppose we have magnetic impurities 1, 2, ..., n and an exterior magnetic
field, 0. We set V = {0, 1, ..., n} and consider V as the vertex set of a graph
G = (V, E). For a pair i, j of impurities, G contains an edge ij if the interaction $J_{ij}$
between i and j is non-zero. An edge 0i links every impurity i, $1 \le i \le n$, to the
magnetic field 0. Let us call G the interaction graph of the spin system. An Ising
spin $\sigma_i \in \{-1, +1\}$ is associated with each impurity. The Ising spin $\sigma_0$ of the
exterior magnetic field can be set to +1 without loss of generality. Let h be the
strength of the magnetic field and set $J_{0i} = h$ for i = 1, ..., n; then we can write
the Hamiltonian of this model as a quadratic function in ±1-variables in the
following way:


$$H(\omega) = -\sum_{ij \in E,\; i,j \ne 0} J_{ij} \sigma_i \sigma_j - h \sum_{i=1}^{n} \sigma_i = -\sum_{ij \in E} J_{ij} \sigma_i \sigma_j.$$

Each spin configuration $\omega$ corresponds to a partition of V into $V^+$ and $V^-$, where
$V^+ = \{i \in V \mid \sigma_i = +1\}$ and $V^- = \{i \in V \mid \sigma_i = -1\}$. So we can write the energy of
the state $\omega$ in the form


$$H(\omega) = -\sum_{ij \in E(V^+)} J_{ij} - \sum_{ij \in E(V^-)} J_{ij} + \sum_{ij \in \delta(V^+)} J_{ij}.$$

Recall that, for each subset W of V, $E(W) = \{ij \in E \mid i, j \in W\}$ and
$\delta(W) = \{ij \in E \mid i \in W,\ j \in V \setminus W\}$ and that the edge sets of type $\delta(W)$ are called cuts. Setting
$C = \sum_{ij \in E} J_{ij}$, we see that


$$H(\omega) + C = 2 \sum_{ij \in \delta(V^+)} J_{ij},$$


and defining $c_{ij} = -J_{ij}$ for all $ij \in E$, we find that the problem of minimizing H is
equivalent to maximizing


$$c(\delta(V^+)) = \sum_{ij \in \delta(V^+)} c_{ij}$$

over all $V^+ \subseteq V$. The problem of finding, given a graph with edge weights, a cut
$\delta(W)$ such that the sum of weights of the edges of $\delta(W)$ is as large as possible is
known as the max-cut problem. Thus finding a ground state in the Ising model of 
a spin glass is equivalent to finding an optimum solution of the corresponding 
max-cut problem. 
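
The reduction can be checked by brute force on a tiny instance. The following Python sketch is our own illustration: the function names and the example interaction values are hypothetical, and the enumeration over all $2^n$ configurations is only feasible for very small n. Vertex 0 stands for the exterior field, with its spin fixed at +1 and the field strength stored on the edges 0i.

```python
from itertools import product


def energy(J, sigma):
    """H = -(sum over edges ij of J_ij * sigma_i * sigma_j)."""
    return -sum(w * sigma[i] * sigma[j] for (i, j), w in J.items())


def cut_weight(J, sigma):
    """Sum of J_ij over the cut separating the +1 spins from the -1 spins."""
    return sum(w for (i, j), w in J.items() if sigma[i] != sigma[j])


def configurations(n):
    """All spin assignments on vertices 0..n with sigma_0 fixed to +1."""
    for spins in product((-1, 1), repeat=n):
        yield (1,) + spins


def ground_state_via_max_cut(J, n):
    """Maximise the cut weight under c_ij = -J_ij; return (H, sigma)."""
    best = max(configurations(n), key=lambda s: -cut_weight(J, s))
    return energy(J, best), best


# Hypothetical example: a 2 x 2 grid of spins 1..4 with +/-J couplings
# (J = 1) and a field of strength h = 0.5 attached through vertex 0.
J = {(1, 2): 1, (2, 4): -1, (3, 4): 1, (1, 3): 1,
     (0, 1): 0.5, (0, 2): 0.5, (0, 3): 0.5, (0, 4): 0.5}
C = sum(J.values())
identity_holds = all(
    abs(energy(J, s) + C - 2 * cut_weight(J, s)) < 1e-12
    for s in configurations(4)
)
ground_energy, ground_sigma = ground_state_via_max_cut(J, 4)
```

Here `identity_holds` confirms the relation between the energy and the cut weight for every configuration, and the energy returned by `ground_state_via_max_cut` coincides with the minimum of `energy` over all configurations.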

To determine ground states of spin glasses or, equivalently, cuts of maximum 
weight, physicists have introduced the so-called simulated annealing method. This 
is an algorithmic analogue of standard techniques in the material sciences where, 
for instance, large (and perfect) crystals are grown by using a careful scheme of 
cooling and heating the material to temperatures very close below and above the
critical temperature where the liquid freezes into an ordered array of atoms, the
crystal. This method was formulated as a general heuristic for the solution of
arbitrary combinatorial optimization problems, see, for example, Kirkpatrick et
al. (1983), and usually turned out to be a reasonable, though slow, approximation
algorithm, see Johnson et al. (1989, 1991).
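
A minimal sketch of such an annealing heuristic for the Ising ground state problem is given below. It is an illustration under our own assumptions and is not taken from the papers cited: single spin flips are accepted by the Metropolis rule, the temperature follows a geometric cooling schedule, and the parameter names and default values (`steps`, `t_start`, `t_end`, `seed`) are hypothetical. Vertex 0 again represents the exterior field and is never flipped.

```python
import math
import random


def energy(J, sigma):
    """H = -(sum over edges ij of J_ij * sigma_i * sigma_j)."""
    return -sum(w * sigma[i] * sigma[j] for (i, j), w in J.items())


def simulated_annealing(J, n, steps=20000, t_start=2.0, t_end=0.05, seed=0):
    """Heuristic search for a low-energy state; returns (best_H, best_sigma)."""
    rng = random.Random(seed)
    neighbours = {i: [] for i in range(n + 1)}
    for (i, j), w in J.items():
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))
    sigma = [1] + [rng.choice((-1, 1)) for _ in range(n)]
    H = energy(J, sigma)
    best_H, best_sigma = H, list(sigma)
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        T = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randint(1, n)
        # Flipping sigma_i changes H by 2 * sigma_i * sum_j J_ij * sigma_j.
        delta = 2 * sigma[i] * sum(w * sigma[j] for j, w in neighbours[i])
        if delta <= 0 or rng.random() < math.exp(-delta / T):
            sigma[i] = -sigma[i]
            H += delta
            if H < best_H:
                best_H, best_sigma = H, list(sigma)
    return best_H, best_sigma
```

The value returned is the energy of an actual spin configuration, so it is always an upper bound on the ground state energy; nothing in the procedure certifies that the bound is attained.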

It soon became clear that, by simulated annealing, states of low energy can be 
reached but that there is no guarantee or proof that a true ground state can be
found. Thus more sophisticated combinatorial methods came into play that we
briefly want to mention. More detailed and thorough surveys of these aspects with
large lists of references are Barahona et al. (1988), Grötschel et al. (1987).

From the complexity point of view (cf. chapter 29 by Shmoys and Tardos) it 
turned out that the max-cut problem is NP-hard for general graphs, and so is the
spin glass problem. But much more restricted spin systems turned out to lead to
hard optimization models. For instance, finding a ground state is NP-hard even if
the interaction graph of the spin system is a 3-dimensional grid, or a 3-dimensional
grid with just two layers, or even a planar grid with an external magnetic field,
provided the interactions are taken from {−J, 0, J}.

On the other hand, if the interaction graph of the spin system is a planar graph 
and there is no external field then using the duality theory of planar graphs one
can transform the associated max-cut problem to a so-called Chinese postman
problem. This problem can be solved by the algorithm of Edmonds and Johnson
(1973) which combines a series of shortest path calculations and an application of
the matching algorithm in an ingenious way. Barahona (1983) extended this to
graphs not contractible to the complete graph $K_5$. Using the Edmonds-Johnson
algorithm, ground states of large planar spin systems can be (and frequently have
been) calculated easily.

The most interesting open questions about spin glasses occur, however, in three 
dimensions or when an external magnetic field is involved. To solve such
(NP-hard) instances various enumeration techniques (e.g., the transfer matrix 
method) have been designed. The most successful approach seems to be the use 
of cutting plane algorithms (cf. chapter 37) that are based on an intensive study of 
the so-called cut polytope. This approach is called polyhedral combinatorics and is 
explained in chapter 30. With these linear programming based cutting plane 
algorithms, spin systems in three dimensions or two dimensions with magnetic 
field can be treated that have well over a thousand spins. Algorithms of this type
terminate with an optimality guarantee, that is, true ground states can be found;
and they have further desirable features. But, of course, no polynomial running
time guarantee can be given. For more information, see Barahona et al. (1988)
and Grötschel et al. (1987), in particular, for a list of open problems in physics
that may reach a better level of understanding by a systematic and intensive use of
the combinatorial methods outlined above. A collection of papers and surveys on
various aspects of spin glasses (including the ones discussed here) is Van Hemmen
and Morgenstern (1987).


8. Conclusion 


This has been just a glimpse of a fascinating but very difficult area of research. 
For example there is no mention of the important topic of cluster expansions. 
Details of this and many other links between combinatorics and physics may be 
found in the articles of Kasteleyn (1967) and Temperley (1979a). Almost all the 
problems discussed are probably hard in the sense of computational complexity,
except for the very restricted cases. The role of planarity appears to be significant 
in making a problem easier, see for example Kasteleyn (1961). For more on the 
complexity of these physical problems see the monograph of Welsh (1993). 

As far as solution is concerned, apart from the approach suggested in section 7, 
the most significant theoretical advance would appear to be the results of Jerrum 
and Sinclair (1993) that the monomer-dimer problem and the ferromagnetic 
version of the Ising model have a fully polynomial randomised approximation 
scheme. Put more loosely, this says that there are fast (in the sense of polynomial 
time) good Monte Carlo methods for these problems. Whether such schemes exist 
for the other problems discussed in the chapter is an important but as yet 
unanswered question.


1. General introduction


The science of chemistry concerns itself basically with the study of molecules, 
imperceptibly small architectural edifices of atoms held together by chemical
bonds. There are several different types of bond ranging from the very strong to 
the very weak. Much of the work of chemists is involved with the making or 
breaking of chemical bonds in molecular structures. Whenever such bonds are 
formed or eliminated, the molecule in question is said to undergo a chemical 
transformation. To an ever increasing extent, chemists are finding it convenient to 
represent both molecules themselves and the various transformations they 
undergo by means of chemical graphs. In the case of molecules, the graphs are 
commonly referred to as molecular graphs; those used to represent chemical 
transformations are called reaction graphs. Because all chemical systems, includ- 
ing individual molecules, possess both metric (geometric) and nonmetric (topo- 
logical) attributes, graphs alone cannot reflect the totality of their chemical 
behavior. However, graphs are especially useful for characterizing the nonmetric 
attributes of molecules, which are today generally recognized as being of at least 
equal importance to the metric ones. 

The exploitation of graphs in a chemical context brings in its wake a number 
of interesting consequences, some of which are worthy of brief consideration 
here. 

(i) Certain of the results obtained by graph theorists and combinatorialists are 
directly applicable to the solution of chemical problems. An example of this is 
provided by the extensive use of Pólya's enumeration theorem for the enumeration
of many different types of chemical isomers (see section 3).

(ii) Specific problems encountered by chemists who make use of graphs may 
present new challenges to the mathematical community. As an example, we cite 
the need to examine and classify the more than 150 different scalar numerical 
graph invariants that chemists are currently employing to characterize chemical 
species. 

(iii) There exists the possibility that solutions to chemical problems found by 
chemists may uncover new mathematical knowledge. Here our example is 
provided by the proof of the so-called pairing theorem in a chemical context (see 
section 5). 

(iv) The increasing use of graph-theoretical and combinatorial methods in 
chemistry means that, as more chemists become aware of the power and elegance 
of these methods, further research into chemical applications will be stimulated 
and make chemistry into an increasingly important proving ground for such 
methods. We mention in passing that there are already more applications in 
chemistry than in any other scientific discipline. 

(v) To cope with the mounting number of publications in the areas of chemical 
graph theory and chemical combinatorics, new outlets will probably be needed as 
the existing journals become overloaded. The pressure of publications has so far 
spawned two new interdisciplinary journals to keep pace with papers being 
generated in this area. The journals are Mathematical Chemistry (known affectionately
as Match) founded in 1975, and the Journal of Mathematical Chemistry
founded in 1987.

A key question pertaining to the use of graphs in the chemical domain is 
whether they contain sufficient information to function as effective descriptors of 
chemical systems. Most of the graphs used by chemists to date have been planar, 
loopless and nondirected. The question ultimately resolves itself into whether the 
graph connectivity (usually referred to by chemists as the graph topology) of such 
graphs yields enough insights to make possible prediction of the behavior of 
systems of chemical interest. This issue has been a recurring and controversial 
one. Workers in chemical graph theory and chemical combinatorics have consistently
maintained that graphs are highly relevant in chemistry (Trinajstić 1983).
Others have contended that graphs have a role to play only in fringe chemical 
problems. What cannot be doubted, however, is the remarkable upsurge in the 
number of publications in the field in recent years. The decades of the 1970s and 
1980s witnessed a steady increase of around 25% per annum in the output of 
papers, and this trend seems to be continuing into the 1990s. Among this wealth 
of material are to be found comprehensive review articles (Rouvray 1974, 
Balaban 1976, Rouvray and Balaban 1979, Balasubramanian 1985, Balaban 
1985), conference proceedings (King 1983, King and Rouvray 1987, Klein and
Randić 1990), and specialist monographs (Balaban 1976, Trinajstić 1983, Kennedy
and Quintas 1988, Merrifield and Simmons 1989, Johnson and Maggiora
1990, Rouvray 1990b, Bonchev and Rouvray 1991, 1992). Chemical graph theory
is clearly a burgeoning field of scientific research activity at the present time.


2. Early uses of graphs 


To understand how this came about, we trace briefly the origins of the current
interest in graphs by the chemical community. Graphs have been used in the 
physical sciences for over two centuries and some of the earliest uses were in a 
chemical setting (Rouvray 1990a). For instance, the various interactions between 
the pairs of molecules in sets of molecules undergoing chemical transformations 
known as double decompositions were represented by reaction graphs as early as 
1768. The molecules were depicted as the nodes and the interactions as the edges 
of the graph; an example of such a graph is shown in fig. 2.1(a). The first 
depictions of the structure of individual molecules were made by William Higgins 
in 1789. Here the nodes now represent the atoms and the edges the supposed 
forces holding the atoms together, as seen in fig. 2.1(b). One of the earliest 
representations of the benzene ring ($\mathrm{C}_6\mathrm{H}_6$), reproduced in fig. 2.1(c), is to be
found among the 368 graphical depictions of chemical structures by Loschmidt in 
1861. Another pioneer in the early use of graphical formulas was Crum Brown, 
who also in 1861 first drew formulas such as that shown in fig. 2.1(d). The use of 
graphs to depict the structure of molecules has been the most common application 
of graph theory to date in the chemical domain. 

It is interesting to observe here that the word graph itself is of chemical origin.

Figure 2.1. Some early graph-theoretical representations of molecules dating from the years (a) 1768,
(b) 1789, (c) 1861 and (d) 1861.


The word actually derives from the graphic notation of the chemists of the last 
century — a term that was widely used to denote the structures of molecules such 
as that shown in fig. 2.1(d) (Crum Brown 1865). The mathematician Sylvester 
first proposed that the abbreviated form graph be employed to describe these 
chemical structures (Sylvester 1878). From this point on, the word graph appears 
with increasing frequency in the mathematical literature. It did not catch on 
immediately, however, among the chemical community. Chemists began to refer 
to graphic notation as the structural formula, and this eventually evolved into the 
now more current constitutional formula. The term tree was first introduced into 
the chemical literature somewhat earlier when the mathematician Cayley published
a chemical paper describing a method for the enumeration of various
classes of molecule that constitute the members of homologous series (Cayley
1875). A homologous series is a series of related molecules in which each member
is represented by a general formula and successive members differ from each
other by a methylene ($\mathrm{CH}_2$) unit. More on the early history of graph theory and
combinatorics is to be found in chapter 44 by Biggs et al. on “The History of 
Combinatorics”. 

Determinations of the permissible structures that molecules may possess rely on 
two basic combinatorial formulas. The first is the formula giving the number of 
independent cycles or cyclomatic number of a graph, and is usually expressed as 


$$\mu(G) = m - n + 1, \tag{2.1}$$

where m and n are, respectively, the numbers of edges and vertices in the graph
G. When employed in a chemical context, $\mu(G)$ includes the multiple bonds
present in molecular species. Thus, a double bond (represented by a graph- 
theoretical 2-cycle) becomes a single cycle and a triple bond two cycles. In fact, 
the cyclomatic number can be used to define the extent of saturation in 
hydrocarbon and other molecules (Rouvray 1975). The second combinatorial 
formula used is the well-known Euler polyhedral formula: 


$$n - m + \phi = 2, \tag{2.2}$$

where $\phi$ represents the number of faces in the polyhedron. Normally only convex
polyhedra that can be mapped onto the sphere are considered. Both of these 
formulas have found extensive use in the chemical domain in tasks ranging from 
the computer generation of isomeric structures (Golender and Rozenblit 1983), 
through identification of chemically favored coordination compounds (King
1969), to assessment of the inherent rigidity in molecular polyhedral
isomerizations (King 1988).
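
As a small illustration of formula (2.1) in the chemical setting, the following Python sketch counts the independent cycles of one Kekulé structure of benzene, treating each double bond as a pair of parallel edges as described above. The atom numbering and the helper name `cyclomatic_number` are our own choices for illustration, and the graph is assumed to be connected.

```python
def cyclomatic_number(num_vertices, edges):
    """mu(G) = m - n + 1 for a connected (multi)graph, as in formula (2.1)."""
    return len(edges) - num_vertices + 1

# Benzene, C6H6, one Kekule structure: carbons are 0-5, hydrogens are 6-11.
ring = [(i, (i + 1) % 6) for i in range(6)]   # six C-C bonds around the ring
second_bonds = [(0, 1), (2, 3), (4, 5)]       # one extra edge per C=C double bond
c_h = [(i, i + 6) for i in range(6)]          # six C-H bonds
benzene_edges = ring + second_bonds + c_h

# 15 edges, 12 vertices: one ring plus three double bonds gives four cycles.
mu_benzene = cyclomatic_number(12, benzene_edges)
print(mu_benzene)
```

For a saturated acyclic hydrocarbon the molecular graph is a tree, so the same function returns zero.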


3. Isomer enumeration techniques 


Although now primarily of historical interest, graph-theoretical and combinatorial 
techniques have proven themselves indispensable in the enumeration of chemical 
isomers. This application represents the earliest use of these techniques for the
solution of computational chemical problems. An isomeric pair of molecules 
consists of two molecules of the same atomic constitution that differ in at least 
some of their properties. Chemical species can exhibit many different types of 
isomerism and there exist numerous classification schemes for isomers (Slanina 
1986). The two fundamental classes of isomers are the constitutional isomers (also 
known as structural isomers) and stereoisomers. Constitutional isomeric pairs 
differ in their atomic connectivity whereas stereoisomeric pairs differ in the 
relative positioning of their atoms in space, the different ways of positioning four 
different substituents around a tetrahedral carbon atom being a typical example. 
All isomeric species should be stable for periods of time long in comparison to 
those in which measurements of their properties are made. Isomer enumeration
yields not only isomer counts but also affords a means of estimating the number 
of stationary points on the potential energy hypersurfaces used to characterize 
molecular structures. This renders the enumeration of chemical graphs a valuable 
adjunct to theoretical studies on chemical reactivity (Slanina 1986). A more 
detailed discussion on isomers and their enumeration is to be found in the surveys 
of Rouvray (1974), Balaban (1985) and Simon (1987), and in the book of Slanina 
(1986). 

Broadly speaking, the now mature field of isomer enumeration may be viewed
as having passed through four principal phases, which have involved the
successive exploitation of (i) recursion formulas, (ii) generating functions, (iii)
the enumeration theorem of Pólya (1937), and (iv) more recent techniques, such
as the double coset method of Ruch et al. (1970) and the reformulated method of
Redfield (Redfield 1927, Davidson 1981, Lloyd 1988). The area as a whole
illustrates perhaps better than any other in chemical combinatorics the close 
interplay and mutual stimulation that has existed between the mathematical and 
chemical communities for well over a century. The net result of this long
collaboration in the enumeration of chemical isomers is that at this stage virtually 
all species of chemical interest have now been enumerated. One notable 
exception here concerns the polycyclic aromatic hydrocarbons that may be 
represented as planar hexagonal animals, i.e., planar graphs with no cut vertices 
in which every interior region is a hexagonal unit cell. No general enumerative
procedure for such graphs exists, though computations have been made for 
polyhexes containing up to sixteen hexagons (Knop et al. 1990). Over the past 
few years tables of isomer counts for various classes of molecules have begun to 
appear (Knop et al. 1985, Dias 1987). 

The earliest use of a combinatorial technique to determine isomer counts was
that of the chemist Flavitsky in 1871. He studied the homologous series of the
alcohols and enumerated the first ten members. The general formula for the
alcohols is $C_nH_{2n+1}OH$ and the members are subdivided into three classes known
as primary, secondary and tertiary alcohols. Class assignment is made on the basis
of the number of carbon atoms attached to the carbon (C) atom bearing the
hydroxyl (OH) group. For primary alcohols this number is one, for secondary
alcohols two, and for tertiary alcohols three; one typical member of each class is
illustrated in fig. 3.1. In mathematical terms the problem is equivalent to
enumerating rooted tree graphs all of whose vertices are of valence one or four.
Flavitsky made use of recursion relations to enumerate the alcohols; examples of
the kinds of relation he employed are:

$$p_n = T_{n-1} \tag{3.1}$$

and

$$s_n = \begin{cases}
T_1 T_{n-2} + T_2 T_{n-3} + \cdots + T_{(n-2)/2}\, T_{n/2} & (n \text{ even}), \\[4pt]
T_1 T_{n-2} + T_2 T_{n-3} + \cdots + T_{(n-3)/2}\, T_{(n+1)/2} + \tfrac{1}{2}\, T_{(n-1)/2} \bigl(1 + T_{(n-1)/2}\bigr) & (n \text{ odd}),
\end{cases} \tag{3.2}$$

where $p_n$, $s_n$ and $T_n$ are respectively the number of primary alcohols, the number
of secondary alcohols, and the total number of alcohols of all kinds containing n
carbon atoms. This type of approach was greatly extended in the 1930s when the
chemists Henze and Blair developed recursion formulas for the enumeration of
many different homologous series, including the alkanes, alcohols, amines, ethers
and organic acids (see Rouvray 1974). Several further elaborations of this work
have been made in recent years (see Balaban 1985).

Figure 3.1. Examples of a primary, secondary and tertiary alcohol.
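
The recursions (3.1) and (3.2) are easily put on a computer. The Python sketch below is our own illustration rather than a transcription of Flavitsky's or Henze and Blair's procedure. It assumes the convention $T_0 = 1$ (so that methanol is counted as primary, as in table 3.2) and, for the tertiary alcohols, for which no relation is quoted above, it assumes the analogous count of unordered triples of alkyl groups. The function name `alcohol_counts` is hypothetical.

```python
from math import comb

def alcohol_counts(n_max):
    """Return {n: (primary, secondary, tertiary, total)} for n = 1..n_max."""
    T = [1]  # T[0] = 1 is a bookkeeping convention (assumption, see lead-in)
    table = {}
    for n in range(1, n_max + 1):
        rest = n - 1                       # carbons other than the one bearing OH
        primary = T[rest]                  # relation (3.1): p_n = T_{n-1}
        secondary = 0                      # relation (3.2): unordered pairs of alkyl groups
        for i in range(1, rest // 2 + 1):
            j = rest - i
            secondary += T[i] * T[j] if i < j else comb(T[i] + 1, 2)
        tertiary = 0                       # assumed analogue: unordered triples
        for i in range(1, rest // 3 + 1):
            for j in range(i, (rest - i) // 2 + 1):
                k = rest - i - j
                if i == j == k:
                    tertiary += comb(T[i] + 2, 3)
                elif i == j:
                    tertiary += comb(T[i] + 1, 2) * T[k]
                elif j == k:
                    tertiary += T[i] * comb(T[j] + 1, 2)
                else:
                    tertiary += T[i] * T[j] * T[k]
        total = primary + secondary + tertiary
        T.append(total)
        table[n] = (primary, secondary, tertiary, total)
    return table

table = alcohol_counts(20)
for n in (5, 10, 20):
    print(n, table[n])
```

The rows printed for n = 5, 10 and 20 can be compared directly with the corresponding rows of table 3.2.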

Generating functions were introduced into isomer enumeration studies by the 
mathematician Cayley. In 1857 he had enumerated rooted trees by means of a 
generating function of the form: 


$$(1 - x)^{-1} (1 - x^2)^{-A_1} (1 - x^3)^{-A_2} \cdots = 1 + A_1 x + A_2 x^2 + A_3 x^3 + \cdots, \tag{3.3}$$

where x is a variable, n the number of vertices, and $A_{n-1}$ is the coefficient of $x^{n-1}$,
which gives the number of rooted trees on n vertices. In 1874 Cayley adapted this
function to the enumeration of unrooted trees; such trees are equivalent to
members of the alkane homologous series, $C_nH_{2n+2}$. By 1875 he had published a
long paper on enumerating trees of maximal valence four (alkanes), three and two 
(Cayley 1875). Cayley succeeded in enumerating up to the thirteenth member of 
the alkane series, though two of his results were later shown to be in error. 
Modern isomer counts for alkane molecules containing up to 50 carbon atoms and 
also for alcohol molecules containing up to 40 carbon atoms based on the 
compilations of Knop et al. (1985) are presented in tables 3.1 and 3.2 respective- 
ly. 
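
To indicate how a generating function of the product type used by Cayley for rooted trees determines its own coefficients, we give a short Python sketch. It assumes the product has the form $\prod_{k \ge 1} (1 - x^k)^{-A_{k-1}}$ with $A_0 = 1$, and it exploits the fact that the factor in $x^k$ cannot alter coefficients below $x^k$, so each $A_{k-1}$ is already final when it is needed as an exponent. The function name and the truncation order are our own choices.

```python
from math import comb

def rooted_tree_counts(n_max):
    """Return [A_0, ..., A_n_max]; A_r counts rooted trees with r edges (r + 1 vertices)."""
    series = [1] + [0] * n_max            # running product, truncated at x^n_max
    for k in range(1, n_max + 1):
        a = series[k - 1]                 # A_{k-1}, fixed by the factors already applied
        # Multiply by (1 - x^k)^(-a) = sum over j of C(a + j - 1, j) x^(k j).
        new = [0] * (n_max + 1)
        for j in range(n_max // k + 1):
            c = comb(a + j - 1, j)
            for i in range(n_max + 1 - k * j):
                new[i + k * j] += c * series[i]
        series = new
    return series

print(rooted_tree_counts(9))
```

The list returned gives the numbers of rooted trees on 1, 2, ..., 10 vertices.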

A major step forward in the enumeration of graphs, including many graphs of 
chemical interest, was made by Pólya (1937). The key part of this work, originally
known as the Hauptsatz, is nowadays universally referred to as the enumeration
theorem of Pólya or Redfield–Pólya, to commemorate the fact that much of
Pólya’s work had been adumbrated in Redfield’s classic paper (Redfield 1927,
Lloyd 1988). Pólya’s complete paper was translated into English exactly 50 years
after its appearance (Pólya and Read 1987); by then its relevance in the chemical
world had been widely recognized. Pólya also published several follow-up papers
illustrating how his theorem could be applied to the enumeration of molecules 
(see Rouvray 1974). The enumeration theorem rests essentially on an integrated
use of symmetry classes of graphs, generating functions and weighting factors.

Table 3.1
Enumeration of members of the alkane homologous series, $C_nH_{2n+2}$

Value of n   Isomer count
 1           1
 2           1
 3           1
 4           2
 5           3
 6           5
 7           9
 8           18
 9           35
10           75
20           366319
30           4111846763
40           62481801147341
50           1117743651746953270

Table 3.2
Enumeration of members of the alcohol homologous series, $C_nH_{2n+1}OH$

Value of n   Primary            Secondary          Tertiary           Total
 1           1                  0                  0                  1
 2           1                  0                  0                  1
 3           1                  1                  0                  2
 4           2                  1                  1                  4
 5           4                  3                  1                  8
 6           8                  6                  3                  17
 7           17                 15                 7                  39
 8           39                 33                 17                 89
 9           89                 82                 40                 211
10           211                194                102                507
20           2156010            2216862            1249237            5622109
30           35866550869        37977600390        22147214029        95991365288
40           720807976831447    773973501324306    459220572506066    1954002050661819

In concise form, the theorem may be expressed by the equation

$$C(x) = |\mathcal{G}|^{-1} \sum_{(j)} g_{(j)} \prod_{k=1}^{q} \left[ f(x^k) \right]^{j_k}, \tag{3.4}$$

where C(x) is a configuration counting series, f(x) a figure counting series, $\mathcal{G}$ is a
permutation group of degree q and order g, $g_{(j)}$ is the number of elements of $\mathcal{G}$ of
type (j) with $(j) = (j_1, j_2, \ldots, j_q)$, $j_k$ the number of cycles of length k with
k = 1, 2, ..., q, and the summation extends over all partitions (j) of q subject to
the condition

$$j_1 + 2j_2 + \cdots + q j_q = q. \tag{3.5}$$


By making use of the cycle index of the permutation group $\mathcal{G}$ arising from some
enumeration problem, Pólya’s approach reduces enumeration of the possible
combinatorial configurations to enumeration of their equivalence classes. The
cycle index is defined as follows:

$$Z(\mathcal{G}) = |\mathcal{G}|^{-1} \sum_{(j)} N_{(j)} f_1^{j_1} f_2^{j_2} \cdots f_q^{j_q}, \tag{3.6}$$

where $N_{(j)}$ is the number of elements of $\mathcal{G}$ of cycle type (j) and the f terms are
indeterminates. As an example of a cycle index of chemical relevance, we
consider that for the benzene molecule, $C_6H_6$, which consists of six carbon
atoms in the form of a regular hexagon with a hydrogen atom attached to each
carbon. Its permutation group $\mathcal{G}$ is the point group $D_6$ consisting of twelve
different elements that generate the cycle index for benzene:

$$Z(C_6H_6) = \tfrac{1}{12}\left( f_1^6 + 4f_2^3 + 3f_1^2 f_2^2 + 2f_3^2 + 2f_6 \right). \tag{3.7}$$


Substitution of an appropriate generating function to replace the f indeterminates 
transforms (3.7) into a generating function for nonequivalent configurations. In 
the case of progressive replacement of the hydrogen atoms in benzene by a single, 
monovalent substituent X the appropriate function to substitute is 


$$f(x) = 1 + x. \tag{3.8}$$


Substitution of (3.8) into (3.7) yields the configuration counting series for the 
benzene molecule: 


$$C(x) = 1 + x + 3x^2 + 3x^3 + 3x^4 + x^5 + x^6. \tag{3.9}$$


The coefficients of C(x) give the numbers of substitutional isomers obtainable
when the hydrogen atoms in benzene are progressively replaced by X. The
complete set of substitutional isomers is illustrated in fig. 3.2.

Figure 3.2. The substitution products obtained upon progressively substituting a benzene ring ($C_6$)
with a monovalent substituent.
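
The benzene calculation can be checked mechanically. The Python sketch below is an illustration under our own conventions: it generates the twelve permutations of the six ring sites (six rotations and six reflections of a hexagon, sites numbered 0 to 5), replaces each cycle of length k by the factor $1 + x^k$, and averages over the group. The function names are hypothetical.

```python
def dihedral_permutations(n):
    """The 2n symmetries of a regular n-gon, as tuples p with p[i] the image of site i."""
    rotations = [tuple((i + r) % n for i in range(n)) for r in range(n)]
    reflections = [tuple((r - i) % n for i in range(n)) for r in range(n)]
    return rotations + reflections

def cycle_lengths(perm):
    """Lengths of the disjoint cycles of a permutation."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        if length:
            lengths.append(length)
    return lengths

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def substitution_series(n):
    """Coefficients of C(x) for an n-membered ring with one kind of substituent X."""
    group = dihedral_permutations(n)
    total = [0] * (n + 1)
    for perm in group:
        term = [1]
        for k in cycle_lengths(perm):
            term = poly_mul(term, [1] + [0] * (k - 1) + [1])   # the factor 1 + x^k
        total = [t + c for t, c in zip(total, term)]
    assert all(t % len(group) == 0 for t in total)             # the average is integral
    return [t // len(group) for t in total]

print(substitution_series(6))
```

The coefficients obtained for the six-membered ring are 1, 1, 3, 3, 3, 1, 1, the three isomers at $x^2$ being the familiar ortho, meta and para disubstituted benzenes.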

Pólya’s enumeration theorem has been employed to determine isomer counts
for numerous systems of chemical interest (Rouvray 1974, Balaban 1985, 
Balasubramanian 1985). Major contributions to this area have been made by 
Balaban and his co-workers (Balaban 1985). Solutions have been found for the 
enumeration among others of stereoisomers, valence isomers, cubic graphs and 
constitutional isomers. Valence isomers are a subclass of constitutional isomers for 
which the hydrogen-depleted molecular graphs have identical partitioning of their 
vertex degrees. A number of the problems encountered in this work relate to the 
well-known “necklace problem” in combinatorics; comprehensive coverage of this
area is to be found in a three-volume monograph by Balaban (1987). The
application of Pólya’s theorem to the enumeration of spectroscopic signals in
nuclear magnetic resonance studies on molecules has been discussed by
Balasubramanian (1985). In this case the number of signals is given by the number of
magnetic equivalence classes for the molecules of interest. The converse problem
to isomer enumeration, namely determination of the extent to which it is possible
to deduce molecular symmetry from the spectrum of isomer counts for a given 
chemical species, has been addressed by Hasselbarth (1987). 

Starting in the early 1970s interest began to be focused on the enumeration of 
so-called nonrigid molecular species, that is molecules which change their 
conformation or undergo other intramolecular changes during the time scale of 
experiments carried out on them. Such studies involve separate considerations of 
(i) the underlying molecular framework, and (ii) the relative positions of the 
substituents X attached to the framework. Both have given rise to some 
interesting combinatorial problems (Maruani and Serre 1983, Brocas 1986). In 
case (i) the symmetry point group appropriate for the idealized rigid framework 
will not correctly characterize the symmetry of the actual structure. It is necessary 
to employ a new group to allow for the additional symmetry elements introduced 
by the various intramolecular motions. In case (ii) the isomers formed as a result 
of substituent interchanges are referred to as permutational isomers. Whenever a 
new isomer is formed in this way, a permutational isomerization reaction is said to 
have taken place. The total possible number of such reactions for a molecule with 
n attached identical X substituents will be n!, though this number is not attained 
for molecules possessing a point symmetry. The n! isomerization reactions will be 
reduced because (i) some of the permutations will not generate new isomers as 
they represent proper rotations of the molecular point group, and (ii) all the X 
substituents are identical and distinguishable only by labels, so some of the 
isomerization reactions will necessarily be indistinguishable. 

Several different procedures have been developed to enumerate the isomers 
arising from nonrigid molecular species. The most general way, however, remains 
that of Redfield (1927). When expressed in modern terminology, his method is 
seen as bringing together in one powerful generalization the disparate strands of
various enumerative procedures introduced by Pólya and later workers down to
the present time (Davidson 1981). Redfield’s approach envisaged m sets $X_1$,
$X_2, \ldots, X_m$, each of which contained n elements arranged in horizontal n-tuples
to form m × n matrices. Those matrices that are column equivalent are said to
have the same correspondence in their elements. In all, there are $(n!)^m$ such
matrices and $(n!)^m/n!$ correspondences. The problem is to enumerate the number
of injective mappings whereas Ruch, Klemperer and Brown and co-workers (see 
Davidson 1981) used double cosets. Double cosets have been the focus of much 
interest in the chemical literature of the past two decades; there is even a review 
of their various applications in the physical sciences (Ruch and Klein 1983). 
Double cosets partition groups in terms of pairs of subgroups. If subgroup A of
the symmetric group, $S_n$, of degree n contains all the symmetry permutations
arising from rotation of a molecular framework as a whole plus the intramolecular
rotations of that framework, and subgroup $B \subset S_n$ contains all the permutations of
identical substituents X on the framework attachment sites, then the double coset
A g B will consist of the permutations:

$$A g B = \{ a g b \mid a \in A,\ b \in B \}, \qquad g \in S_n. \tag{3.10}$$


This particular formalism has been employed not only in the enumeration of 
permutational isomers but also for polytopal rearrangements and the chemical 
reactions of molecules (Brocas et al. 1984). 
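
A brute-force Python sketch may help to make the double coset construction concrete. It is an illustration only, under assumptions of our own: the framework is a rigid benzene ring with sites 0 to 5, A is taken to be the twelve site permutations of the hexagon, and B permutes ligand labels among identical ligands, labels 0 and 1 standing for two X substituents and labels 2 to 5 for hydrogen atoms. A permutation g is read as assigning ligand i to site g[i].

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation i -> p[q[i]]."""
    return tuple(p[j] for j in q)

def double_cosets(A, B, n):
    """Partition S_n into the double cosets A g B by exhaustive search."""
    remaining = set(permutations(range(n)))
    cosets = []
    while remaining:
        g = min(remaining)                        # any representative not yet covered
        coset = {compose(a, compose(g, b)) for a in A for b in B}
        cosets.append(coset)
        remaining -= coset
    return cosets

n = 6
# A: rotations and reflections of the hexagon acting on the sites (assumed framework group).
A = [tuple((i + r) % n for i in range(n)) for r in range(n)] + \
    [tuple((r - i) % n for i in range(n)) for r in range(n)]
# B: ligand relabelings that keep the two X ligands {0, 1} among themselves.
B = [p for p in permutations(range(n)) if set(p[:2]) == {0, 1}]

cosets = double_cosets(A, B, n)
print(len(cosets), sorted(len(c) for c in cosets))
```

Under these assumptions the number of double cosets equals the number of disubstituted benzenes, in agreement with the coefficient of $x^2$ in (3.9). The exhaustive search is practical only for small n, since it visits all n! permutations.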


4. The study of reaction networks 


As mentioned in our historical introduction, the earliest use of graphs in a 
chemical context was for depiction of the mutual interactions of sets of molecules. 
Such usage has become commonplace in recent decades, even though it is now 
realized that graphs or even hypergraphs are not necessarily the best mathematical
constructs to represent chemical reactions. A chemical reaction may be defined
as a sequence of elementary steps $\{\ell_i\}$ in each of which a set of molecules is
chemically transformed into some other set. A linear combination of such steps of
the general form

$$M = a_1 \ell_1 + a_2 \ell_2 + \cdots + a_k \ell_k, \tag{4.1}$$

where the $a_i$ are real coefficients, is referred to as a mechanism of the reaction. A
collection of single-step chemical reactions that are interlinked by the flow of
molecules from one reaction to another is said to constitute a reaction network.
A reacting chemical system can thus be viewed as a linear transformation from a 
k-dimensional real vector space generated by the elementary steps into an 
A-dimensional real vector space generated by the molecules (Milner 1964, Sellers 
1984). 
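
The view of a reacting system as a linear map from combinations of elementary steps to net changes in the molecules can be sketched in a few lines of Python. The three species and three steps below are entirely hypothetical and serve only to show the bookkeeping; the names `steps` and `net_change` are ours.

```python
# Hypothetical network: species A, B, C and elementary steps
#   l1: A -> B,   l2: B -> C,   l3: A -> C.
species = ["A", "B", "C"]
steps = {
    "l1": {"A": -1, "B": +1},
    "l2": {"B": -1, "C": +1},
    "l3": {"A": -1, "C": +1},
}

def net_change(mechanism):
    """Map a mechanism (coefficients on elementary steps) to the net change per species."""
    change = {s: 0 for s in species}
    for step, coeff in mechanism.items():
        for s, nu in steps[step].items():
            change[s] += coeff * nu
    return change

two_step = net_change({"l1": 1, "l2": 1})   # A -> B followed by B -> C
one_step = net_change({"l3": 1})            # A -> C directly
print(two_step, one_step)
```

Both mechanisms are sent to the same overall reaction, A going to C, which is the sense in which distinct mechanisms can account for one and the same reaction.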

A reacting system can be adequately represented by a graph only in certain 
special instances. An example of such an instance occurs when every elementary 
step in the network is an isomerization reaction. In this case, two molecules, α
and β, undergoing reaction will be governed by the chemical equation α = β.
Every species can then be represented by a graphical vertex and every reaction by 
an edge or sequence of edges of a graph. Since isomerization reactions involve 
only the molecular framework and the various transformations it undergoes, the
substituents can be neglected to a first approximation. Moreover, because the
initial and final geometries of the framework remain identical in such reactions,
the distinguishability of isomerization reactions will be determined by the
condition

$$r_1 = h\, r_2\, h^{-1}, \tag{4.2}$$

where $r_1$ and $r_2$ represent isomerization reactions carried out in a totally symmetric
environment, and $h \in \mathcal{H}$, the permutation group that acts on the indices of the
framework attachment sites, and a subgroup of $S_n$. If condition (4.2) holds, the
reactions $r_1$ and $r_2$ are indistinguishable, whereas if it does not the reactions are
distinguishable. For a molecule in a less symmetric environment, the full point 
group cannot be used in this way to define reaction distinguishability. In such 
cases this group is replaced by a subgroup defining the symmetry of the site that 
the molecule occupies (Klemperer 1972). 

The condition (4.2) is described as a conjugacy relation. Each of the conjugacy
classes generated in $S_n$ with respect to the group $\mathcal{H}$ will represent a set of
indistinguishable permutational isomerization reactions in a totally symmetric
environment, provided we neglect those classes comprised of elements of the
proper rotational symmetry group $\mathcal{P}$ for molecules in a chiral environment. The
number of conjugacy classes in $S_n$ with respect to $\mathcal{H}$ minus the number of classes
comprised of elements of $\mathcal{P}$ yields the number of formally distinguishable
reactions in a symmetric environment. Klemperer (1972) expressed this result
mathematically by developing a formula for counting the number of conjugacy
classes in $S_n$ with respect to $\mathcal{H}$. A counting polynomial was defined as:

$$C(a_1, \ldots, a_n) = \sum_{(j_1, \ldots, j_n)} A_{j_1 j_2 \ldots j_n}\, a_1^{j_1} a_2^{j_2} \cdots a_n^{j_n}, \tag{4.3}$$

where $A_{j_1 j_2 \ldots j_n}$ is the number of conjugacy classes in $S_n$ with respect to $\mathcal{H}$ of cycle
type $(j_1, j_2, \ldots, j_n)$ and the a terms are indeterminates. The number of
distinguishable reactions is given by the formula

$$\bigl(\text{number of distinguishable reactions of cycle type } (j_1, j_2, \ldots, j_n)\bigr) = A_{j_1 j_2 \ldots j_n} - N_{j_1 j_2 \ldots j_n}, \tag{4.4}$$

where $N_{j_1 j_2 \ldots j_n}$ is the number of conjugacy classes in $\mathcal{P}$ of cycle type
$(j_1, j_2, \ldots, j_n)$, $\mathcal{P}$ being the group of permutations generated by proper
rotations of the molecular framework.

As indicated, reaction networks cannot in general be adequately represented by 
graphs or 1-dimensional simplicial complexes. In this respect they differ markedly 
from electrical networks, a fact that might be anticipated in view of the 
complexity of most chemical reactions. By redefining a reaction network as a
finite set of chemicals together with a finite set of reactions such that there exists a
homomorphism $\partial: C_1 \to C_0$, where $C_0$ and $C_1$ are respectively the free abelian
groups generated by the chemicals and the reactions, Sellers (1967) was able to
establish the following result:

$$\gamma_0 - \eta_0 = \gamma_1 - \eta_1, \tag{4.5}$$

where $\gamma_0$ and $\gamma_1$ are the respective numbers of chemicals and reactions, $\eta_0$ is the
maximum number of linearly independent conservation conditions, and $\eta_1$ the
maximum number of stationary states. The algebraic complexes that can be used
to represent reaction networks in which there are no isomerization reactions have 
been discussed in detail by Sellers (1967). 

An interesting combinatorial problem that arises in the study of reaction 
mechanisms is the construction of the complete listing of the chains of steps by 
which a specified chemical reaction can proceed in a given reaction network 
(Sellers 1989). The problem is analogous to, though more difficult than, finding 
all the paths connecting a pair of vertices in a graph. A definition applicable both 
to graph paths and reaction mechanisms is that a mechanism is said to be direct 
whenever no distinct mechanism for the reaction can be formed from a subset of 
its steps. This implies that a mechanism is direct iff no cycle can be found from a 
subset of its steps. Sellers (1989) has established that any reaction mechanism can 
be decomposed into a sum of irreducible direct mechanisms. Although such a 
decomposition will not in general be unique, the set of all direct mechanisms for a 
specified reaction will be a unique characteristic of a reaction network. Because 
direct mechanisms are not in general linearly independent, any list of supposedly 
distinct linear combinations will contain repetitions. Moreover, the list will be 
infinitely long unless the set of all such mechanisms is separated into equivalence
classes. 
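
For the simpler graph analogue mentioned above, namely the listing of all paths joining two vertices, a depth-first search suffices. The Python sketch below is our own illustration on a hypothetical four-vertex graph and is not Sellers' procedure for reaction networks. A path that repeats no vertex contains no cycle and so plays the role of a direct mechanism.

```python
def simple_paths(adjacency, start, goal):
    """Yield every path from start to goal that visits no vertex twice."""
    def extend(path):
        last = path[-1]
        if last == goal:
            yield list(path)
            return
        for nxt in adjacency[last]:
            if nxt not in path:          # never revisit a vertex, so no cycle can arise
                path.append(nxt)
                yield from extend(path)
                path.pop()
    yield from extend([start])

# Hypothetical graph on the vertices A, B, C, D.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
paths = sorted(simple_paths(graph, "A", "D"))
for p in paths:
    print(" - ".join(p))
```

Four cycle-free paths join A to D in this example. The number of such paths can grow exponentially with the size of the graph, which gives some feeling for why the corresponding problem for reaction networks is harder still.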

Sellers (1989) developed a systematic way of determining all the possible direct 
mechanisms for one or more overall reactions in a reaction network based on 
concepts adapted from algebraic topology. Each possible type of mechanism was 
characterized by a convex subset of a finite-dimensional vector space. Use was
made of geometric constructions in which mechanisms were modeled by a finite
set of polygonal faces meeting at their edges, the structure as a whole being
characterized in terms of the incidence relations existing between the vertices and
edges. Direct mechanisms were obtained from the intersection of the i
hyperplanes representing cycles, i.e., closed successions of edges, or pairs of
independent mechanisms that produce the same reaction in opposite directions,
selected from a possible set of j hyperplanes representing the mechanisms. A
computer algorithm screened each of the $\binom{j}{i}$ ways of selecting the i hyperplanes.
Any selection yields a system of i linear equations and describes a direct
mechanism iff the i hyperplanes intersect at a point. The coefficients of the direct
mechanism can be determined by solution of the linear equations. An estimate of
the mechanistic complexity of a reaction based on the kinetic laws of chemical
reactions has been advanced by Bonchev et al. (1987). Moreover, a comprehensive
review and a systematic analysis of the concept of the reaction mechanism as
employed in the chemical literature has been presented by Masavetas (1988). 

Extensive use has been made of computer programs in recent years for the
study of reacting systems. For instance, plausible routes for the synthesis of
complex target molecules from comparatively simple starting structures, known as
synthons, have been devised. The concept of the synthon as a minimal fragment
of a molecular system that is necessary for some given chemical reaction to take
place was first proposed by Corey (1967). Since that time, the synthon has
become an object of mathematical investigation in its own right and various
mathematical models of synthons have been put forward. Two examples are those
of Koča based on a matrix formulation (Koča 1989) and of Kvasnička and
Pospíchal based on a graph-theoretical model (Kvasnička and Pospíchal 1990).
The latter work elaborated the concept by introducing additional concepts such as
synthon stability, subsynthons and families of isomeric synthons. The new
concepts were used in the development of a theory of synthons that closely 
reflects the reasoning of synthetic chemists. Sets of all possible precursors and 
successors of given synthons are generated from either linear or cyclic graphs of 
the relevant reaction mechanism. This means that even chemical reactions having 
complicated reaction graphs can be expressed as sequences of simple graphs of 
the reaction mechanism. 

The prediction of plausible routes for chemical syntheses is feasible only 
because the number of known reaction types is finite and fairly small. Programs 
that construct reaction sequences to proceed from a small number of synthons to 
some target molecule usually explore exhaustively every possible combination of 
routes, and report on those that seem most promising in the probable order of 
their success. There are, however, currently two fundamentally different
approaches to the design of synthesis programs (Ash et al. 1985). One utilizes a
library of known reactions and proceeds from the target molecule in a
retrosynthetic direction; examples of such programs are LHASA (logic and
heuristics applied to synthetic analysis), SYNCHEM (synthetic chemistry), SECS
(simulation and evaluation of chemical synthesis) and CASP (computer-aided
synthesis program). The other approach is nonempirical in that it attempts to 
represent structural transformations in a generic way by manipulating an abstract 
model of the atoms and chemical bonds involved in the reaction sequence. 
Examples of this approach include EROS (elaboration of reactions for organic 
synthesis) and CAMEO (computer assisted mechanistic evaluation of organic 
reactions). Whereas the former approach is essentially a brute force method, the 
latter is more subtle in that it can search for novel synthesis routes in unexplored
areas of chemistry and so make completely new syntheses a reality. 

Of these two approaches, the latter is likely to be of greater benefit to the 
synthetic chemist in the long term. One major advantage is that it harnesses the 
methodology of artificial intelligence for use in the chemical domain. In par- 
ticular, it promises the possibility of expert systems that can elicit new chemical 
knowledge (Pierce and Hohne 1986). The first use of artificial intelligence in 
synthetic chemistry was made in the late 1960s. The DENDRAL (dendritic 
algorithm) program was published in 1969 (see Lindsay et al. 1980) for the 
purpose of generating exhaustive lists of chemically meaningful isomers of 
structures containing carbon, hydrogen, nitrogen and oxygen atoms. This program
used a linear notation for tree structures that started from the center of the
tree and avoided nested parentheses. Heuristic DENDRAL is able to construct 
all of the isomers compatible with experimental data. Since this early work, the 
applications of artificial intelligence have become increasingly sophisticated. For 
example, in recent years the expert system SYNLMA (synthesis with logic 
machine architecture) has used a theorem prover based on a collection of Pascal 
subroutines in which the target structure is regarded as the theorem to be proved 
and the starting materials are the axioms (Wang et al. 1986). The expert system 
SYNCHEM2 is based on a graph embedding technique that determines whether a 
guest graph can be embedded in a host graph with preservation of adjacency 
relationships, vertex and edge labeling, and stereochemical orientation. Graph 
embedding, which is an NP-complete problem, is thus of relevance in the 
chemical context (Benstock et al. 1988). 


5. Polynomials in bonding theory 


The (vertex) adjacency matrix of a graph, denoted here as A(G), has been widely 
used in the modeling of individual chemical molecules and a variety of other 
chemical systems. Moreover, a fair number of chemical combinatorial problems 
have arisen from studies on A(G) and the polynomials that may be derived from 
it. Examples of its uses include the characterization of molecules, the derivation 
of graph invariants to model chemical systems, and the elucidation of bonding 
theory. In fig. 5.1(c) we illustrate the adjacency matrix for the six atoms forming
the carbon skeleton of the benzene molecule ($C_6H_6$). Because A(G) characterizes
graphs up to isomorphism, the matrix has been employed to represent chemical 
species, including many isomers. In fact, the connection tables used by Chemical 
Abstracts Service are essentially adjacency matrices in that they list the atoms in a 
molecule with all the bonding linkages. Over 10 million connection tables 
representing all known molecules are stored in Chemical Abstracts Service 
registry files at present. Numerous attempts have been made to transform 
connection tables into a unique representation; each has sought to give a unique 
and invariant ordering to the structure. One of the simplest of these was 
developed by Morgan (see Ash et al. 1985); this partitions atoms into classes 
based on their connectivity. An iterative process is employed that successively 
calculates higher-order connectivities for each atom in the molecule. The process 
terminates when the number of classes ceases to increase. This is the process 
currently used by Chemical Abstracts Service for the unique labeling of structures 
in its databases. 
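
The iterative step just described can be sketched as follows. This Python fragment is a simplified illustration of the connectivity iteration only: it starts from the vertex degrees, repeatedly replaces each atom's value by the sum of its neighbours' values, and stops when the number of distinct values no longer increases. The subsequent numbering of atoms and the tie-breaking rules of the full procedure are not shown, and the function name is our own.

```python
def morgan_classes(adjacency):
    """Return the connectivity values from the last iteration that increased the class count."""
    values = {v: len(nbrs) for v, nbrs in adjacency.items()}   # first-order connectivity
    n_classes = len(set(values.values()))
    while True:
        new_values = {v: sum(values[u] for u in adjacency[v]) for v in adjacency}
        new_n = len(set(new_values.values()))
        if new_n <= n_classes:          # the number of classes has ceased to increase
            return values
        values, n_classes = new_values, new_n

# Carbon skeleton of n-pentane: the chain 1 - 2 - 3 - 4 - 5.
pentane = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(morgan_classes(pentane))
```

For the pentane chain the iteration separates the terminal, the next-to-terminal and the central carbon atoms into three classes, whereas the vertex degrees alone distinguish only two.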

As A(G) is not appropriate for the characterization of stereoisomers it has 
been necessary to modify the Morgan algorithm. Modifications to date are able to 
accommodate not only the stereochemistry but also additional properties of the 
atoms bonded in structures, such as their atomic number, and the various kinds of 
chemical bonding encountered (Ash et al. 1985). For instance, the SEMA
(stereochemically extended Morgan algorithm) is able to produce a stereochemically
unique name for a molecule via a connection table generated from a graphics
structure input. This name consists of a variable-length string with components
specifying the length of the string, the number of non-hydrogen atoms, bonds
including multiple bonds, and tetrahedral stereocenters (see Ash et al. 1985). In
the case of constitutional isomers, in which the stereochemistry plays no role,
A(G) has frequently been expressed in the form of a code. A unique
characterization can be achieved via a canonical numbering of the graph vertices such that the
string of binary digits in A(G) read from left to right yields the smallest binary
representation. Codes of this type have been employed to establish both the
isomorphism of chemical structures and the equivalence of atoms within a given
structure by means of computer algorithms (Randić et al. 1981). The derivation
of chemically relevant graph invariants from A(G) is addressed in the next
section.

Figure 5.1. Some graphical and matricial representations of the benzene molecule, $C_6H_6$. From the
left in (a) are depicted the usual chemical structural formula, the graph for benzene, and the graph for
the carbon skeleton ($C_6$) of benzene. (b) shows the Hückel matrix, (c) the adjacency matrix, and (d)
the distance matrix of the $C_6$ ring of benzene.
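
The idea of a code based on the smallest binary reading of A(G) can be illustrated by brute force. The Python sketch below simply tries every numbering of the vertices, reads the rows of the adjacency matrix from left to right and from top to bottom, and keeps the smallest resulting string. It is an assumption-laden illustration, workable only for very small graphs, and should not be taken for the published algorithm, which avoids the exhaustive search.

```python
from itertools import permutations

def canonical_code(n, edges):
    """Smallest row-by-row binary reading of the adjacency matrix over all vertex numberings."""
    edge_set = {frozenset(e) for e in edges}
    best = None
    for order in permutations(range(n)):
        bits = "".join(
            "1" if frozenset((order[i], order[j])) in edge_set else "0"
            for i in range(n)
            for j in range(n)
        )
        if best is None or bits < best:   # equal lengths, so string order is numeric order
            best = bits
    return best

# Two numberings of the same four-vertex chain, and a four-vertex star for comparison.
chain_a = [(0, 1), (1, 2), (2, 3)]
chain_b = [(2, 0), (0, 3), (3, 1)]
star = [(0, 1), (0, 2), (0, 3)]

print(canonical_code(4, chain_a) == canonical_code(4, chain_b))
print(canonical_code(4, chain_a) == canonical_code(4, star))
```

The two chains receive the same code whatever their original numbering, while the star receives a different one, which is exactly the property needed for testing the isomorphism of structures.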

[...]nical use of A(G) to date has been in the development of [...]. A(G) came into
prominence in the 1950s when it was realized that it is isomorphic to the
historically important Hückel matrix, H(G) (see Mallion 1982). H(G) is a real,
symmetric matrix originally derived as a result of the quantum-chemical analysis
of hydrocarbon species; its eigenvalues represent the energy levels in molecules
and its eigenvectors yield the electronic charge density within molecules (Mallion
1982). The Hückel matrix for the carbon skeleton of the benzene ring is shown in
fig. 5.1(b). Before the advent of high-speed computers, the Hückel approach
afforded a convenient means of obtaining an approximate solution to the
Schrödinger wave equation for the species in question. Although this approach has
long since been superseded by more sophisticated methods, it can still provide
valuable insights into the behavior of chemical systems and remains a useful
pedagogical tool. A number of the results obtained using Hückel theory retain
their validity even for the higher approximation methods. For example, in
so-called alternant hydrocarbons (which possess bipartite graphs) A(G) assumes
the general form,


$$A(G) = \begin{pmatrix} 0 & B \\ B^{T} & 0 \end{pmatrix}, \tag{5.1}$$

where $B^{T}$ is the transpose of the submatrix B. As a consequence of this, the
eigenvalues $\lambda_i$ of such graphs will exist as pairs, i.e.,

$$\lambda_i = -\lambda_{n+1-i} \quad (1 \le i \le n). \tag{5.2}$$

This result, which is of considerable interest to chemists, was first obtained in the
chemical context by Coulson and Rushbrooke (1940) and is known to chemists as
the pairing theorem (see Mallion and Rouvray 1990). The theorem had, however,
been adumbrated in the work of Perron and Frobenius dating from 1907-1912.
They had shown that, if A(G) has h eigenvalues $\lambda_0, \lambda_1, \ldots, \lambda_{h-1}$ of
maximum modulus, then the spectrum, viewed as a system of points in the Argand
$\lambda$-plane, maps onto itself under rotation by $2\pi/h$. However, Perron and
Frobenius had not made the connecting observation that A(G) for a bipartite graph
is an irreducible matrix with h = 2, whereas A(G) for a nonbipartite graph is such
a matrix with h = 1. Had this sequitur been made at the time, the mathematical
analogue of the Coulson-Rushbrooke theorem could have been proved by 1912. For a
discussion of the history of this theorem, see the review of Mallion and Rouvray
(1990).
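
The pairing of eigenvalues can be checked numerically on cycle graphs. The Python sketch below relies on the standard closed form $2\cos(2\pi j/n)$ for the adjacency eigenvalues of the cycle $C_n$ rather than on a general eigenvalue solver; the helper names are hypothetical. The six-membered ring of benzene is bipartite, whereas a five-membered ring is not.

```python
import math


def cycle_spectrum(n):
    """Adjacency eigenvalues of the cycle C_n, in non-increasing order."""
    values = [2.0 * math.cos(2.0 * math.pi * j / n) for j in range(n)]
    return sorted(values, reverse=True)


def is_paired(spectrum, tol=1e-9):
    """True if lambda_i = -lambda_(n+1-i) holds for every i (1-based indices)."""
    n = len(spectrum)
    return all(abs(spectrum[i] + spectrum[n - 1 - i]) < tol for i in range(n))


benzene_ring = cycle_spectrum(6)   # bipartite: eigenvalues occur in +/- pairs
five_ring = cycle_spectrum(5)      # odd cycle, not bipartite: pairing fails

paired_6 = is_paired(benzene_ring)
paired_5 = is_paired(five_ring)
```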

A polynomial derived from A(G), known as the characteristic or spectral
polynomial, has been studied by chemists almost as much as A(G) itself. The
characteristic polynomial, P(G), is defined by the relation

$$P(G) = \det(xI - A(G)) = \sum_{i=0}^{n} a_i x^{n-i}, \tag{5.3}$$

where I is the n × n identity matrix. The roots of P(G) are the eigenvalues of
A(G) and these may be ordered into a sequence known as the spectrum of G.
Much effort has been expended on determining both the eigenvalues of A(G) and
the coefficients $a_i$ of P(G) since the inception of Hückel theory (see Rouvray
1976). Many expressions in closed form have been developed in the chemical
context for both the eigenvalues and the coefficients (Graovac et al. 1977,
Trinajstić 1983). Expressions for the $a_i$ of graphs of polycyclic aromatic
hydrocarbons (represented as polyhexes or hexagonal animal graphs) in terms of graph
invariants of these systems have been given by Dias (1985). Among procedures
for determining the $a_i$, mention should be made of Coates' formula, methods
based on Ulam subgraphs, recurrence formulas, and the Le Verrier–Faddeev–Frame
method; all these have been surveyed by Trinajstić (1988). Coates'
formula as used by Sachs, for instance, gives the $a_i(G)$ as

$$a_i(G) = \sum_{s \in S_i} (-1)^{u(s)} \, 2^{c(s)}, \tag{5.4}$$


where s belongs to the set of so-called Sachs graphs $S_i$ on i nodes, and u(s) and
c(s) denote respectively the number of components and the number of circuits in
s. The components of a Sachs graph are either $K_2$ graphs or $C_p$ cycles
(p = 3, 4, ..., n) or combinations thereof. One recent example of an efficient
computer algorithm developed by a chemist for evaluating P(G) (of the order $n^3$,
where n is the number of graph vertices) based on the Le Verrier–Faddeev–Frame
method is that of Živković (1990). The numerous applications of P(G) in
chemistry, especially in the context of bonding theory, have been discussed by
Trinajstić (1988).
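
To make the Le Verrier–Faddeev–Frame recurrence concrete, here is a minimal Python sketch using exact integer arithmetic. It is the textbook form of the recurrence with dense matrix products, so it costs on the order of $n^4$ operations; it is not the optimized algorithm of Živković (1990), and the function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(adj):
    """Coefficients [a_0, ..., a_n] of det(xI - A) = sum_i a_i x^(n-i)."""
    n = len(adj)
    coeffs = [1]                         # a_0 = 1
    m = [[0] * n for _ in range(n)]      # M_0 = 0
    for k in range(1, n + 1):
        # M_k = A M_(k-1) + a_(k-1) I
        am = matmul(adj, m)
        m = [[am[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # a_k = -trace(A M_k) / k; the division is exact for integer matrices
        trace = sum(adj[i][j] * m[j][i] for i in range(n) for j in range(n))
        coeffs.append(-trace // k)
    return coeffs


def cycle_adjacency(n):
    """Adjacency matrix of the cycle C_n (n >= 3)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] = adj[(i + 1) % n][i] = 1
    return adj


benzene_poly = char_poly(cycle_adjacency(6))
```

For the carbon skeleton of benzene the coefficients correspond to $x^6 - 6x^4 + 9x^2 - 4$; the vanishing of every odd-index coefficient is another expression of the pairing theorem for bipartite graphs.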

If the $C_p$ cycles are omitted from (5.4), a new polynomial is obtained known to
mathematicians as the matching polynomial and to chemists as the acyclic
polynomial. The $a_i^{ac}(G)$ coefficients for this polynomial, designated below as
M(G), thus assume the form

$$a_i^{ac}(G) = \sum_{s \in S_i^{ac}} (-1)^{u(s)}, \tag{5.5}$$


where $S_i^{ac}$ is the set of acyclic Sachs graphs on i nodes. This polynomial has been
independently introduced into the literature several times because of its manifold
applications in physics and chemistry (Cvetković et al. 1988, Godsil, chapter 31 in
this Handbook). One of the major uses has been as a reference polynomial to
enable comparisons to be made between the bonding energy in otherwise
equivalent molecules which exist in the form of both a chain and a closed ring
structure. The difference is known to chemists as the topological resonance
energy and affords valuable insights into the stability of molecules (Trinajstić
1983). The polynomial can also be interpreted as a generating function for the
number of matchings in graphs. If we denote the number of k-matchings in G as
N(G,k), the matching polynomial may be expressed as

$$M(G) = \sum_{k=0}^{[n/2]} (-1)^{k} N(G,k) \, x^{n-2k}. \tag{5.6}$$

For all graphs G, N(G,0) is defined to be one, N(G,1) is equal to the number of
edges m and N(G,k) = 0 for all k > m. For n even, the value of N(G, n/2) is
described in chemical terminology as being equal to the number of Kekulé
structures K of the molecular graph and the matching is said to be perfect. Kekulé
structures have played a major role in the development of chemical bonding
theory, especially insofar as this relates to the stability of molecules whose graphs
assume the form of polyhexes (Trinajstić 1983). One interesting observation is
that the number of Kekulé structures is closely related to the magnitudes of the
terminal coefficients of P(G) and M(G), i.e.,

$$|a_n(G)| = K^2 \tag{5.7}$$

and

$$|a_n^{ac}(G)| = K. \tag{5.8}$$

Further analysis of these polynomials in the chemical context, the development of
a more general polynomial incorporating both P(G) and M(G) as special cases,
and discussion of a number of related polynomials are to be found in the
monographs of Gutman and Polansky (1986) and Cvetković et al. (1988).
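
The matching numbers N(G,k), and with them the coefficients of M(G) and the Kekulé count K, can be obtained for small molecular graphs by direct enumeration. The following Python sketch assumes an edge-list input and an exhaustive search; the function names are hypothetical and no attempt is made at efficiency.

```python
def matching_counts(n, edges):
    """Return [N(G,0), N(G,1), ..., N(G, n//2)] for a graph on n vertices."""
    counts = [0] * (n // 2 + 1)

    def extend(start, used, size):
        # Each matching is generated once, with its edges in increasing index order.
        counts[size] += 1
        for idx in range(start, len(edges)):
            u, v = edges[idx]
            if u not in used and v not in used:
                extend(idx + 1, used | {u, v}, size + 1)

    extend(0, frozenset(), 0)
    return counts


def matching_polynomial(n, edges):
    """Coefficients of M(G) in descending powers of x, following (5.6)."""
    coeffs = [0] * (n + 1)
    for k, count in enumerate(matching_counts(n, edges)):
        coeffs[2 * k] = (-1) ** k * count   # coefficient of x^(n-2k)
    return coeffs


benzene_edges = [(i, (i + 1) % 6) for i in range(6)]
benzene_counts = matching_counts(6, benzene_edges)
benzene_matching_poly = matching_polynomial(6, benzene_edges)
kekule_count = benzene_counts[-1]           # N(G, n/2), the perfect matchings
```

For the benzene ring this gives K = 2, in agreement with the two Kekulé structures, and a constant term of magnitude K in M(G).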


6. Invariants and molecular properties 


In addition to polynomials, chemists have made extensive use of scalar numerical 
graph invariants for the characterization of chemical species. The first such 
invariant was introduced some 150 years ago when the number of carbon atoms in 
hydrocarbon molecules was adopted for the characterization of these species 
(Rouvray 1990a). This particular invariant is known today as the carbon number 
index because it equals the number of carbon atoms in the hydrogen-deleted 
molecular graph of a hydrocarbon species. (Deletion of the hydrogen atoms is a 
common practice in this field as the hydrogen atoms are usually not structure- 
determining.) Scalar numerical invariants are nowadays commonly referred to by 
chemists as topological indices. Another example of such an invariant is the 
cyclomatic number (see section 2), which has been in use for the characterization 
of molecules for well over a century (see Rouvray 1975). The chemical fascination 
with such invariants stems from the fact that they can be interpreted as a property 
of a molecule on a par with measured properties such as the reactivity. In fact, 
once a correlation with an invariant has been established for a test subset of 
related molecules, prediction of the property in question can normally be made 
for all the remaining molecules in the set (see Rouvray 1986). 

In many instances, the use of numerical invariants for property prediction rests 
on an implicit application of the additivity principle, which holds whenever a 
molecular property may be obtained as the sum of the contributions to that 
property from each of the constituent parts of the molecule (see Rouvray 1990a).
More formally, a molecular property $\theta$ of some system S is said to be additive
whenever that system can be broken down into noninteracting subsystems $\omega_1$ and
$\omega_2$ according to the equation

$$\theta(S) \to \theta(\omega_1 \cup \omega_2) \to \theta(\omega_1) + \theta(\omega_2). \tag{6.1}$$

Such properties are often encountered when sets of related molecules are
considered, e.g., the members of homologous series. Other types of systematic
interaction have also been explored and formulated in mathematical language by
Klein (1986). For the additive case, Gordon and Kennedy (1973) showed that any
molecular property can be expressed as a linear combination of graph-theoretical
invariants thus:

$$\theta(S) = \sum_{i} k_i \xi_i, \tag{6.2}$$

where the $\xi_i$ are scalar numerical invariants derived from chemical graphs and $k_i$
are empirically determined coefficients. Properties satisfying this type of
relationship are said to belong to the “graph-like state of matter” (Gordon and Kennedy
1973). Examples of properties that approximate to this state are the energy,
entropy, melting and boiling points, and the refractive index.

To date, over 150 different $\xi_i$ graph invariants have been put forward in the
chemical literature. Just under one half of these are purely graph-theoretical and
the remainder are information-theoretical in nature (see Bonchev 1983). There
are numerous reviews now available on these invariants; see, for example,
Bonchev (1983), Rouvray (1985, 1986, 1987, 1989), Kier and Hall (1986), and
Stankevich et al. (1988). A fair number of these invariants derive from either the
adjacency matrix A(G) or the distance matrix D(G) of the molecular graph
(Rouvray 1985). The distance matrix for the carbon skeleton of the benzene
molecule is illustrated in fig. 5.1(d). The first of the invariants designed to
characterize branched molecules was introduced in 1947 and is referred to today
as the Wiener index (see Rouvray 1985). This index is defined by the equation

$$W(G) = \frac{1}{2} \sum_{i} \sum_{j} d_{ij}(G), \tag{6.3}$$

i.e., the index is half the sum of the elements of D(G). Expressions for W(G) in
closed form are known for many different classes of graph. For instance, the form
for a path graph is $\frac{1}{6}(n^3 - n)$ and that for a star graph is $(n - 1)^2$. A closed
expression for the general tree has been developed by Canfield et al. (1985). The
Wiener index has found application in numerous areas of chemistry and physics,
and is now regarded as one of the most successful invariants for the prediction of
physicochemical properties (Rouvray 1985).
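
As a small worked check of (6.3) and of the closed forms just quoted, the Python sketch below computes W(G) from breadth-first-search distances. The adjacency-list representation and the helper names are assumptions made for illustration.

```python
from collections import deque


def distances_from(adj_list, source):
    """Breadth-first-search distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj_list[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj_list):
    """Half the sum of all entries of the distance matrix D(G)."""
    total = sum(sum(distances_from(adj_list, s).values()) for s in adj_list)
    return total // 2


def path_graph(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def star_graph(n):
    """Star on n vertices: vertex 0 is the centre, the others are leaves."""
    graph = {0: list(range(1, n))}
    graph.update({i: [0] for i in range(1, n)})
    return graph


def cycle_graph(n):
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


benzene_ring_wiener = wiener_index(cycle_graph(6))
```

For the six-membered carbon ring of benzene the index is 27: each vertex has distance sum 1 + 1 + 2 + 2 + 3 = 9, and 6 × 9 / 2 = 27.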

One of the most interesting invariants from the mathematical point of view is
that of Hosoya published in 1971 (Hosoya 1985). This invariant he described as
the topological index, a term that has since been extended to embrace all scalar
numerical graph invariants. The Hosoya index is defined as follows:

$$H(G) = \sum_{k=0}^{[n/2]} N(G,k), \tag{6.4}$$

where N(G,k) is the number of k-matchings in G referred to in section 5. H(G)
was adapted from earlier work in statistical physics in which matchings and
polynomials were (and still are) employed as a means of modeling the coverings
of crystal lattices by diatomic molecules or so-called dimers. A history of this
earlier work together with a discussion on the physical applications of the
matching polynomial M(G) are to be found in the work of Heilmann and Lieb
(1972). It is thus only to be expected that H(G) will be related to several other
graph invariants, including a number of polynomials. For instance, the values
assumed by H(G) for path graphs form members of the Fibonacci series, and the
values for monocycles form members of the Lucas series (see Rouvray 1987).
There are now extensive tabulations available for both H(G) and N(G,k) values
(Hosoya 1985). H(G) is closely associated with the characteristic polynomial
P(G) of a graph G. In the case of a tree T, the relationship takes the form

$$P_T(x) = \sum_{k=0}^{[n/2]} (-1)^{k} N(T,k) \, x^{n-2k}. \tag{6.5}$$

For cycles, additional terms need to be added on the right-hand side (Hosoya
1985). Moreover, the counting polynomials

$$Q_G(x) = \sum_{k=0}^{[n/2]} N(G,k) \, x^{k} \tag{6.6}$$

for various classes of graphs G are known to transform into a family of orthogonal
polynomials, including the first and second types of Chebyshev, Hermite,
Laguerre and associated Laguerre polynomials (Hosoya 1985); see Heilmann and
Lieb (1972) for Chebyshev and Hermite polynomials. The chemical applications
of H(G) have been summarized by Rouvray (1987).
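
The Fibonacci and Lucas behaviour of H(G) is easy to confirm for small graphs. The Python sketch below counts all matchings by exhaustive recursion over an edge list; the function names are hypothetical, and the approach is intended only for small examples.

```python
def hosoya_index(edges):
    """Total number of matchings of all sizes, the empty matching included."""
    def count(start, used):
        total = 1                              # the matching built so far
        for idx in range(start, len(edges)):
            u, v = edges[idx]
            if u not in used and v not in used:
                total += count(idx + 1, used | {u, v})
        return total

    return count(0, frozenset())


def path_edges(n):
    """Edges of the path graph on n vertices."""
    return [(i, i + 1) for i in range(n - 1)]


def cycle_edges(n):
    """Edges of the monocycle on n vertices (n >= 3)."""
    return [(i, (i + 1) % n) for i in range(n)]


path_values = [hosoya_index(path_edges(n)) for n in range(1, 9)]
cycle_values = [hosoya_index(cycle_edges(n)) for n in range(3, 9)]
```

The path values run through the Fibonacci numbers 1, 2, 3, 5, 8, ... and the cycle values through the Lucas numbers 4, 7, 11, 18, ...; the value 18 for the six-membered ring is the sum 1 + 6 + 9 + 2 of its matching numbers.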

The most popular of the invariants in terms of the number of applications to
date is the so-called molecular connectivity index of Randić first proposed in 1975
(see Kier and Hall 1986). Two entire monographs have been devoted to
exposition of this index and its various successors (Kier and Hall 1986). In its
original form, the index was defined as follows:

$$\chi(G) = \sum_{\text{edges}} (\delta_i \delta_j)^{-1/2}, \tag{6.7}$$

where the $\delta_i$ and $\delta_j$ represent the degrees of an adjacent pair of vertices i and j in
G. The index was subsequently generalized into a series of indices in which the
summation is made over a variety of subgraphs of G other than the edges. In its
most general form, $\chi(G)$ for tree graphs may be expressed as

$${}^{h}\chi_{r}(G) = \sum_{k=1}^{\sigma_r} \prod_{i=1}^{h+1} (\delta_i)_k^{-1/2}, \tag{6.8}$$

where h is the number of edges in the subgraph used in the summation, r is the
type of subgraph used, and $\sigma_r$ is the number of subgraphs of type r with h edges.
So far, very little mathematical analysis of this index has been made, though it is
evident that ${}^{h}\chi_{r}(G)$ can be derived from the hth power of A(G). The role of the
index in characterizing branching has also been discussed (Kier and Hall 1986,
Rouvray 1988). The index has been employed in successful correlations with a
host of molecular properties ranging from the physical, through the chemical, to
the biological (Kier and Hall 1986).
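
The original edge form (6.7) of the connectivity index is simple to evaluate. The Python sketch below assumes a hydrogen-deleted molecular graph supplied as an edge list; the function name and the example molecules are chosen here for illustration only.

```python
import math


def connectivity_index(edges):
    """Sum of (delta_i * delta_j)^(-1/2) over the edges, as in (6.7)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1.0 / math.sqrt(degree[u] * degree[v]) for u, v in edges)


# Hydrogen-deleted carbon skeletons.
n_butane = [(0, 1), (1, 2), (2, 3)]              # path on four carbons
isobutane = [(0, 1), (0, 2), (0, 3)]             # star on four carbons
benzene_ring = [(i, (i + 1) % 6) for i in range(6)]

chi_butane = connectivity_index(n_butane)        # 2/sqrt(2) + 1/2
chi_isobutane = connectivity_index(isobutane)    # 3/sqrt(3)
chi_benzene = connectivity_index(benzene_ring)   # 6 * 1/2
```

The branched isomer has the smaller value, which is the behaviour that makes the index useful as a measure of branching.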


7. Some outstanding problems 


In this section brief reference is made to some of the many important mathematical
problems that still need to be solved in chemical combinatorics. Many of these
problems are of interest to chemists and combinatorialists alike. Examples of such
problems include the general solution of the hexagonal animal enumeration
problem (which is equivalent to enumerating polycyclic aromatic hydrocarbons),
and the characterization of the eigenvalue spectra, especially the occurrence of
degeneracies, of the many polynomials used in bonding theory and other chemical
applications. Such polynomials include the characteristic polynomial, the matching
polynomial, the chromatic polynomial, the distance polynomial, random walk
counting polynomials, the sextet polynomial, the permanental polynomial, and
the polynomial associated with the Ising model partition function (Trinajstić et al.
1986). These and other outstanding problems have been described and listed by
Rouvray (1985) and Trinajstić et al. (1986). The challenges of characterizing
branching in molecular species (Rouvray 1988), the computer perception of
molecular symmetry (Randić et al. 1981), and the description of molecular
similarity based on metric spaces (Herndon and Bertz 1987, Johnson and
Maggiora 1990) are examples of other problems that need to be addressed. It
would seem with all these problems (and the many others not mentioned here
because of space limitations) that both chemists and combinatorialists will have
more than enough to keep them occupied for at least another century. This brings
to mind the words of the mathematician Sylvester first stated in 1878: “There is a
wealth of untapped mathematical potential contained in the patient and long
investigations of our chemist fellows”.


References 


Ash, J.E., P.A. Chubb, S.E. Ward, S.M. Welford and P. Willett
[1985] Communication, Storage and Retrieval of Chemical Information (Ellis Horwood, Chichester).
Balaban, A.T.
[1976] ed., Chemical Applications of Graph Theory (Academic Press, London).
[1985] Applications of graph theory in chemistry, J. Chem. Inform. Comput. Sci. 25, 334-343.
[1987] Annulenes, Benzo-, Hetero-, Homo-Derivatives and their Valence Isomers (CRC Press, Boca
Raton, FL).
Balasubramanian, K.
[1985] Applications of combinatorics and graph theory to spectroscopy and quantum chemistry, Chem.
Rev. 85, 599-618.
Benstock, J.D., D.J. Berndt and K.K. Agarwal
[1988] Graph embedding in SYNCHEM2, an expert system for organic synthesis discovery, Discrete Appl.
Math. 19, 45-63.
Bonchev, D.
[1983] Information Theoretic Indices for Characterization of Chemical Structures (Research Studies Press,
Letchworth, England).
Bonchev, D., and D.H. Rouvray, eds.
[1991] Chemical Graph Theory: Introduction and Fundamentals (Abacus/Gordon and Breach, London).
[1992] Chemical Graph Theory: Reactivity and Kinetics (Abacus/Gordon and Breach, London).
Bonchev, D., D. Kamenski and O.N. Temkin
[1987] Complexity index for the linear mechanisms of chemical reactions, J. Math. Chem. 1, 345-388.
Brocas, J.
[1986] Double cosets and enumeration of permutational isomers of fixed symmetry, J. Amer. Chem. Soc. 108,
1135-1145.
Brocas, J., M. Gielen and R. Willem
[1984] The Permutational Approach to Dynamic Stereochemistry (McGraw-Hill, New York).
Canfield, E.R., R.W. Robinson and D.H. Rouvray
[1985] Determination of the Wiener molecular branching index for the general tree, J. Comput. Chem. 6,
598-609.
Cayley, A.
[1875] On the analytical forms called trees, with application to the theory of chemical combinations, Rep.
Brit. Assoc. Adv. Sci. 45, 257-305.
Corey, E.J.
[1967] General methods for the construction of complex molecules, Pure Appl. Chem. 14, 19-37.
Coulson, C.A., and G.S. Rushbrooke
[1940] Note on the method of molecular orbitals, Proc. Cambridge Philos. Soc. 36, 193-200.
Crum Brown, A.
[1865] On the use of graphic representations of chemical formula, Proc. R. Soc. Edinburgh 5, 429-430.
Cvetković, D.M., M. Doob, I. Gutman and A. Torgašev
[1988] Recent Results in the Theory of Graph Spectra (North-Holland, Amsterdam).
Davidson, R.A.
[1981] Isomers and isomerizations: elements of Redfield’s combinatorial theory, J. Amer. Chem. Soc. 103,
312-314.
Dias, J.R.
[1985] Properties and derivation of the fourth and sixth coefficients of the characteristic polynomial of
molecular graphs, Theor. Chim. Acta 68, 107-123.
[1987] Handbook of Polycyclic Hydrocarbons, Part A (Elsevier, Amsterdam).
Golender, V.E., and A.B. Rozenblit
[1983] Logical and Combinatorial Algorithms for Drug Design (Research Studies Press, Letchworth,
England).
Gordon, M., and J.W. Kennedy
[1973] The graph-like state of matter. II. LCGI schemes for the thermodynamics of alkanes and the theory
of inductive inference, J. Chem. Soc. Faraday Trans. II 69, 484-504.
Graovac, A., I. Gutman and N. Trinajstić
[1977] Topological Approach to the Chemistry of Conjugated Molecules, Lecture Notes in Chemistry, Vol.
4 (Springer, Berlin).
Gutman, I., and O.E. Polansky
[1986] Mathematical Concepts in Organic Chemistry (Springer, Berlin).
Hässelbarth, W.
[1987] The inverse problem of isomer enumeration, J. Comput. Chem. 8, 700-717.
Heilmann, O.J., and E.H. Lieb
[1972] Theory of monomer-dimer systems, Comm. Math. Phys. 25, 190-232.
Herndon, W.C., and S.H. Bertz
[1987] Linear notations and molecular graph similarity, J. Comput. Chem. 8, 367-374.
Hosoya, H.
[1985] Topological index as a common tool for quantum chemistry, statistical mechanics, and graph theory,
in: Mathematics and Computational Concepts in Chemistry, ed. N. Trinajstić (Ellis Horwood,
Chichester), pp. 110-123.
Johnson, M.A., and G.M. Maggiora
[1990] eds., Concepts and Applications of Molecular Similarity (Wiley-Interscience, New York).
Kennedy, J.W., and L.V. Quintas
[1988] eds., Applications of Graphs in Chemistry and Physics (North-Holland, Amsterdam).
Kier, L.B., and L.H. Hall
[1986] Molecular Connectivity in Structure-Activity Analysis (Research Studies Press, Letchworth,
England).
King, R.B.
[1969] Chemical applications of topology and group theory. I. Coordination polyhedra, J. Amer. Chem.
Soc. 91, 7211-7216.
[1983] ed., Chemical Applications of Topology and Graph Theory (Elsevier, Amsterdam).
[1988] Topological aspects of polyhedral isomerizations, in: Advances in Dynamic Stereochemistry, ed.
M. Gielen (Freund Publishing House, Tel Aviv, Israel) pp. 1-36.
King, R.B., and D.H. Rouvray
[1987] eds., Graph Theory and Topology in Chemistry (Elsevier, Amsterdam).
Klein, D.J.
[1986] Chemical graph-theoretic cluster expansions, Int. J. Quant. Chem., Quant. Chem. Symp. 20, 153-171.
Klein, D.J., and M. Randić
[1990] eds., Methods of Mathematical Chemistry (Baltzer, Basel).
Klemperer, W.G.
[1972] Enumeration of permutational isomerization reactions, J. Chem. Phys. 56, 5478-5489.
Knop, J.V., W.R. Müller, K. Szymanski and N. Trinajstić
[1985] Computer Generation of Certain Classes of Molecules (Assoc. Chem. Technol. of Croatia, Zagreb,
Yugoslavia).
[1990] Use of small computers for large calculations: Enumeration of polyhex hydrocarbons, J. Chem.
Inform. Comput. Sci. 30, 159-160.
Koča, J.
[1989] A mathematical model of realistic constitutional chemistry. A synthon approach. I. An algebraic
model of synthon, J. Math. Chem. 3, 73-89.
Lindsay, R.K., B.G. Buchanan, E.A. Feigenbaum and J. Lederberg
[1980] Applications of Artificial Intelligence for Organic Chemistry (McGraw-Hill, New York).
Lloyd, E.K.
[1988] Redfield’s papers and their relevance to counting isomers and isomerizations, Discrete Appl.
Math. 19, 289-304.
Mallion, R.B.
[1982] Some chemical applications of the eigenvalues and eigenvectors of certain finite, planar graphs, in:
Applications of Combinatorics, ed. R.J. Wilson (Shiva Publishing, Nantwich, Cheshire, England) pp.
87-114.
Mallion, R.B., and D.H. Rouvray
[1990] The golden jubilee of the Coulson-Rushbrooke Pairing Theorem, J. Math. Chem. 5, 1-21.
Maruani, J., and J. Serre
[1983] eds., Symmetries and Properties of Non-Rigid Molecules (Elsevier, Amsterdam).
Masavetas, K.A.
[1988] Mathematical properties common in all mechanism models of chemical reactions, Math. Comput.
Model. 10, 263-274.
Merrifield, R.E., and H.E. Simmons
[1989] Topological Methods in Chemistry (Wiley-Interscience, New York).
Milner, P.C.
[1964] The possible mechanisms of complex reactions involving consecutive steps, J. Electrochem. Soc.
111, 228-232.
Pierce, T.H., and B.A. Hohne
[1986] eds., Artificial Intelligence Applications in Chemistry (American Chemical Society,
Washington, DC).
Pólya, G.
[1937] Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen, Acta
Math. 68, 145-254.
Pólya, G., and R.C. Read
[1987] Combinatorial Enumeration of Groups, Graphs and Chemical Compounds (Springer, New York).
Pospichal, J., and V. Kvasnička
[1990] Graph theory of synthons, Int. J. Quant. Chem. 38, 253-278.
Randić, M., G.M. Brissey and C.L. Wilkins
[1981] Computer perception of topological symmetry via canonical numbering of atoms, J. Chem. Inform.
Comput. Sci. 21, 52-59.
Redfield, J.H.
[1927] The theory of group-reduced distributions, Amer. J. Math. 49, 433-455.
Rouvray, D.H.
[1974] Isomer enumeration methods, Chem. Soc. Rev. (London) 3, 355-372.
[1975] Some reflections on the topological structure of covalent molecules, J. Chem. Educ. 52, 768-773.
[1976] The topological matrix in quantum chemistry, in: Chemical Applications of Graph Theory, ed.
A.T. Balaban (Academic Press, London) pp. 175-221.
[1985] The role of the topological distance matrix in chemistry, in: Mathematics and Computational
Concepts in Chemistry, ed. N. Trinajstić (Ellis Horwood, Chichester), pp. 295-306.
[1986] Predicting chemistry from topology, Sci. Amer. 254(9), 40-47.
[1987] The modeling of chemical phenomena using topological indices, J. Comput. Chem. 8, 470-480.
[1988] The challenge of characterizing branching in molecular species, Discrete Appl. Math. 19, 317-338.
[1989] The limits of applicability of topological indices, J. Mol. Struct. Theochem 185, 187-201.
[1990a] The origins of chemical graph theory, in: Mathematical Chemistry, Vol. 1, eds. D. Bonchev and
D.H. Rouvray (Gordon and Breach, London) pp. 1-30.
[1990b] ed., Computational Chemical Graph Theory (Nova Science, New York).
Rouvray, D.H., and A.T. Balaban
[1979] Chemical applications of graph theory, in: Applications of Graph Theory, eds. R.J. Wilson and
L.W. Beineke (Academic Press, London) pp. 177-221.
Ruch, E., and D.J. Klein
[1983] Double cosets in chemistry and physics, Theor. Chim. Acta 63, 447-472.
Ruch, E., W. Hässelbarth and B. Richter
[1970] Doppelnebenklassen als Klassenbegriff und Nomenklaturprinzip für Isomere und ihre Abzählung,
Theor. Chim. Acta 19, 288-300.
Sellers, P.H.
[1967] Algebraic complexes which characterize chemical networks, SIAM J. Appl. Math. 15, 13-68.
[1984] Combinatorial classification of chemical mechanisms, SIAM J. Appl. Math. 44, 784-792.
[1989] Combinatorial aspects of enzyme kinetics, in: Applications of Combinatorics and Graph Theory to
the Biological and Social Sciences, ed. F. Roberts (Springer, New York) pp. 295-314.
Simon, J.
[1987] A topological approach to the stereochemistry of nonrigid molecules, in: Graph Theory and Topology
in Chemistry, eds. R.B. King and D.H. Rouvray (Elsevier, Amsterdam) pp. 43-75.
Slanina, Z.
[1986] Contemporary Theory of Chemical Isomerism (Reidel, Dordrecht).
Stankevich, M.I., I.V. Stankevich and N.S. Zefirov
[1988] Topological indices in organic chemistry, Russ. Chem. Rev. 57, 191-208.
Sylvester, J.J.
[1878] On an application of the new atomic theory to the graphical representation of the invariants and
covariants of binary quantics, Amer. J. Math. 1, 64-125.
Trinajstić, N.
[1983] Chemical Graph Theory, two volumes (CRC Press, Boca Raton, FL); Second edition: 1992, one
volume.
[1988] The characteristic polynomial of a chemical graph, J. Math. Chem. 2, 197-215.
Trinajstić, N., D.J. Klein and M. Randić
[1986] On some solved and unsolved problems of chemical graph theory, Int. J. Quant. Chem., Quant.
Chem. Symp. 20, 699-742.
Wang, T., I. Burnstein, M. Corbett, S. Ehrlich, M. Evens, A. Gough and P. Johnson
[1986] Using a theorem prover in the design of organic synthesis, in: Artificial Intelligence Applications in
Chemistry, eds. T.H. Pierce and B.A. Hohne (American Chemical Society, Washington, DC).
Živković, T.P.
[1990] On the evaluation of the characteristic polynomial of a chemical graph, J. Comput. Chem. 11,
217-222.


CHAPTER 39 


Applications of Combinatorics to Molecular Biology


Michael S. WATERMAN 


Departments of Mathematics and Molecular Biology, University of Southern California, Los Angeles, 
CA 90089-1113, USA 


Contents 
1. Introduction
2. Sequence alignments
   2.1. Shuffles and alignment
   2.2. Sequence alignment
3. Secondary structure
   3.1. Prediction of secondary structure
   3.2. Counting secondary structures
4. Maps of DNA
   4.1. Maps as interval graphs
   4.2. Constructing maps
      4.2.1. Simulated annealing
      4.2.2. Multiplicity of solutions
      4.2.3. Computational complexity
References


This work was supported by grants from the National Institutes of Health, the National 
Science Foundation and the System Development Foundation. 


HANDBOOK OF COMBINATORICS 


Edited by R. Graham, M. Grötschel and L. Lovász
© 1995 Elsevier Science B.V. All rights reserved 




1. Introduction 

The biological sciences have undergone a revolution in
the last dozen years. Almost every edition of a major newspaper reports some new discovery in
biology, often with medical and/or financial implications. Biologists now have the ability to
rapidly read and manipulate DNA, the basic material of life that makes up chromosomes and is
the carrier of genetic information. The reading of DNA is called sequencing, since the scientists
are determining the linear sequence of bases along the DNA molecule. The bases or alphabet of
DNA is adenine (A), guanine (G), cytosine (C), and thymine (T). These bases, joined to a
sugar-phosphate backbone, are linked together in a chain to form DNA. Frederick Sanger and Walter
Gilbert independently developed procedures for the rapid sequencing of long segments of DNA
molecules. They received the Nobel Prize in 1980 for their discoveries. Sanger, incidentally, was
earlier the first to determine the amino acid sequence of a protein, insulin.

Rapid DNA sequencing has caused an information explosion. It was only in 1953 that a 
complementary double-helical structure was postulated for DNA. By 1975 only a few hundred 
bases had been sequenced. In Spring 1994 DNA sequences are collected in international 
databases and sequences totaling about 200 million bases are known. These sequences come 
from various locations in the genomes of a wide variety of organisms. (A genome holds all the 
genetic information of an organism.) The sequences vary greatly in size. A long continuous 
sequence that has been determined to date is that of human cytomegalovirus which is 229354 
bases long. 

Early in this century, Fisher, Haldane and Wright did fundamental work in proving that the 
Mendelian model of genetics, with discrete alleles, is rich enough to generate the seemingly 
continuous range of phenotypes observed in nature. This might seem almost trivial in light of 
today’s emphasis on discrete mathematics, but it was by no means obvious at that time. Their 
mathematical work in population biology led experimental biology. Today mathematical scientists 
lag far behind the experimental biologists as they read the basic material of the gene and directly 
test hypotheses about the nature of life. There has developed a small field of mathematical and 
computer sciences to assist the molecular biologist in his endeavor. Most of this mathematical
development is about discrete structures. See Waterman (1989). 

Increasing attention is being given to the mathematical and computational aspects of molecular 
biology because of the human genome project. This project can be viewed as directed toward 
sequencing all the DNA of humans and other organisms. While 200 million bases of DNA have
been sequenced in pieces that average about 1000 bases long, the genome of even the 
bacterium E. coli is about 5 million bases. Man has a genome of 3 billion bases. Presently, the 
efforts center on improving the mapping and sequencing technology so that such sequencing 
projects can be more easily accomplished. Even so, using today’s technology, genomes of the size 
of those of E. coli will be sequenced within the next few years. A number of analytical problems 
are concerning people who are involved in these studies. 

First of all, the puzzles of assembling map and sequence information from the experimental 
results are large, combinatorial problems. In addition, the vast quantity 


1986 M.S. Waterman 


of data will severely tax our current methods for finding the relationships between 
the sequences that are determined. 

In this chapter some combinatorial aspects of molecular biology will be explored. 
Section 2 discusses sequence alignments where certain sequence relationships are 
studied, both by enumeration and algorithms. The next section gives some results 
on enumeration and algorithms for secondary structures. The final section treats 
restriction maps of DNA and their relationship with the human genome project. 
The emphasis is on the description and straightforward solution of some of the 
related problems. Recently this general area has become increasingly active. 


2. Sequence alignments 


Evolution is a key concept in biology. To understand living organisms, biologists 
study the relationships between the organisms and their environments. Important 
inventions, such as the eye, are maintained and improved on throughout history. 
When these concerns of understanding the how and why of biology are brought 
to a molecular level, the evolutionary mode of thinking is extremely important. 
Certain machinery such as that involved in DNA to protein translation (the genetic 
code) is present in all organisms and works everywhere in essentially the same way. 
These mechanisms are so basic to life and so much additional biological activity 
depends on them that they cannot be modified except in very minor ways. 

Other more recent ‘inventions’ at the molecular level allow us to understand 
the difference between life forms in terms of their history. For example, organisms 
with a nucleus (such as humans) are classified as eukaryotes while those without a 
nucleus (such as E. coli) are classified as prokaryotes. Finer and finer distinctions 
can be made, and classifying organisms goes hand in hand with understanding how 
they function. 

Something of the same approach is taken by biologists in performing DNA 
sequence analysis. Given a sequence x, what known sequences are related to it 
and what are the relationships? Before this question can be explored we need to 
understand what evolutionary events can take place during sequence evolution. 
The simplest event is substitution, where one nucleotide is replaced by another, as 
when A is replaced by C for example. Nucleotides can be inserted into or deleted 
from a sequence, either one nucleotide at a time or in blocks. Insertions and 
deletions greatly complicate the analysis. Inversions and duplications of a block of 
sequence make things even more difficult. 

It is common in molecular biology to try to discover the function of a DNA 
or protein sequence by relating it to other sequences. Frequently this means a 
biologist will compare a sequence with a large number of previously analyzed 
sequences; the comparison is done on a computer using algorithms described 
below. These comparisons are usually done with sequences taken two at a time. 
Often there are families of related sequences where any pair might have a fairly 
weak relationship. Therefore there is a good deal of interest in comparison of more 
than two sequences, often in comparison of several hundred sequences. 

In most sequence analysis the sequence transformations are restricted to 
substitutions, insertions or deletions. The biologist represents his findings in an 
alignment of one sequence written over another, and the sequence transformations 
can be read from the alignment. For example, 


ATTA-CGG 
-CGACC-G 


is an alignment of ATTACGG with CGACCG. From the point of view of taking 
the top sequence as the “original” sequence, this alignment shows the events in an 
evolution of x to y. There has been the deletion of an A and a G, the substitution 
of C and G for the two T's, and the insertion of a C. There is no history recorded 
in an alignment, since there is no information about the timing of the events 
relative to one another nor is it known which sequence “came first”. In fact, some 
other sequence is likely to have been the ancestor of both sequences. In the next 
section we consider some combinatorics motivated by considering the history of 
the events, then we discuss sequence alignment combinatorics and algorithms. 
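
Reading the events off an alignment is mechanical, and a minimal Python sketch may make the convention concrete. The function name `alignment_events` and the event labels are hypothetical choices for this illustration; the top row is treated as the “original” sequence, as in the example above.

```python
def alignment_events(top, bottom, gap="-"):
    """List the events implied by an alignment, reading `top` as the original row."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    events = []
    for column, (a, b) in enumerate(zip(top, bottom), start=1):
        if a == gap and b == gap:
            raise ValueError("a column cannot consist of two gaps")
        if a == gap:
            events.append((column, "insertion", b))      # letter present only below
        elif b == gap:
            events.append((column, "deletion", a))       # letter present only above
        elif a != b:
            events.append((column, "substitution", a + "->" + b))
    return events


events = alignment_events("ATTA-CGG", "-CGACC-G")
for event in events:
    print(event)
```

For the example alignment this lists two deletions (A and G), two substitutions (T to C and T to G) and one insertion (C). The order of the list is column order only; as noted above, an alignment carries no information about timing.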


2.1. Shuffles and alignments 


Let $x = x_1 x_2 \cdots x_n$ and $y = y_1 y_2 \cdots y_m$ be two sequences. The problem under 
consideration here is to count the histories for a special type of evolution: delete all 
the letters of x and insert all the letters of y. The deletion/insertion events take 
place one letter at a time, the events can be performed in any order and it is 
possible to track each nucleotide. Thus for simplicity it is assumed that all n + m letters 
are distinct. The results described in this section are from Greene (1988) where 
material of independent combinatorial interest also appears. While this is a very 
special case of molecular evolution, the possible histories between two sequences 
are of much biological interest. Greene’s work is the first mathematical study of 
this complex problem. 

Define an order by $s \le t$ if s is a subsequence of t. Let {s} denote the set of 
letters in the sequence s, and $s|t$ denote the sequence s restricted to the set {t}. 
A sequence s is on a path between x and y if $\{s\} \subseteq \{x\} \cup \{y\}$ with $s|x \le x$ and 
$s|y \le y$. This set of sequences is denoted W(x, y) and was noted by Greene to be 
shuffles of subsequences of x and y. If we maintain the idea of going “from” x “to” 
y, there is a natural partial order on W(x, y): 


$$ s \preceq t \quad \text{if} \quad \begin{cases} s|x \ge t|x, \\ s|y \le t|y, \\ s|t = t|s. \end{cases} $$


All sequences of the same length have the same order structure so we set 
W (x, y) = W(n, m). For any n and m, W(n,m) is a lattice. 

There are some natural combinatorial questions about W(n,m) such as 
determining $Q_{n,m}$, the number of elements. Greene answers virtually all of these 
questions. We set $C_{n,m}$ equal to the number of maximal chains in W(n,m) and 
define 

$$ \Phi_{n,m}(x) = \sum_{j \ge 0} \binom{n}{j} \binom{m}{j} x^j. $$

This last function is closely related to the Jacobi polynomials and both of the 
quantities of interest can be expressed in terms of it: 

$$ Q_{n,m} = 2^{n+m}\, \Phi_{n,m}(1/4), \qquad C_{n,m} = (n+m)!\, \Phi_{n,m}(1/2). $$
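
For small n and m the two counts can be checked by direct enumeration. The sketch below is an illustration only, not Greene's method: it assumes the function has the form $\Phi_{n,m}(x) = \sum_j \binom{n}{j}\binom{m}{j} x^j$, counts the elements of W(n,m) as interleavings of a subsequence of x with a subsequence of y, and counts maximal chains as histories in which each step deletes one letter of x or inserts one letter of y. All function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


def phi(n, m, x):
    """Phi_{n,m}(x) = sum_j C(n,j) C(m,j) x^j (the form assumed in this sketch)."""
    return sum(comb(n, j) * comb(m, j) * Fraction(x) ** j
               for j in range(min(n, m) + 1))


def elements_formula(n, m):
    return 2 ** (n + m) * phi(n, m, Fraction(1, 4))


def chains_formula(n, m):
    return factorial(n + m) * phi(n, m, Fraction(1, 2))


def elements_direct(n, m):
    # Keep a letters of x, use b letters of y, then interleave the two subsequences.
    return sum(comb(n, a) * comb(m, b) * comb(a + b, a)
               for a in range(n + 1) for b in range(m + 1))


def chains_direct(n, m):
    """Count histories from x to y by exhaustive recursion over intermediate sequences."""
    x = tuple(("x", i) for i in range(n))
    y = tuple(("y", i) for i in range(m))

    @lru_cache(maxsize=None)
    def histories(state):
        if state == y:
            return 1
        total = 0
        # Delete any one remaining letter of x.
        for pos, (kind, _) in enumerate(state):
            if kind == "x":
                total += histories(state[:pos] + state[pos + 1:])
        # Insert any missing letter of y at a position that keeps the y-letters in order.
        present = set(state)
        for letter in y:
            if letter in present:
                continue
            for pos in range(len(state) + 1):
                candidate = state[:pos] + (letter,) + state[pos:]
                y_indices = [i for kind, i in candidate if kind == "y"]
                if y_indices == sorted(y_indices):
                    total += histories(candidate)
        return total

    return histories(x)


for n in range(4):
    for m in range(4):
        print(n, m, elements_direct(n, m), chains_direct(n, m))
```

The enumeration is exponential and is meant only as a sanity check for very short sequences; for n = m = 1 it gives 5 elements and 3 maximal chains.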


Many interesting cases remain to be studied. Simply changing one sequence to 
another by deleting all letters of one sequence and inserting all letters of another 
is not realistic biology. The extension of Greene’s work to allow matching and 
mismatching letters remains to be made; it is likely to be extremely difficult. 


2.2. Sequence alignment 


In this section we will consider alignment of $x = x_1 x_2 \cdots x_n$ and $y = y_1 y_2 \cdots y_n$. The 
sequences are the same length to avoid non-essential complications of the results. 
Both algorithms and combinatorics for alignment are easy if no insertions and 
deletions are allowed. There are simply 2n + 1 ways to align the sequences, one 
over the other. Each of these alignments can be evaluated for quality of matching 
by direct examination of the overlapping portions. Therefore the best alignments 
can be found in O(n^2) time. While this chapter is not intended to be a survey 
of algorithms for alignments, this area is discussed as it is of great importance 
in biology. Also, it motivates some useful combinatorics. Insertions and deletions 
can be included in sequence alignments and best alignments can still be located 
in O(n^2) time. Reviews of the field have appeared in Kruskal and Sankoff (1983) 
and in Waterman (1984, 1989). 

An alignment can be viewed as a way to extend the sequences to be of the same 
length L, equal to the overall length of the alignment. The alignment shown above 


ATTA-CGG 
-CGACC-G 


has length L = 8. Note that the alphabet for the extended sequences has been 
increased by the symbol “-”. 

We now turn to asymptotics for the number of alignments of two sequences of 
length n. The first results for this problem related the number of alignments to 
the Stanton-Cowan numbers (Laquer 1981). One way to count alignments is to 
identify aligned pairs $\binom{x_i}{y_j}$ and simply to choose subsets of x and y to align. This 
gives 

$$ \sum_{k \ge 0} \binom{n}{k} \binom{m}{k} = \binom{n+m}{n} $$

alignments if x has n letters and y has m letters. Recent work has generalized these 
results. Biologists find an alignment more convincing when the matched segments, 
that is segments without insertions or deletions, occur in larger blocks. Let g(b, n) 
be the number of alignments where the matched sections are of length at least b. 
The following appears in Griggs et al. (1986): 

For b ≥ 1 define 

$$ h(x) = (1 - x)^2 - 4x(x^b - x + 1)^2 $$

and let $\rho = \min\{x : h(x) = 0\}$. Then 

$$ g(b, n) \sim (\gamma_b\, n^{-1/2})\, \rho^{-n} \quad \text{as } n \to \infty, $$

where $\gamma_b = (\rho^b - \rho + 1)(-\pi \rho h'(\rho))^{-1/2}$. The proof uses generating functions for 
g(b,n). We remark that the result of Laquer (1981) is given by the above result 
with b = 1. 
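
The constants in this asymptotic formula are easy to evaluate numerically. The sketch below assumes $h(x) = (1-x)^2 - 4x(x^b - x + 1)^2$ and $\gamma_b = (\rho^b - \rho + 1)(-\pi\rho h'(\rho))^{-1/2}$, and, for the case b = 1 only, it assumes that g(1,n) is the central Delannoy (Stanton-Cowan) number $\sum_k \binom{n}{k}\binom{n+k}{k}$. The function names and the scan-then-bisect root finder are choices made for this illustration.

```python
from math import comb, pi, sqrt


def h(x, b):
    return (1 - x) ** 2 - 4 * x * (x ** b - x + 1) ** 2


def h_prime(x, b):
    inner = x ** b - x + 1
    inner_prime = b * x ** (b - 1) - 1
    return -2 * (1 - x) - 4 * inner ** 2 - 8 * x * inner * inner_prime


def smallest_positive_root(b, step=1e-4):
    # h(0) = 1 > 0 and h(1) = -4 < 0, so scanning upward must meet a sign change.
    lo = 0.0
    while h(lo + step, b) > 0:
        lo += step
    hi = lo + step
    for _ in range(80):                      # bisection inside the bracketing step
        mid = (lo + hi) / 2
        if h(mid, b) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def gamma(b, rho):
    return (rho ** b - rho + 1) * (-pi * rho * h_prime(rho, b)) ** -0.5


def central_delannoy(n):
    return sum(comb(n, k) * comb(n + k, k) for k in range(n + 1))


rho = smallest_positive_root(1)
n = 200
estimate = gamma(1, rho) * n ** -0.5 * rho ** -n
ratio = central_delannoy(n) / estimate
print(rho, 3 - 2 * sqrt(2), ratio)
```

For b = 1 the root is $3 - 2\sqrt{2}$, so the growth rate $1/\rho$ is $3 + 2\sqrt{2}$, and the ratio of the exact count to the estimate is close to 1 for moderately large n.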

Next are some results on f(k,n), the number of alignments of k sequences of 
length n (Griggs et al. 1990). A combinatorial argument gives the exponential 
growth rate: for fixed k ≥ 2, 

$$ \lim_{n \to \infty} \ln(f(k,n))/n = \ln(c_k), $$

where $c_k = (2^{1/k} - 1)^{-k}$. It is also possible to show that the asymptotic behavior 
of $c_k$ is equivalent to that of $2^{-1/2} (\ln 2)^{-k} k^k$. 

Employing a saddle point method gives more precise asymptotics for f(k,n). 
For fixed k ≥ 2 let $r = (2^{1/k} - 1)^k$. Then 

$$ f(k,n) = r^{-n}\, n^{-(k-1)/2} \left\{ \left[ r^{1/k}\, 2^{(k^2-1)/(2k)}\, k^{1/2}\, \pi^{(k-1)/2} \right]^{-1} + O(n^{-1/2}) \right\}. $$
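
The growth of f(k,n) can be examined on small cases. The sketch below assumes that an alignment of k sequences corresponds to a lattice path from the origin to (n,...,n) in which every step advances a nonempty subset of the sequences by one letter, which reproduces the Stanton-Cowan numbers for k = 2. The leading constant used in `leading_term`, $[r^{1/k} 2^{(k^2-1)/(2k)} k^{1/2} \pi^{(k-1)/2}]^{-1}$, is an assumption of this sketch and should be checked against Griggs et al. (1990); the function names are hypothetical.

```python
from itertools import product
from math import pi


def count_alignments(k, n):
    """Exact f(k, n) by dynamic programming over the (n+1)^k lattice."""
    moves = [mv for mv in product((0, 1), repeat=k) if any(mv)]
    counts = {(0,) * k: 1}
    # Lexicographic order guarantees every predecessor is filled in first.
    for cell in product(range(n + 1), repeat=k):
        if not any(cell):
            continue
        counts[cell] = sum(
            counts[prev]
            for prev in (tuple(c - d for c, d in zip(cell, mv)) for mv in moves)
            if min(prev) >= 0
        )
    return counts[(n,) * k]


def growth_constant(k):
    return (2 ** (1 / k) - 1) ** -k          # c_k


def leading_term(k, n):
    root = 2 ** (1 / k) - 1                   # r^(1/k)
    constant = 1 / (root * 2 ** ((k * k - 1) / (2 * k)) * k ** 0.5
                    * pi ** ((k - 1) / 2))
    return growth_constant(k) ** n * n ** (-(k - 1) / 2) * constant


for k, n in [(2, 50), (3, 20)]:
    exact = count_alignments(k, n)
    print(k, n, exact / leading_term(k, n))
```

Already for k = 3 the growth constant $c_3$ is about 57 per added letter, which illustrates why exhaustive examination of alignments is out of the question.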


From the asymptotics given here it is clear that it is not possible to just look 
at all possible sequence alignments and pick the preferred ones. It is necessary to 
define an objective function for “good” alignments. Suppose a function s(a, b) is 
given to score the alignment of a and b from the sequence alphabet, and that the 
problem is to find the highest scoring alignments. This score is given by 

$$ S(x, y) = \max_{\text{all alignments}} \sum_{1 \le j \le L} s(x_j^*, y_j^*), $$

where $x_j^*$ and $y_j^*$ are the jth members of the extended sequences. 

A simple dynamic programming method can be used to find the maximum scor- 
ing alignment. 

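Let $x = x_1 x_2 \cdots x_n$ and $y = y_1 y_2 \cdots y_n$. Set $S_{0,j} = \sum_{1 \le k \le j} s(-, y_k)$, $S_{0,0} = 0$, 
$S_{i,0} = \sum_{1 \le k \le i} s(x_k, -)$, and $S_{i,j} = S(x_1 x_2 \cdots x_i, y_1 y_2 \cdots y_j)$. Then $S(x, y) = S_{n,n}$ and 


$$ S_{i,j} = \max \begin{cases} S_{i-1,j} + s(x_i, -), \\ S_{i-1,j-1} + s(x_i, y_j), \\ S_{i,j-1} + s(-, y_j). \end{cases} $$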

The above algorithm aligns two sequences in O(n^2) time and space. Letters 
are inserted or deleted in blocks in biology. For general weighting of these “gaps” 
the dynamic programming algorithm has time O(n^3) (Waterman 1986), while linear 
weighting retains time O(n^2). It can be argued that the weighting should be concave, 
in which case the comparisons can be made in almost the same time. See Waterman (1989), 
Miller and Myers (1988) and Galil and Giancarlo (1989). 
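
The basic recursion, with single-letter insertions and deletions, can be sketched as follows. The scoring function `simple_score` (match +1, mismatch -1, gap -1) is an arbitrary assumption made for the example, not a biologically calibrated choice, and the traceback returns just one of the possibly many optimal alignments.

```python
def simple_score(a, b):
    """Hypothetical scoring: +1 for a match, -1 for a mismatch or a gap."""
    if a == "-" or b == "-":
        return -1
    return 1 if a == b else -1


def global_alignment(x, y, score=simple_score, gap="-"):
    n, m = len(x), len(y)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):                       # boundary: x aligned to gaps
        S[i][0] = S[i - 1][0] + score(x[i - 1], gap)
    for j in range(1, m + 1):                       # boundary: y aligned to gaps
        S[0][j] = S[0][j - 1] + score(gap, y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                S[i - 1][j] + score(x[i - 1], gap),          # delete x_i
                S[i - 1][j - 1] + score(x[i - 1], y[j - 1]),  # align x_i with y_j
                S[i][j - 1] + score(gap, y[j - 1]),          # insert y_j
            )
    # Trace back one optimal path from (n, m) to (0, 0).
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            top.append(x[i - 1]); bottom.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and S[i][j] == S[i - 1][j] + score(x[i - 1], gap):
            top.append(x[i - 1]); bottom.append(gap); i -= 1
        else:
            top.append(gap); bottom.append(y[j - 1]); j -= 1
    return S[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, row_x, row_y = global_alignment("ATTACGG", "CGACCG")
print(best)
print(row_x)
print(row_y)
```

The table has (n+1)(m+1) entries and each is filled in constant time, which is the quadratic time and space bound. Integer scores are used so that the equality tests in the traceback are exact.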

For the case of k sequences of length n the simple algorithm generalizes to 
require O(2^k n^k) time and space. This is computationally impractical and several 
different approaches have been taken to solve this important problem; see 
Waterman (1986) and Waterman and Jones (1990). Some recent approaches 
are now described. 

Carrillo and Lipman (1988) consider the generalization of dynamic programming 
alignments to k sequences. They observe that the score of the projection of a 
multiple alignment onto two of the sequences cannot be more than the score of 
those two sequences aligned by themselves. They exploit this observation to greatly 
reduce the time and storage of multiple sequence alignment. As many as 9 or 10 
sequences might be aligned by their technique. 

Another approach to k-sequence alignment is to build up the multiple alignment 
from two sequence alignments. It is obviously possible to begin with the best- 
aligned sequence pairs and obtain an unsatisfactory result in the end, but some 
groups have made useful algorithms based on this approach. In Waterman and 
Perlwitz (1984) some connections with geometry are explored. Taylor (1987) and 
Vingron and Argos (1989) have excellent programs along these general lines. 

Finally in Waterman (1986) and Waterman and Jones (1990) a different approach 
is taken. The algorithm matches short words of set length and degree of mismatch. 
The words can be matched within a fixed amount of position offset and total score 
is maximized where a score is given to each matching word. 


3. Secondary structure 


When RNA is transcribed from the DNA template, it is single-stranded. That is, 
RNA does not possess a matching or self-complementary strand to pair with it. The 
single-stranded molecule can fold back on itself and when regions of the molecule 
are complementary they can become double-stranded or helical. The pairing rules 
for the sequences are analogous to those for DNA except that T becomes U 
(uracil) in the RNA alphabet: A pairs with U and G pairs with C. In addition, 
frequently G is thought to pair with U. Biologists call the two-dimensional self- 
pairing secondary structure. Without reference to an actual RNA sequence it is 
an interesting problem to enumerate the distinct secondary structures that are 
possible under various restrictions suggested by biology. The number of structures 
for a sequence of length n satisfies a recursion related to the Catalan numbers. 
There is even a vector recursion that is “Catalan-like”. 

Next, a definition of secondary structure is given. Label n points on the x-axis: 
1,2,...,n. The points correspond to the nucleotide sequence of the RNA. Choose 
a subset of 2j points, 0 ≤ 2j ≤ n. The 2j points are arranged into j disjoint pairs 
and the pairs are connected by arcs, subject to the following conditions: 

1. Adjacent points are never connected by an arc. 

2. Any two points connected by an arc must be separated by at least m points. 

3. Arcs cannot intersect. 

Condition 2 comes from restrictions on the bending of the sugar-phosphate 
backbone. In RNA m = 3 or 4 is realistic. Condition 3 comes from eliminating 
structures with “knotted” loops. There are a few examples in biology where con- 
dition 3 is violated and no combinatorics has yet been done for those cases. 

In the next section, computer prediction of secondary structures for RNA 
sequences is briefly discussed. As was the case for sequence alignment, the associated 
dynamic programming algorithms are closely related to enumeration of the 
configurations. 


3.1. Prediction of secondary structure 


Several attempts were made on the secondary structure “problem” before dynamic 
programming was first proposed. The basic problem is to find the minimum free- 
energy structure where negative free energy is assigned to the base pairs and 
positive energy is assigned to end loops, unpaired bases in helical regions, and so 
on. The energy rules are not too well understood. The subject is reviewed in Zuker 
and Sankoff (1984) and here a very simple version of the problem is solved: find 
the secondary structures that have the maximum number of base pairs. 


Theorem 3.1. Let $x = x_1 x_2 \cdots x_n$ be a sequence over {A,C,G,U}, 1 ≤ m, and p: 
{A,C,G,U} × {A,C,G,U} → {0,1}. Define F(i,j) = maximum number of base 
pairs of all secondary structures over $x_i \cdots x_j$, where a pair can be formed if and 
only if p(·,·) = 1. Set F(i,j) = 0 whenever j ≤ m + i. Then 


$$ F(i,j) = \max\{ F(i,j-1),\ (F(i,k-1) + F(k+1,j-1) + 1)\, p(x_k, x_j) : i \le k,\ 1 + k + m \le j \}. $$


Proof. The proof of the recursion is based on the observation that either $x_j$ is 
unpaired or it is paired with a base $x_k$. To satisfy the constraints, m ≤ j − k − 1. 
The boundary conditions simply reflect the fact that no pairs can form unless the 
constraint is satisfied. □ 


The recursion can be performed in O(n^3) time and O(n^2) space. Unfortunately the 
structures predicted by this algorithm usually do not correspond to those known to 
exist in nature and more complicated algorithms must be employed. A very useful 
algorithm has been devised, again based on dynamic programming, that takes time 
O(n^3) and O(n^2) space (Zuker and Sankoff 1984). This method employs a shortcut 
and until recently no polynomial-time solution was known for the general problem. 
In Waterman and Smith (1986) a general solution was given that takes time O(n^4) 
and space O(n). Since sequences of interest are often 5000 long and range up to 
20000, there is a need for more work in this area. See Galil and Giancarlo (1989). 
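
The recursion of Theorem 3.1 translates directly into a table-filling procedure. The sketch below is a minimal illustration of the maximum base pair problem only, not of the energy-based methods cited above. The pairing rule `CAN_PAIR` (A-U, G-C and G-U) and the function name `max_base_pairs` are assumptions made for the example.

```python
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(x, m, can_pair=lambda a, b: (a, b) in CAN_PAIR):
    """Maximum number of base pairs when paired bases are separated by at least m bases."""
    n = len(x)
    F = [[0] * (n + 2) for _ in range(n + 2)]       # 1-indexed table

    def get(i, j):
        # Boundary condition: F(i, j) = 0 whenever j <= m + i.
        return F[i][j] if j > i + m else 0

    for span in range(m + 1, n):                    # span = j - i
        for i in range(1, n - span + 1):
            j = i + span
            best = get(i, j - 1)                    # x_j unpaired
            for k in range(i, j - m):               # x_j paired with x_k, k + m + 1 <= j
                if can_pair(x[k - 1], x[j - 1]):
                    best = max(best, get(i, k - 1) + get(k + 1, j - 1) + 1)
            F[i][j] = best
    return get(1, n)


print(max_base_pairs("GGAAACC", 3))
print(max_base_pairs("GGGAAAUCCCAGCUUAGC", 3))
```

The three nested loops make the cubic running time visible: there are O(n^2) table entries and each takes a maximum over O(n) choices of k.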


3.2. Counting secondary structures 


Let $S_n(m) = S_n$ be the number of secondary structures possible for a string or 
sequence of length n. For this discussion the structures need only satisfy the 
conditions stated above; no sequence-specific pairing is considered. The results are 
taken from Stein and Waterman (1978) and Howell et al. (1980). 


Theorem 3.2. For 1 ≤ m, $S_0 = S_1 = \cdots = S_{m-1} = 0$ and $S_m = 1$ are boundary 
values. Then 


$$ S_{m+j} = S_{m+j-1} + S_{m+j-2} + \cdots + S_{j-1} + \sum_{0 \le i \le m+j-2} S_i S_{m+j-2-i}. $$


Proof. The proof is similar to the proof of the algorithm given for Theorem 3.1. 
Consider adding the base m + j. If base m + j does not pair we have $S_{m+j-1}$ 
structures. Otherwise the base pairs with a base with subscript i + 1 from 1 to m + j − 2 
and the number of structures is the product of the possibilities from the strings 
1···i and i+2···m+j−1. The boundary conditions give the recursion in the 
above form. □ 
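
The recursion can be compared with a brute-force count for short sequences. In the sketch below the boundary values $S_0 = \cdots = S_{m-1} = 0$ and $S_m = 1$ are taken as a bookkeeping convention, so the comparison with actual structure counts is made only for lengths n ≥ m. The brute-force side enumerates every set of disjoint pairs and keeps those meeting the three conditions of the definition; the function names are hypothetical.

```python
def structure_counts(n_max, m):
    """S_0..S_{n_max} from the recursion, with S_i = 0 for i < m and S_m = 1."""
    S = [0] * (n_max + 1)
    S[m] = 1
    for n in range(m + 1, n_max + 1):
        j = n - m
        S[n] = (S[n - 1]
                + sum(S[t] for t in range(j - 1, n - 1))          # S_{j-1} .. S_{m+j-2}
                + sum(S[i] * S[n - 2 - i] for i in range(n - 1)))  # convolution term
    return S


def matchings(points):
    """Yield every set of disjoint pairs on the given tuple of points."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    yield from matchings(rest)                      # `first` stays unpaired
    for idx, partner in enumerate(rest):
        for tail in matchings(rest[:idx] + rest[idx + 1:]):
            yield [(first, partner)] + tail


def is_secondary_structure(pairs, m):
    separation = max(m, 1)                          # adjacent points never pair
    if any(b - a - 1 < separation for a, b in pairs):
        return False
    # Arcs (a, b) and (c, d) intersect when a < c < b < d.
    return not any(a < c < b < d for a, b in pairs for c, d in pairs)


def brute_force_count(n, m):
    points = tuple(range(1, n + 1))
    return sum(1 for pairs in matchings(points) if is_secondary_structure(pairs, m))


for m in (1, 2):
    S = structure_counts(8, m)
    print(m, S, [brute_force_count(n, m) for n in range(m, 9)])
```

For m = 1 both computations give 1, 1, 2, 4, 8, 17, 37, 82 for lengths 1 through 8.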


If we set m = 0 and consider the above recursion, 

$$ S_n(0) = S_n = S_{n-1} + \sum_{0 \le j \le n-2} S_j S_{n-2-j}, $$

where $S_0(0) = 1$. This recursion generates the Motzkin numbers (Sloane 1973) and 
they have an explicit solution: $S_n(0) = \sum_{j \ge 0} c_{j+1} \binom{n}{2j}$. This formula is a consequence 
of Theorem 3.3 below, where we define the Catalan numbers $c_{j+1}$ by 


$$ c_{j+1} = \frac{1}{j+1} \binom{2j}{j}. $$


Next define the convolved Fibonacci numbers $f_n(r,k)$ by 


$$ (1 - x - x^2 - \cdots - x^r)^{-k} = \sum_{n \ge 0} f_n(r,k)\, x^n. $$

Theorem 3.3. [...] $f_n(m+1, 2j+1)\, x^n$ [...] 

Proof. [...] the generating function, using the [...]. 

Next asymptotics for $S_n$ are presented. [...] theorem of Bender (1974). 

Theorem 3.4. Define $F(r,s) = r^2 s^2 - (1 - r - r^2 - \cdots - r^{m+1})s + r^m$. Let $r > 0$, $s > S_0$ 
be the unique real solutions of the system $F(r,s) = 0$, $F_s(r,s) = 0$. Then 


$$ S_n \sim \left( \frac{r F_r(r,s)}{2\pi F_{ss}(r,s)} \right)^{1/2} n^{-3/2}\, r^{-n}. $$


The following special cases hold: 
1. $S_n(0) \sim \sqrt{3/(4\pi)}\; n^{-3/2}\, 3^{n+1}$. 
2. $S_n(1) \sim \sqrt{(15 + 7\sqrt{5})/(8\pi)}\; n^{-3/2}\, ((3 + \sqrt{5})/2)^n$. 
3. $S_n(2) \sim \sqrt{(1 + \sqrt{2})/\pi}\; n^{-3/2}\, (1 + \sqrt{2})^n$. 


The behavior of $S_n(m)$ is governed by $r(m)^{-n}$ and r(m) can only be numerically 
determined for m ≥ 3. Still, it can be shown that r(m) is monotonically increasing 
and r(m) → 1/2 as m → ∞. 
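
The numerical determination of r(m) can be sketched as follows. The code assumes $F(r,s) = r^2 s^2 - (1 - r - \cdots - r^{m+1})s + r^m$; eliminating s from $F = F_s = 0$ then gives $s = r^{(m-2)/2}$ and the single equation $1 - r - \cdots - r^{m+1} = 2r^{(m+2)/2}$, which is solved by bisection on (0, 1/2). The exact counts come from the recursion of Theorem 3.2 with its boundary convention. Function names are hypothetical.

```python
from math import pi, sqrt


def structure_counts(n_max, m):
    """S_n from the recursion; for m = 0 this produces the Motzkin numbers."""
    S = [0] * (n_max + 1)
    S[m] = 1
    for n in range(m + 1, n_max + 1):
        j = n - m
        S[n] = (S[n - 1]
                + sum(S[t] for t in range(j - 1, n - 1))
                + sum(S[i] * S[n - 2 - i] for i in range(n - 1)))
    return S


def dominant_r(m):
    def g(r):
        return 1 - sum(r ** i for i in range(1, m + 2)) - 2 * r ** ((m + 2) / 2)
    lo, hi = 0.0, 0.5                     # g(0) = 1 > 0 and g(1/2) < 0
    for _ in range(100):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def leading_constant(m, r):
    s = r ** ((m - 2) / 2)
    F_r = (2 * r * s * s
           + sum(i * r ** (i - 1) for i in range(1, m + 2)) * s
           + m * r ** (m - 1))
    F_ss = 2 * r * r
    return sqrt(r * F_r / (2 * pi * F_ss))


n = 500
for m in range(0, 6):
    r = dominant_r(m)
    estimate = leading_constant(m, r) * n ** -1.5 * r ** -n
    print(m, r, 1 / r, structure_counts(n, m)[n] / estimate)
```

The printed values of r(m) increase with m toward 1/2, so the exponential growth rate 1/r(m) decreases toward 2, and the ratio of exact count to estimate is close to 1.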

Next a closer examination of $S_n(1) = S_n$ is made. Set 


$$ R^n = (r_0^n, r_1^n, r_2^n, \ldots) $$


where $r_0^n = 1$ and $r_i^n$ is the number of secondary structures with i base pairs for a 
sequence of length n. Also define a * b = c by 


$$ c_k = \sum_{0 \le i \le k} a_i b_{k-i}, $$

and let $\phi(a_0, a_1, \ldots) = (0, a_0, a_1, \ldots)$. 

Theorem 3.5. Set $R^0 = R^1 = R^2 = (1,0,0,\ldots)$. Then for n ≥ 2, 

$$ R^{n+1} = R^n + \sum_{1 \le j \le n-1} \phi(R^{j-1} * R^{n-j}). $$


Proof. There can be no pairs for n ≤ 2, so the boundary conditions hold. Next, 
the number of structures with i + 1 pairs for a sequence of length n + 1 is derived. 
If base n + 1 is unpaired, $r_{i+1}^n$ structures exist with i + 1 pairs. If instead base n + 1 
is paired with base j, then to have i + 1 pairs i additional pairs are needed. If k 
pairs exist for bases 1 to j − 1, then i − k pairs must exist for bases j + 1 to n. □ 
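
The vector recursion is a convolution followed by a shift, which a short sketch can make explicit. It assumes the case m = 1 (paired bases are merely non-adjacent) and stores each vector $R^n$ as a finite Python list without trailing zeros; the function names are hypothetical.

```python
def convolve(a, b):
    """c_k = sum_i a_i * b_{k-i}."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


def shift(a):
    """phi(a_0, a_1, ...) = (0, a_0, a_1, ...): one more base pair."""
    return [0] + list(a)


def add(a, b):
    length = max(len(a), len(b))
    a = list(a) + [0] * (length - len(a))
    b = list(b) + [0] * (length - len(b))
    return [u + v for u, v in zip(a, b)]


def pair_count_vectors(n_max):
    """Return [R^0, ..., R^n_max]; entry i of R^n counts structures with i pairs."""
    R = [[1], [1], [1]]
    for n in range(2, n_max):
        total = list(R[n])                       # base n+1 unpaired
        for j in range(1, n):                    # base n+1 paired with base j <= n-1
            total = add(total, shift(convolve(R[j - 1], R[n - j])))
        R.append(total)                          # this is R^{n+1}
    return R[:n_max + 1]


for n, vector in enumerate(pair_count_vectors(8)):
    print(n, vector, sum(vector))
```

Summing the entries of $R^n$ recovers the total count for m = 1; for example $R^5 = (1, 6, 1)$, whose entries sum to 8.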


4. Maps of DNA 


For many years maps of the relative location of genes on chromosomes have been 
constructed by a technique known as linkage analysis. Before 1980 it was required 
that the genes have observable mutations available as genetic markers. Since fruit 
flies have many visibly distinguishable mutants, the relative locations of many of 
the corresponding genes have been mapped. Bacteria and yeast have also been 
extensively mapped. In 1980 it was realized that measurable changes in the DNA 
itself can be used for genetic mapping (Botstein et al. 1980). The changes are in the 
lengths of restriction fragments and the resulting mapping techniques have been 
very important in localizing genes associated with major diseases. 

Site-specific restriction enzymes were discovered in 1970 in bacteria (Nathans 
and Smith 1975). These enzymes cut double-stranded DNA at the locations of short 
specific patterns, usually from four to six letters in length. The restriction enzyme 
HhaI cuts at GCGC while EcoRI cuts at GAATTC. Mutations at a single letter 
of DNA can cause the appearance or disappearance of a restriction site. It is easy 
to see that insertions and deletions of a segment of DNA can also cause variation 
in the restriction sites. The fragment lengths then become the genetic markers for 
linkage analysis. This is a quite active research area and there is currently some 
mathematical activity in devising efficient algorithms (Lander and Botstein 1986). 

In linkage analysis, map distance might not relate linearly to physical distance 
or number of bases. Soon after the discovery of site-specific restriction enzymes 
biologists learned to construct another type of map known as a restriction map. 
In restriction maps all the enzyme sites are approximately located on the DNA. 
Such maps usually cover a few thousand bases of DNA but much longer stretches 
of DNA have been mapped (Isono et al. 1987). This section will discuss in some 
detail the difficulties in restriction map construction. First, restriction maps are 
related to interval graphs. 


4.1. Maps as interval graphs 


Interval graph theory began with Benzer’s study of the structure of genes in bac- 
teria (Benzer 1959). Benzer was able to obtain data on the overlap between pairs 
of fragments of DNA from a gene. He was successful in arranging the overlap 
data in a way that implied the linear nature of the gene. Soon after this, Fulkerson 
and Gross (Golumbic 1980) studied interval graphs and incidence matrices; that 
study is closely related to Benzer’s analysis. Today the linear nature of the gene is 
well established but interval graphs also arise in connection with restriction maps 
(Waterman and Griggs 1986). 

Representing an interval of DNA as a line segment, the biologist indicates the 
location of restriction sites along the line segment. Circular DNA does occur in 
nature but this discussion is restricted to the linear maps of two restriction enzymes. 
Next the two restriction enzymes are designated by A and B. Figure 1a gives an 
example of a two-enzyme A/B map while the single-enzyme maps appear in fig. 
1b. Biologists are able to measure the lengths but not the order of the intervals 
between sites, so they are labeled arbitrarily. The intervals are called restriction 
fragments and form the nodes of our graphs. 

Label the ith fragment from enzyme A (B) by $A_i$ ($B_i$). Define the incidence 
matrices I(A, B), I(A, A/B), and I(B, A/B) where, for example, $I(A,B)_{ij} = 1$ if 
$A_i \cap B_j \ne \emptyset$ and 0 otherwise. It is sometimes experimentally feasible to determine 
I(A, A/B) and I(B, A/B). It is then a simple matter to compute I(A, B). 


Proposition 4.1. If $I^T$ is the transpose of I, then 

$$ I(A, B) = I(A, A/B)\, I^T(B, A/B). $$


Proof. The result follows from the observation that the (i,j)th element of the 
matrix product is equal to the number of A/B intervals in both $A_i$ and $B_j$. □ 
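
The proposition can be checked on a simulated map. In the sketch below the cut-site positions are invented for the illustration (in an experiment they are exactly what is unknown), fragments are treated as open intervals, and the helper names are hypothetical.

```python
def fragments(sites, length):
    """Intervals between consecutive cut sites on a linear molecule of the given length."""
    cuts = [0] + sorted(sites) + [length]
    return list(zip(cuts[:-1], cuts[1:]))


def incidence(rows, cols):
    """Entry (i, j) is 1 when open interval rows[i] meets open interval cols[j]."""
    return [[1 if max(r0, c0) < min(r1, c1) else 0 for (c0, c1) in cols]
            for (r0, r1) in rows]


def times_transpose(X, Y):
    """Matrix product X * Y^T for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row_x, row_y)) for row_y in Y] for row_x in X]


length = 19
a_sites, b_sites = [12, 13, 16], [6, 8, 11, 15, 18]      # hypothetical site positions
A = fragments(a_sites, length)
B = fragments(b_sites, length)
AB = fragments(set(a_sites) | set(b_sites), length)      # double-digest fragments

I_AB = times_transpose(incidence(A, AB), incidence(B, AB))
for row in I_AB:
    print(row)
```

Each entry of the product is 0 or 1 because the intersection of an A fragment with a B fragment, when nonempty, is a single A/B fragment.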


Figure 1. The two-enzyme A/B map (a) and the single-enzyme (A and B) maps (b). 


Constructing a restriction map from I(A, B) is equivalent to finding an interval 
representation of a bipartite graph G(A, B) defined in a natural way. The vertex 
set V(A, B) is the union of the set of A intervals with the set of B intervals; the 
edge set E(A, B) consists of those sets $\{A_i, B_j\}$ where $A_i \cap B_j \ne \emptyset$. If we delete 
the endpoints of the fragments from the line segment, we obtain an open-interval 
representation of G(A, B). Restriction maps can be characterized by known results 
on interval graphs (Golumbic 1980). 


Theorem 4.2. The following are equivalent: 

1. G(A, B) is a bipartite graph constructed from a restriction map. 

2. G(A, B) is a bipartite interval graph with no isolated edges. 

3. I(A, B) can be transformed by row and column permutations into a staircase 
form with each row or column having 1’s in precisely one of the steps. 


With the identification of G(A, B) as an interval graph, it is routine to adapt a 
general algorithm of Booth and Lueker to recognize G as an interval graph and 
to give its representation in linear time (Booth and Lueker 1976, Waterman and 
Griggs 1986). As is the case in many problems in biology, the overlap data is usually 
given with errors. Then the problem of finding the “interval graph” becomes much 
harder. 


4.2. Constructing maps 


It is experimentally possible to apply restriction enzymes singly or in combination, 
and to estimate the lengths of the resulting fragments of DNA. The problem is 
to construct the map of location of the enzyme sites along the DNA from this 
fragment-length data. The results are from Goldstein and Waterman (1987). 


4.2.1. Simulated annealing 

Here we consider the simplest problem of interest that involves linear DNA, two 
restriction enzymes, and no measurement error. We will refer to this problem as 
the double-digest problem or problem DDP. A restriction enzyme cuts a piece of 
DNA of length L at all occurrences of a short specific pattern and the lengths 
of the resulting fragments are recorded. In the double-digest problem we have as 
data the list of fragment lengths when each enzyme is used singly, say, 


$$ A = \{a_i : 1 \le i \le n\} \text{ from the first digest}, $$

$$ B = \{b_i : 1 \le i \le m\} \text{ from the second digest}, $$


as well as a list of double-digest fragment lengths when the restriction enzymes 
are used in combination and the DNA is cut at all occurrences specific to both 
patterns, say 


$$ C = \{c_i : 1 \le i \le n_{1,2}\}; $$


only length information is obtained. In general A, B, and C will be multisets; that 
is, there may be values of fragment lengths that occur more than once. We adopt 
the convention that the sets A, B, and C are ordered; that is, $a_i \le a_j$ for $i \le j$, 
and similarly for the sets B and C. Of course 


$$ \sum_{1 \le i \le n} a_i = \sum_{1 \le i \le m} b_i = \sum_{1 \le i \le n_{1,2}} c_i = L, $$


since we are assuming that fragment lengths are measured in number of bases with 
no errors. 


Given the above data, the problem is to find orderings for the sets A and B such 
that the double-digest implied by these orderings is, in a sense made precise below, 
C. This is a mathematical statement of a problem originally solved by exhaustive 
search. 


The double-digest problem can be stated more precisely as follows. For 
permutations $\sigma \in (12 \cdots n)$, $\mu \in (12 \cdots m)$, call $(\sigma, \mu)$ a configuration. By ordering A and 
B according to σ and μ, respectively, the set of locations of cut sites is obtained: 


$$ S = \Big\{ s : s = \sum_{1 \le j \le r} a_{\sigma(j)} \ \text{or}\ s = \sum_{1 \le j \le t} b_{\mu(j)};\ 0 \le r \le n,\ 0 \le t \le m \Big\}. $$


The set S is not allowed repetitions; that is, S is not a multiset. Now label the 
elements of S such that 


$$ S = \{ s_j : 0 \le j \le n_{1,2} \} \quad \text{with } s_i \le s_j \text{ for } i \le j. $$

The double-digest implied by the configuration $(\sigma, \mu)$ can be defined by 

$$ C(\sigma, \mu) = \{ c_i(\sigma, \mu) : c_i(\sigma, \mu) = s_j - s_{j-1} \text{ for some } 1 \le j \le n_{1,2} \}, $$


where it is assumed as usual that the set is ordered in the index i. The problem 
then is to find a configuration $(\sigma, \mu)$ such that $C = C(\sigma, \mu)$. As discussed below, 
this problem lies in the class of NP-complete problems conjectured to have no 
polynomial-time solution. 

In order to implement a simulated annealing algorithm, an energy function and 
a neighborhood structure are required. The energy function is a chi-square-like 
function 


$$ f(\sigma, \mu) = \sum_{1 \le i \le n_{1,2}} (c_i(\sigma, \mu) - c_i)^2 / c_i. $$


Note that if all measurements are free of error then f attains its global minimum 
value of zero for at least one choice $(\sigma, \mu)$. Following Goldstein and Waterman 
(1987), we define the set of neighbors of a configuration $(\sigma, \mu)$ by 


$$ N(\sigma, \mu) = \{ (\tau, \mu) : \tau \in N(\sigma) \} \cup \{ (\sigma, \nu) : \nu \in N(\mu) \}, $$


where $N(\rho)$ are the neighbors used in studies of the travelling salesman problem 
(Bonomi and Lutton 1984). 

With these ingredients, the algorithm was tested on exact, known data from 
the bacteriophage lambda with restriction enzymes BamHI and EcoRI, yielding a 
problem size of |A|! x |B|! = 6!6! = 518 400. See Daniels et al. (1983) for the com- 
plete sequence and map information about lambda. Temperature was not lowered 
at the rate c/log(n) as suggested by the theorem in Geman and Geman (1984), but 
for reasons of practicality was instead lowered exponentially. On three separate 
trials using various annealing schedules the solution was located after 29 702, 6895, 
and 3670 iterations from random initial configurations. 
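
A simulated annealing search of this kind can be sketched in a few dozen lines. The following is an illustration under stated assumptions, not the implementation of Goldstein and Waterman (1987): the neighborhood is taken to be reversal of a random segment of one of the two orderings (a common move in travelling salesman studies), fragments left unmatched when the implied digest and C differ in size are penalized by their own length, and the cooling schedule, step count and all names are arbitrary choices. The test data are the small example of section 4.2.2 rather than the lambda data.

```python
import math
import random


def implied_double_digest(a_order, b_order):
    """Sorted fragment lengths obtained by cutting at the sites of both orderings."""
    sites = {0}
    for order in (a_order, b_order):
        position = 0
        for fragment in order:
            position += fragment
            sites.add(position)                 # coincident sites collapse in the set
    cuts = sorted(sites)
    return sorted(right - left for left, right in zip(cuts, cuts[1:]))


def energy(a_order, b_order, C):
    """Chi-square-like energy; zero exactly when the implied digest equals C."""
    implied = sorted(implied_double_digest(a_order, b_order), reverse=True)
    target = sorted(C, reverse=True)
    matched = sum((ci - c) ** 2 / c for ci, c in zip(implied, target))
    unmatched = implied[len(target):] + target[len(implied):]   # assumption: length penalty
    return matched + sum(unmatched)


def random_neighbor(a_order, b_order, rng):
    orders = [list(a_order), list(b_order)]
    chosen = orders[rng.randrange(2)]
    if len(chosen) >= 2:
        i, j = sorted(rng.sample(range(len(chosen)), 2))
        chosen[i:j + 1] = reversed(chosen[i:j + 1])             # segment reversal
    return tuple(orders[0]), tuple(orders[1])


def anneal(A, B, C, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    rng = random.Random(seed)
    a, b = list(A), list(B)
    rng.shuffle(a)
    rng.shuffle(b)
    current = (tuple(a), tuple(b))
    current_e = energy(*current, C)
    best, best_e = current, current_e
    temperature = t0
    for _ in range(steps):
        if best_e == 0:
            break
        candidate = random_neighbor(*current, rng)
        candidate_e = energy(*candidate, C)
        # Accept improvements always, and uphill moves with the Metropolis probability.
        if candidate_e <= current_e or rng.random() < math.exp(
                (current_e - candidate_e) / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        temperature *= cooling                                   # exponential cooling
    return best, best_e


A = [1, 3, 3, 12]
B = [1, 2, 3, 3, 4, 6]
C = [1, 1, 1, 1, 2, 2, 2, 3, 6]
configuration, final_energy = anneal(A, B, C)
print(configuration, final_energy)
```

A returned energy of zero means the configuration reproduces C exactly; a positive value means the run ended in a non-solution and a different seed or schedule would be needed.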


4.2.2. Multiplicity of solutions 


In many instances, the solution to the double-digest problem is not unique. Con- 
sider, for example, 


A = {1, 3, 3, 12}, 
B = {1, 2, 3, 3, 4, 6}, 
C = {1, 1, 1, 1, 2, 2, 2, 3, 6}. 

This problem, of size 4!6!/2!2! = 4320, admits 208 distinct solutions. That is, 
there are 208 distinct orders which produce C. We now demonstrate that this 
phenomenon is far from isolated. 
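
For an instance this small, multiplicity can be explored by exhaustive search. The sketch below enumerates the distinct orderings of A and B (identical fragment lengths are not distinguished, which gives the problem size 4320) and keeps the configurations whose implied double digest equals C. The function names are hypothetical, and the count printed depends on this particular notion of a distinct configuration.

```python
from itertools import permutations


def implied_double_digest(a_order, b_order):
    """Sorted fragment lengths obtained by cutting at the sites of both orderings."""
    sites = {0}
    for order in (a_order, b_order):
        position = 0
        for fragment in order:
            position += fragment
            sites.add(position)
    cuts = sorted(sites)
    return sorted(right - left for left, right in zip(cuts, cuts[1:]))


def solve_by_exhaustion(A, B, C):
    target = sorted(C)
    configurations = [(a, b)
                      for a in sorted(set(permutations(A)))
                      for b in sorted(set(permutations(B)))]
    solutions = [cfg for cfg in configurations
                 if implied_double_digest(*cfg) == target]
    return configurations, solutions


A = [1, 3, 3, 12]
B = [1, 2, 3, 3, 4, 6]
C = [1, 1, 1, 1, 2, 2, 2, 3, 6]
configurations, solutions = solve_by_exhaustion(A, B, C)
print(len(configurations), len(solutions))
print(solutions[0])
```

One solution is the A ordering (12, 1, 3, 3) with the B ordering (6, 2, 3, 4, 3, 1): the cut sites are 12, 13, 16 and 6, 8, 11, 15, 18, and the nine gaps between consecutive sites are the elements of C. Reversing both orderings of any solution gives another solution.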


Below, we use the Kingman subadditive ergodic theorem to prove that the num- 
ber of solutions to the double-digest problem increases exponentially as a function 
of length under the probability model stated below. 

For reference, a version of the subadditive ergodic theorem is given here 
(Kingman 1973). For s, t non-negative integers with 0 ≤ s ≤ t, let $X_{s,t}$ be a collection of 
random variables which satisfy 

1. whenever s < t < u, $X_{s,u} \le X_{s,t} + X_{t,u}$; 

2. the joint distribution of $\{X_{s,t}\}$ is the same as that of $\{X_{s+1,t+1}\}$; 

3. the expectation $g_t = E[X_{0,t}]$ exists and satisfies $g_t \ge -Kt$ for some constant 
K and all t > 1. 

Then the finite limit $\lim_{t \to \infty} X_{0,t}/t = \lambda$ exists with probability one and in the mean. 


Theorem 4.3. Assume the sites for two restriction enzymes are independently 
distributed with cut probabilities $p_1$, $p_2$ respectively and $p_i \in (0,1)$. Let $Y_{s,t}$ be the 
number of solutions between the sth and the tth coincident sites. Then there is a 
constant λ > 0 such that 

$$ \lim_{t \to \infty} \frac{\log Y_{0,t}}{t} = \lambda. $$


Proof. Let a coincidence be defined to be the event that a site is cut by both 
restriction enzymes; such an event occurs at each site independently with probability 
$p_1 p_2 > 0$, and at site 0 by definition. On the sites 1,2,3,..., there will be an 
infinite number of such events. For s,u = 0,1,2,..., 0 ≤ s ≤ u we may consider the 
double-digest problem for only that segment located between the sth and uth 
coincidences. Let $Y_{s,u}$ denote the number of solutions to the double-digest problem 
for this segment. 

Suppose s < t < u. A solution for the segment between the sth and tth 
coincidences and a solution for the segment between the tth and uth coincidences can be 
combined to yield a solution for the segment between the sth and uth coincidences. 
Thus 


$$ Y_{s,u} \ge Y_{s,t}\, Y_{t,u}. $$


We note that the inequality may be strict as $Y_{s,u}$ counts solutions given by orderings 
where fragments initially between, say, the sth and tth coincidences now appear 
in the solution between the tth and uth coincidences. Letting 


$$ X_{s,t} = -\log Y_{s,t}, $$


we have s < t < u implies $X_{s,u} \le X_{s,t} + X_{t,u}$. 
Additional technical details can be established to complete the proof of the 
theorem. □ 


4.2.3. Computational complexity 

Given the definition of a restriction map as permutations of the various digest 
fragments, it is no surprise that the double-digest problem is NP-complete. See 
Garey and Johnson (1979) for definitions. 


Theorem 4.4. The double-digest problem is NP-complete. 
Proof. It is clear that the DDP described above is in the class NP, as a
nondeterministic algorithm need only guess a configuration $(\sigma, \mu)$ and check in polynomial
time if $C(\sigma, \mu) = C$. The number of steps to check this is in fact linear. To show
that DDP is NP-complete, the partition problem is transformed to DDP.

In the partition problem, known to be NP-complete (Garey and Johnson 1979),
a finite set $Q$, say $|Q| = n$, is given along with a positive integer $s(q)$ for each $q \in Q$,
and we wish to determine whether there exists a subset $Q' \subseteq Q$ such that

$$\sum_{q \in Q'} s(q) = \sum_{q \in Q \setminus Q'} s(q).$$

If $\sum_{q \in Q} s(q) = J$ is not divisible by two, there can be no such subset $Q'$. Otherwise,
input to problem DDP the data

$A = \{s(q_k) : 1 \le k \le n\}$,
$B = \{J/2, J/2\}$,
$C = \{s(q) : q \in Q\}$.

It is clear that any solution to problem DDP with this data yields a solution to
the partition problem through the order of the implied digest $C$. Therefore DDP
is NP-complete. □
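
The reduction used in this proof can be tried out on toy instances. The following is a minimal Python sketch and not part of the original argument: the function names `double_digest`, `solve_ddp` and `partition_via_ddp` are hypothetical, fragment lengths are assumed to be positive integers, and the exhaustive search over permutations is exponential, as one would expect for an NP-complete problem.

```python
from itertools import accumulate, permutations


def double_digest(order_a, order_b):
    """Fragment lengths obtained when both sets of cut sites are used at once."""
    sites = sorted({0, *accumulate(order_a), *accumulate(order_b)})
    return sorted(right - left for left, right in zip(sites, sites[1:]))


def solve_ddp(a, b, c):
    """Brute-force search for orderings of a and b whose double digest is c."""
    target = sorted(c)
    for order_a in set(permutations(a)):
        for order_b in set(permutations(b)):
            if double_digest(order_a, order_b) == target:
                return order_a, order_b
    return None


def partition_via_ddp(sizes):
    """Decide the partition problem through the DDP instance A, B, C of the proof."""
    total = sum(sizes)
    if total % 2:
        return None                      # J is odd: no partition exists
    solution = solve_ddp(sizes, [total // 2, total // 2], sizes)
    if solution is None:
        return None
    order_a, _ = solution
    half, running, left = total // 2, 0, []
    for size in order_a:                 # fragments lying left of the cut at J/2
        if running == half:
            break
        left.append(size)
        running += size
    return left
```

For example, `partition_via_ddp([3, 1, 1, 2, 2, 1])` returns one half of a balanced partition, while `partition_via_ddp([1, 1, 4])` returns `None`.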


Biologists have routinely been solving DDPs for inexact length measurements. 
They of course are generally unaware of the results presented here, and search for 
the length and enzymes that allow a solution. To contribute usefully to this field 
the challenge is to find algorithms that will extend their capabilities. 
It is not surprising that computer science is perhaps the most important field of
applications of combinatorial ideas. Modern computers operate in a discrete fashion
both in time and space, and much of classical mathematics must be “discretized” 
before it can be implemented on computers as, for example, in the case of numer- 
ical analysis. The connection between combinatorics and computer science might 
be even stronger than suggested by this observation; each field has profited from 
the other. Combinatorics was the first field of mathematics where the ideas and 
concepts of computer science, in particular complexity theory, had a profound im- 
pact. This framework for much of combinatorics has been surveyed in chapter 29. 
In this chapter, we illustrate what computer science profits from combinatorics: we 
have collected a number of examples, all of them rather important in computer 
science, where methods and results from classical discrete mathematics play a cru- 
cial role. Since many of these examples rely on concepts from theoretical computer 
science that have been discussed in chapter 29, the reader is encouraged to refer 
to that chapter for background material. 


1. Communication complexity 


There are many situations where the amount of communication between two pro- 
cessors jointly solving a certain task is the real bottleneck; examples range from 
communication between the earth and a rocket approaching Jupiter to communica- 
tion between different parts of a computer chip. We shall see that communication 
complexity also plays an important role in theoretical studies, and in particular, in 
the complexity theory of circuits. Other examples of such indirect applications of 
communication complexity include bounds on the depth of decision trees (Hajnal 
et al. 1988) and pseudorandom number generation (Babai et al. 1989). Communica- 
tion complexity is a much simpler and cleaner model than computational or circuit 
complexity, and it illustrates notions from complexity theory like non-determinism 
and randomization in a particularly simple way. Our interest in this field also stems 
from the many ways in which it relates to combinatorial problems and methods. 
This section gives just a glimpse of this theory; see Lovász (1990) for a broader
survey. 

Suppose that there are two players, Alice and Bob, who want to evaluate a 
function f(x,y) in two variables; for simplicity, we assume that the value of f is 
a single bit. Alice knows the value of the variable x, Bob knows the value of the 
variable y, and they can communicate with each other. Local computation is free, 
but communication is costly. What is the minimum number of bits that they have 
to communicate? 

We can describe the problem by a matrix as follows. Let $a_1,\dots,a_N$ be the
possible inputs of Alice and $b_1,\dots,b_M$ the possible inputs of Bob. Note that since
local computation is free, we need not worry about how these are encoded. Let
$c_{ij} = f(a_i, b_j)$. The 0-1 matrix $C = (c_{ij})_{i=1,\,j=1}^{N,\;M}$ determines the communication
problem. Both players know the matrix $C$; Alice also knows the index $i$ of a row, Bob
knows the index $j$ of a column, and they want to determine the entry $c_{ij}$.
To solve their task for a particular matrix $C$, Alice and Bob, before learning
their inputs $i$ and $j$, agree in advance on a protocol, which is the communication
analogue to the fundamental notion of an algorithm in computational complexity
theory. Informally, a communication protocol is a set of rules specifying the order 
and “meaning” of the messages sent. The protocol prescribes each action for Alice 
and Bob: who is to send the first bit; depending on the input of that processor, 
what this bit should be; depending on this bit, who is to send the second bit, and 
depending on the input of that processor and on the first bit sent, what this second 
bit should be, and so on. The protocol terminates when one processor knows the 
output bit and the other one knows this about the first one. The complexity of a 
protocol is the number of bits communicated in the worst case. 

The trivial protocol is that Alice tells her input to Bob. We shall see that
sometimes there is no better protocol than this trivial one. This protocol takes $\lceil \log_2 N \rceil$
bits; if $M < N$ then the reverse trivial protocol is clearly better. (For the remainder
of this chapter, we shall use $\log$ to denote $\log_2$.)

A protocol has the following combinatorial description in terms of the matrix. 
First, it determines who sends the first bit; say, Alice does. This bit is determined by 
the input of Alice; in other words, the protocol partitions the rows of the matrix 
C into two classes, and the first bit of Alice tells Bob which of the two classes 
contains her row. From now on, the game is limited to the submatrix $C_1$ formed
by the rows in this class. Next, the protocol describes a partition of either the rows
or columns of $C_1$ into two classes, depending on who is to send the second bit, and
the second bit itself specifies which of these two classes contains the line (row or
column) of the sender. This limits the game to a submatrix $C_2$, and so it continues.

If the game ends after $k$ bits then the remaining submatrix $C_k$ is the union of an
all-1 submatrix and an all-0 submatrix. We shall call this an almost-homogeneous
matrix. If, for example, Alice knows the answer, then her row in $C_k$ must be all-0
or all-1, and since Bob knows for sure that Alice knows the answer, this must be
true for every row of $C_k$.

We can therefore characterize the communication complexity by the following
combinatorial problem: given a 0-1 matrix $C$, in how many rounds can we partition
it into almost-homogeneous submatrices, if in each round we can split each of the
current submatrices into two (either horizontally or vertically)? We shall denote
this number by $\kappa(C)$.
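
For very small matrices this combinatorial characterization can be evaluated directly. The sketch below is only an illustration under stated assumptions: the names `almost_homogeneous` and `kappa` are hypothetical, a matrix is passed as a tuple of row tuples, and the exhaustive search over all splits is feasible only for matrices with a handful of rows and columns. It uses the observation that a matrix is almost-homogeneous exactly when every row is constant or every column is constant, and that the complexity of a matrix equals that of its transpose.

```python
from functools import lru_cache
from itertools import combinations


def almost_homogeneous(lines):
    """True if every row is constant or every column is constant."""
    rows_constant = all(len(set(row)) == 1 for row in lines)
    cols_constant = all(len(set(col)) == 1 for col in zip(*lines))
    return rows_constant or cols_constant


@lru_cache(maxsize=None)
def kappa(matrix):
    """Number of splitting rounds needed for a 0-1 matrix (tuple of row tuples)."""
    if almost_homogeneous(matrix):
        return 0
    best = None
    # Either the rows are split, or the columns (the rows of the transpose).
    # Parts of the transpose are passed on transposed, which does not change
    # the result because the problem is symmetric in rows and columns.
    for lines in (matrix, tuple(zip(*matrix))):
        indices = range(len(lines))
        for size in range(1, len(lines) // 2 + 1):
            for chosen in combinations(indices, size):
                part = tuple(lines[i] for i in chosen)
                rest = tuple(lines[i] for i in indices if i not in chosen)
                cost = 1 + max(kappa(part), kappa(rest))
                best = cost if best is None else min(best, cost)
    return best
```

On the $4 \times 4$ identity matrix, given as `tuple(tuple(int(i == j) for j in range(4)) for i in range(4))`, the function returns 2, which matches the cost of the trivial protocol for four inputs.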

If rk(C) denotes the rank of $C$, then the following inequality, observed by
Mehlhorn and Schmidt (1982), relates the communication complexity to the rank
of the matrix.


Lemma 1.1. Over any field, $\log \mathrm{rk}(C) \le \kappa(C) \le \mathrm{rk}(C)$.


(The lower bound is tightest if we use the real field, whereas the upper bound 
might be tightened by considering a finite field.) 

In particular, we obtain that if C has full row rank then the trivial protocol is 
optimal. This corollary applies directly to a number of communication problems, 
of which we mention three. Suppose that Alice knows a subset X of an n-element 
set $S$ and Bob knows a subset $Y$ of $S$. The equality problem is to decide if the
two subsets are equal; the disjointness problem is to decide if the two subsets
are disjoint; the binary inner product problem is to decide if the intersection of
the two sets has odd cardinality. In the first two cases, it is trivial to see that
the corresponding $2^n \times 2^n$ matrices have full rank; in the third case, the rank of
the matrix is $2^n - 1$. Hence the trivial protocols are optimal. (It is interesting to
remark that in the third case, the rank over GF(2), which might seem more natural
to use in this problem, gives a very poor bound here: the GF(2) rank of $C$ is
only $n$.)

The lower bound in the lemma is often sharp; on the other hand, no
communication problem is known for which $\kappa(C)$ is anywhere near the bound rk(C).
In particular, it is open whether $\kappa(C)$ can be bounded by any polynomial of
$\log \mathrm{rk}(C)$. Spieker and Raz (1994) constructed an example where $\kappa(C)$ is not linear
in $\log \mathrm{rk}(C)$; in fact, $\kappa(C) > \log \mathrm{rk}(C) \cdot \log\log\log \mathrm{rk}(C)$.

To point out that the situation is not always trivial, consider again the equality 
problem. Recall that if $N$ denotes the number of inputs, then the trivial protocol
takes $\lceil \log N \rceil$ bits, and this is optimal.

In contrast, Freivalds (1979) obtained the very nice result that if Alice and 
Bob tolerate errors occurring with probability 2 ', then the situation changes 
drastically. Consider the following protocol, which can be viewed as an extension of
a “parity check”. Treat the inputs as two natural numbers $x$ and $y$, $0 \le x, y \le N - 1$.
Alice selects a random prime $p \le (\log N)^2$, computes the remainder $x'$ of $x$ modulo
$p$, and then sends the pair $(x', p)$ to Bob. Bob then computes the remainder $y'$ of
$y$ modulo $p$, and compares it with $x'$. If they are distinct, he concludes that $x \ne y$;
if they are the same, he concludes that $x = y$.

If the two numbers x and y are equal then, of course, so are x’ and y’ and so 
the protocol reaches the right conclusion. If x and y are different, then it could 
happen that x’ = y’ and so the protocol reaches the wrong conclusion. This happens 
if and only if $p$ is a divisor of $x - y$. Since $|x - y| < N$, it follows that $x - y$ has
fewer than $\log N$ different prime divisors. On the other hand, Alice had about
$(\log N)^2 / (2 \log\log N)$ primes from which to choose, and so the probability that she
chose one of the divisors of $x - y$ tends to 0. For $N$ sufficiently large, the error
will be smaller than 2 '. Clearly, this protocol uses at most 4loglog N bits. 
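
The randomized protocol just described is short enough to simulate. The following Python sketch is an illustration only: the function names `primes_up_to` and `equality_protocol` are hypothetical, the prime is drawn uniformly from all primes up to $(\log N)^2$, and a single function plays the roles of both players.

```python
import math
import random


def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes p <= limit (assumes limit >= 2)."""
    is_prime = [False, False] + [True] * (limit - 1)
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            for multiple in range(i * i, limit + 1, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def equality_protocol(x, y, n_inputs, rng=random):
    """One run of the protocol; returns Bob's conclusion about 'x == y'."""
    bound = max(2, math.ceil(math.log2(n_inputs)) ** 2)
    p = rng.choice(primes_up_to(bound))   # Alice's random prime p <= (log N)^2
    message = (x % p, p)                  # the pair (x', p) sent from Alice to Bob
    x_remainder, prime = message
    return y % prime == x_remainder       # Bob compares y' with x'
```

Equal inputs are always accepted. Unequal inputs are wrongly accepted only when the chosen prime divides $x - y$; for instance, inputs that differ by 1 are never confused, since no prime divides 1.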

Randomization need not lead to such dramatic improvements for all problems. 
We have seen that the binary inner product problem and the disjointness problem 
behave quite similarly to the equality problem from the point of view of determin- 
istic communication complexity: the corresponding matrices have essentially full 
rank and hence the trivial protocols are optimal. But, unlike for the equality prob- 
lem, randomization is of little help here: Chor and Goldreich (1988) proved that 
the randomized communication complexity of computing the binary inner product
problem is $\Omega(n)$. Improving results of Babai et al. (1986), Kalyanasundaram and
Schnittger (1987) showed that the randomized communication complexity of the
disjointness problem is $\Omega(n)$. The proofs of these facts combine the work of Yao
(1983) on randomized communication complexity with rather involved combina- 
torial considerations. 
Just as in computational complexity theory, non-determinism plays a crucial role
in communication complexity theory. Non-deterministic protocols were introduced 
by Lipton and Sedgewick (1981). Perhaps the best way to view a non-deterministic 
protocol is as a scheme by which a third party, knowing both inputs, can convince 
Alice and Bob what the value of f is. Again, we want to minimize the number 
of bits such a certificate contains for the worst inputs. For example, if in the
disjointness problem the two subsets are not disjoint, announcing a common element
convinces both players that the answer is “no”. This certificate takes only $\lceil \log n \rceil$
bits, which is much smaller than the number of bits Alice and Bob would need to
find the answer, which is n, as we have seen. On the other hand, if the sets are 
disjoint, then no such simple non-deterministic scheme exists. We shall distinguish 
between the non-deterministic protocol for the “0” and for the “1”. In both cases, 
there is always the trivial protocol that announces the input of Alice. 

To give a formal and combinatorial description of non-deterministic protocols, 
consider a non-deterministic protocol for “1”, a particular certificate p, and those 
entries of C (i.e., inputs for Alice and Bob) for which this certificate “works”. If 
the proof scheme is correct, these must be all 1’s; from the fact that Alice and
Bob have to verify the certificate independently, we also see that these 1’s must
form a submatrix. Thus, a non-deterministic protocol corresponds to a covering of
$C$ by all-1 submatrices. Conversely, every such covering gives a non-deterministic
protocol: one can simply use the name of the submatrix containing the given entry
as a certificate. The number of bits needed is the logarithm of the number of
different certificates used: the non-deterministic communication complexity $\kappa_1(C)$ of
a matrix $C$ is the least natural number $t$ such that the 1’s in $C$ can be covered by at
most $2^t$ all-1 submatrices. One can analogously define $\kappa_0(C)$, the non-deterministic
communication complexity of certifying a 0.

Note that the all-1 submatrices in the covering need not be disjoint. Therefore, 
there is no immediate relation between rk(C) and $\kappa_1(C)$. It is easy to formulate the
non-deterministic communication complexity of a matrix as a set-covering problem:
consider the hypergraph whose vertices are the 1’s in $C$ and whose edges are the
all-1 submatrices; $\kappa_1(C)$ is the logarithm of the minimum number of edges covering
all vertices. An immediate lower bound on $\kappa_1(C)$ follows from a simple counting
argument: if $C$ has $a$ 1’s but each all-1 submatrix of $C$ has at most $b$ entries, then
trivially, $\kappa_1(C) \ge \log a - \log b$. We can also consider the natural dual problem: if
$\alpha$ is the maximum number of 1’s in $C$ such that no two occur in the same all-1
submatrix, then $\kappa_1(C) \ge \log \alpha$.
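
The set-covering formulation can be checked by exhaustive search on tiny matrices. The sketch below is a hedged illustration, not an efficient method: `all_one_rectangles` and `kappa_1` are hypothetical names, only the maximal all-1 submatrix for each set of rows is generated (which suffices for covering), and the running time is exponential in the number of rows.

```python
from itertools import combinations


def all_one_rectangles(matrix):
    """Maximal all-1 submatrix for each set of rows, as a set of (row, col) cells."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    rectangles = set()
    for size in range(1, n_rows + 1):
        for rows in combinations(range(n_rows), size):
            cols = [j for j in range(n_cols) if all(matrix[i][j] for i in rows)]
            if cols:
                rectangles.add(frozenset((i, j) for i in rows for j in cols))
    return rectangles


def kappa_1(matrix):
    """Least t such that the 1's are covered by at most 2**t all-1 submatrices."""
    ones = {(i, j) for i, row in enumerate(matrix) for j, v in enumerate(row) if v}
    if not ones:
        return 0
    rectangles = list(all_one_rectangles(matrix))
    for count in range(1, len(rectangles) + 1):
        for chosen in combinations(rectangles, count):
            if set().union(*chosen) == ones:
                return (count - 1).bit_length()   # equals ceil(log2(count))
```

For $n = 2$, encoding subsets as the integers 0 to 3, the disjointness matrix has entries `int(i & j == 0)` and the inner product matrix has entries `bin(i & j).count("1") % 2`; the function can be applied to either.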

Returning to the three problems on two sets mentioned above, we see that
the 1’s in the $2^n \times 2^n$ identity matrix cannot be covered by fewer than $2^n$ all-1
submatrices; hence the non-deterministic communication complexity of equality is
$n$. Similarly, the non-deterministic communication complexity of set disjointness
is $n$, since $\alpha = 2^n$. For the binary inner product problem, it is easy to see by
elementary linear algebra that an all-1 submatrix has at most $2^{n-1}$ entries, and that
exactly $2^{n-1}(2^n - 1)$ entries are 1’s. This gives that at least $2^n - 1$ all-1 submatrices
are needed to cover the 1’s, and so $\kappa_1(C) = n$. For this matrix, a similar argument
also shows that $\kappa_0(C) = n$.
We have derived the following lower bound on the communication complexity
of a matrix:


$$\max\{\log \mathrm{rk}(C),\ \kappa_0(C),\ \kappa_1(C)\} \le \kappa(C).$$


The identity matrix shows that $\kappa_0$ might be a very weak bound: $\kappa(C)$ can be
exponentially large compared with $\kappa_0(C)$. Interchanging the roles of 1 and 0, we
obtain that $\kappa_1(C)$ might also be very far from $\kappa(C)$. No such example is known
for $\log \mathrm{rk}(C)$, but it is likely that the situation is similar.

However, it is a surprising fact that the product of any two of these three lower 
bounds is an upper bound on the communication complexity. The first part of the 
following theorem is due to Aho et al. (1983b), and the other two, to Lovász and
Saks (1993).


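Theorem 1.2. For every matrix $C$,
(a) $\kappa(C) \le (\kappa_0(C) + 1)(\kappa_1(C) + 1)$;
(b) $\kappa(C) \le 1 + \lfloor \log \mathrm{rk}(C) \rfloor (\kappa_0(C) + 2)$;
(c) $\kappa(C) \le 1 + \lfloor \log(\mathrm{rk}(C) + 1) \rfloor (\kappa_1(C) + 2)$.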


We shall sketch a result that is stronger than each of (a), (b) and (c). Let $\rho_1(C)$
denote the size of the largest square submatrix of $C$ such that, after a suitable
reordering of the rows and columns, each diagonal entry is 1 but each entry above
the diagonal is 0. It is clear that $\rho_1(C) \le \mathrm{rk}(C)$ and $\rho_1(C) \le 2^{\kappa_1(C)}$. If we define
$\rho_0(C)$ analogously, then $\rho_0(C) \le \mathrm{rk}(C) + 1$. By using these inequalities, we can
obtain all three parts of Theorem 1.2 from the following result.


Theorem 1.3. $\kappa(C) \le 1 + \lfloor \log \rho_1(C) \rfloor (\kappa_0(C) + 1)$.


Proof. We use induction on $\rho_1(C)$. If $\rho_1(C) = 1$ then trivially $\kappa(C) \le 1$. Assume
that $\rho_1(C) > 1$ and let $k = \kappa_0(C)$; then the 0 elements of $C$ can be covered by all-0
submatrices $C_1,\dots,C_l$ where $l \le 2^k$. Let $A_i$ denote the submatrix formed by those
rows of $C$ that meet $C_i$, and let $B_i$ denote the analogous submatrix formed from
columns of $C$. Observe that


$$\rho_1(A_i) + \rho_1(B_i) \le \rho_1(C), \quad i = 1,\dots,l; \qquad (1.1)$$


this will play a crucial role in the following protocol.

First, Alice looks at her row to see if it intersects any submatrix $C_i$ with $\rho_1(A_i) \le
\rho_1(C)/2$. If so, she sends a “1” and the name $i$ of such a submatrix. They have
thereby reduced the problem to the matrix $A_i$. If not, she sends a “0”.

When Bob receives this “0”, he looks at his column to see if it intersects any
submatrix $C_i$ with $\rho_1(B_i) \le \rho_1(C)/2$. If so, he sends a “1” and the name $i$ of such
a submatrix. They have thereby reduced the problem to the matrix $B_i$. If not, he
sends a “0”.

If both Alice and Bob failed to find an appropriate submatrix, then by (1.1), the
intersection of their lines cannot belong to any $C_i$, and so it must be a “1”. They
have found the answer. □
Theorem 1.2(a) has an interesting interpretation. Define a communication
problem as a class $\mathcal{H}$ of 0-1 matrices; for simplicity, assume that they are square
matrices. The communication complexity of any $N \times N$ matrix is at most $\log N$. We
say that $\mathcal{H}$ is in $P^{comm}$ if it can be solved substantially better: if there exists a
constant $c > 0$ such that $\kappa(C) \le (\log\log N)^c$ for each $N \times N$ matrix $C \in \mathcal{H}$. Similarly,
we say that $\mathcal{H}$ is in $NP^{comm}$ if there exists a constant $c > 0$ such that for each
$C \in \mathcal{H}$, $\kappa_1(C) \le (\log\log N)^c$, and define co-$NP^{comm}$ analogously based on $\kappa_0(C)$.


Just as for the analogous computational complexity classes, we have the trivial 
containment 


$$P^{comm} \subseteq NP^{comm} \cap \text{co-}NP^{comm}.$$


However, for the communication complexity classes we also have the following,
rather interesting facts:


$$P^{comm} \ne NP^{comm}, \qquad NP^{comm} \ne \text{co-}NP^{comm},$$

which follow from the equality problem, but

$$P^{comm} = NP^{comm} \cap \text{co-}NP^{comm}$$


by Theorem 1.2(a). This idea was developed by Babai et al. (1986), who defined and 
studied communication analogues of many other well-known complexity classes 
such as #P, PSPACE and BPP. 

Our previous examples were trivial from the point of view of communication: 
the trivial protocols are optimal. This is quite atypical. Let us define a round in the 
protocol as a maximal period during which one player sends bits to the other. The 
trivial protocol consists of one round. Let $\kappa^k(C)$ denote the minimum number of
bits needed by a communication protocol with at most $k$ rounds. Halstenberg and
Reischuk (1988) proved that for every $k \ge 1$, there exist arbitrarily large matrices
$C$ such that $\kappa^k(C)$ is exponentially larger than $\kappa^{k+1}(C)$.

We consider two combinatorial examples to illustrate that protocols can be sig- 
nificantly more efficient than the trivial ones. Yannakakis (1988) considered the 
following problem: Alice and Bob are given a graph G on n nodes, Alice is given 
a stable set A and Bob is given a clique B; they must decide whether these sets 
intersect. The corresponding matrix $C$ has rank $n$ but size exponential in $n$, and so
the trivial protocol takes $\Omega(n)$ bits. (Recall that we always focus on the worst-case
complexity.) On the other hand, the non-deterministic communication complexity
[...] $n$, since the name of a common node of $A$ and $B$ is a
[...] 2, $\kappa(C) \le (2 + \log n)^2$. (The number of rounds in the pro[...]
known whether the deterministic complexity, or even
[...]lexity of disjointness, is smaller, such as $O(\log n)$. It is
[...] the latter question is equivalent to the following purely
[...] there a constant $c > 0$ so that in each graph $G$ on $n$
[...] such that each pair of disjoint sets $U$, $V$, where $U$ is a
[...] is separated by one of these cuts.
In the subtree disjointness problem, there is a tree $T$ known to both players.
Alice gets a subtree $T_A$ and Bob gets a subtree $T_B$, and their task is to decide
whether $T_A$ and $T_B$ are node-disjoint. It can be shown that the corresponding
matrix $C$ has rank $2n$ but has exponential size. The non-deterministic complexity
of non-disjointness is $\log n$, and hence by Theorem 1.2, $\kappa(C) \le (2 + \log n)^2$. In this
case, one can do better by using the following simple protocol: Alice sends any
node $x$ of her tree to Bob. If $y$ is the node in Bob’s tree that is closest to $x$, Bob
responds by sending $y$ to Alice. Then Alice checks if $y \in T_A$: if so, then clearly
the subtrees are not disjoint; if not, the subtrees are disjoint. This protocol uses
$2 \log n$ bits. Lovász and Saks (1993) showed that it can be modified to use only
$\log n + \log\log n$ bits.
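
The two-message protocol for subtree disjointness can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the tree is assumed to be given as an adjacency dictionary, each subtree as a set of nodes inducing a connected subgraph, and the names `closest_node` and `subtrees_disjoint` are hypothetical.

```python
from collections import deque


def closest_node(tree, start, targets):
    """Breadth-first search from start for the nearest node that lies in targets."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node in targets:
            return node
        for neighbour in tree[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    raise ValueError("targets is not reachable from start")


def subtrees_disjoint(tree, alice_subtree, bob_subtree):
    """Simulate the protocol: Alice sends x, Bob answers with y, Alice decides."""
    x = next(iter(alice_subtree))                # Alice -> Bob, about log n bits
    y = closest_node(tree, x, bob_subtree)       # Bob -> Alice, about log n bits
    return y not in alice_subtree                # Alice checks whether y is in her subtree
```

With the tree `{0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1, 5], 5: [4]}`, the subtrees `{0, 1, 4}` and `{4, 5}` are reported as intersecting, while `{0, 1, 2}` and `{4, 5}` are reported as disjoint, whichever node Alice happens to send.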

An interesting and rather general class of communication problems, for which 
good bounds on the complexity can be obtained by non-trivial combinatorial 
means, was formulated by Hajnal et al. (1988). Let $(\mathcal{L}, \wedge, \vee)$ be a finite lattice.
Assume that Alice is given an element $a \in \mathcal{L}$ and Bob is given an element $b \in \mathcal{L}$,
and their task is to decide whether $a \wedge b = 0$, the minimal element of the lattice.

This problem generalizes both the disjointness problem (where $\mathcal{L}$ is a Boolean
algebra), and the subtree disjointness problem (where $\mathcal{L}$ is the lattice of subtrees of
a tree). A third special case worth mentioning is the following spanning subgraph
problem: Alice is given a graph $G_A$ and Bob is given a graph $G_B$ on the same set
of nodes $V$; they wish to decide whether $G_A \cup G_B$ is connected. This case relies on
the lattice of partitions of $V$, but “upside down” so that the indiscrete partition is
0. Alice can compute the partition $a$ of $V$ into the connected components of $G_A$,
Bob can compute the partition $b$ of $V$ into the connected components of $G_B$, and
then they decide whether $a \wedge b = 0$.

For a given lattice $\mathcal{L}$, let $C$ be the matrix associated with the corresponding
problem: its rows and columns are indexed by the elements of $\mathcal{L}$, and $c_{ij} = 1$ if and
only if $i \wedge j = 0$. To find the rank of this matrix, we give the following factorization
of it, using the Möbius function $\mu$ of $\mathcal{L}$ (see chapter 21 for the definition and some
basic properties). Let $Z = (z_{ij})$ be the zeta-matrix of the lattice, i.e., let $z_{ij} = 1$ if
and only if $i \le j$ ($i, j \in \mathcal{L}$). Let $D = (d_{ij})$ denote the diagonal matrix defined by
$d_{ii} = \mu(0, i)$. Then it is easy to verify the following identity, found by Wilf (1968)
and Lindström (1969) (see chapter 31):


$$C = Z^{T} D Z. \qquad (1.2)$$

Since $Z$ is trivially non-singular, this implies that

$$\mathrm{rk}(C) = \mathrm{rk}(D) = |\{i : \mu(0, i) \ne 0\}|. \qquad (1.3)$$
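
As a sanity check, the factorization can be verified numerically for a small lattice. The sketch below assumes the Boolean algebra of subsets of a 3-element set, with subsets encoded as bitmasks (so the problem is ordinary set disjointness); the variable names are illustrative only, and the Möbius values are computed from the recursion $\mu(0,0) = 1$, $\sum_{k \le i} \mu(0,k) = 0$ for $i \ne 0$.

```python
n = 3
elements = range(1 << n)          # subsets of a 3-element set, encoded as bitmasks


def leq(i, j):
    """Lattice order of the Boolean algebra: i is a subset of j."""
    return i & j == i


# Moebius values mu(0, i); numeric order lists every set after its proper subsets.
mu = {}
for i in elements:
    mu[i] = 1 if i == 0 else -sum(mu[k] for k in elements if k != i and leq(k, i))

# c_ij = 1 iff the meet (intersection) of i and j is the minimal element 0.
C = [[1 if i & j == 0 else 0 for j in elements] for i in elements]
# Zeta-matrix: z_ij = 1 iff i <= j.
Z = [[1 if leq(i, j) else 0 for j in elements] for i in elements]

# (Z^T D Z)_ij = sum over k of z_ki * mu(0, k) * z_kj
ZtDZ = [[sum(Z[k][i] * mu[k] * Z[k][j] for k in elements) for j in elements]
        for i in elements]

identity_holds = ZtDZ == C
rank_from_mu = sum(1 for i in elements if mu[i] != 0)
```

In this lattice every $\mu(0, i)$ equals $\pm 1$, so `rank_from_mu` counts all 8 elements, in agreement with the full rank of the disjointness matrix noted earlier.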


This gives a lower bound on the communication complexity of our problem, but
how good is this bound? One case when this bound is tight occurs when $\mu(0, i) \ne 0$
for all $i$. We obtain a lower bound of $\lceil \log |\mathcal{L}| \rceil$, which is also the upper bound
achieved by the trivial protocol; that is, the trivial protocol is optimal. By Corollary
3.10 of chapter 21, this case occurs when $\mathcal{L}$ is a geometric lattice or a geometric
lattice “upside down”. In particular, we see that for the spanning subgraph problem,
the trivial protocol is optimal.
It turns out that the lower bound given by $\log \mathrm{rk}(C)$ is not too far from the truth
for any lattice.


Theorem 1.4. For every lattice $\mathcal{L}$,
$$\log \mathrm{rk}(C) \le \kappa(C) \le (1 + \log \mathrm{rk}(C))^2.$$


Proof (upper bound). Observe that a non-deterministic certificate of
non-disjointness of two elements $a, b \in \mathcal{L}$ can be provided by exhibiting an atom of the lattice
below both $a$ and $b$. Hence the logarithm of the number of atoms is an upper
bound on $\kappa_0(C)$. Since for every atom $i$, $\mu(0, i) = -1$, it follows from the identity
(1.3) that the number of atoms is at most rk(C). Hence, the upper bound follows
directly from Theorem 1.2. □


2. Circuit complexity 


One promising approach to proving lower bounds on the computational complex- 
ity of a problem focuses on the Boolean circuit model of computation, and recent 
results in this area are possibly the deepest applications of combinatorial methods 
to computer science thus far. The best way to view a circuit is not as an abstract 
electronic device; instead, view it as the bit-operational skeleton of any
computational procedure. This way, it is not hard to see that this model is equivalent to
other models such as the Turing machine or RAM, and that the number of
functional elements, or gates, in a circuit is equivalent to the time taken by an algorithm
in those models. (More precisely, a RAM algorithm, for example, is equivalent to 
a family of Boolean circuits, one for each input length.) As a result, the extremely 
difficult task of proving lower bounds on the computational complexity of a given 
problem can be posed in a way much more suited to combinatorial methods. Many 
people believe that this is the direction of research that may eventually lead to 
the solution of famous problems, such as P vs. NP. Unfortunately, the handful
of results obtained at this point are rather difficult, and yet quite far from this
objective. 

Let us recall some definitions from chapter 29. A Boolean circuit is an acyclic 
directed graph; nodes of indegree 0 are called input gates, and are labelled with 
the input variables $x_1,\dots,x_n$; nodes with outdegree 0 are called output gates; every
node with indegree $r > 0$ is called a functional gate, and is labelled with a Boolean
function in $r$ variables, corresponding to the predecessors of the node. For our
purposes, it suffices to allow only the logical negation, conjunction, and disjunction
as functions. The number of gates in a circuit is called its size; the circuit complexity
of a Boolean function is the size of the smallest circuit that computes it.
The outdegree and indegree of a node are referred to as its fan-out and fan-in, 
respectively. 

Another important parameter of a circuit is its depth, the maximum length of a
path from an input gate to an output gate. A circuit can also be viewed as a parallel
algorithm, and then the depth corresponds to the (parallel) time that the algorithm
takes. Note that every Boolean function can be computed by a Boolean circuit of
depth 2: this is easily derived from its conjunctive normal form. Of course, the
number of gates, which is essentially the number of terms in this normal form, is
typically exponential.

A circuit is monotone if only conjunctions and disjunctions are allowed as func- 
tional gates. Note that every monotone increasing Boolean function can be com- 
puted by a monotone Boolean circuit. 

The predominant approach to proving circuit complexity lower bounds is to
restrict the class of allowed circuits. Two kinds of restrictions have proved sufficiently
strong, and yet reasonably interesting, to allow the derivation of superpolynomial
lower bounds on the number of gates: monotonicity and bounding the depth. Two
main methods seem to emerge: the random restriction method and the approxima- 
tion method. Both methods have applications for both kinds of restricted problems. 

The first superpolynomial lower bound in a restricted model of computation
concerned constant-depth circuits. Note that for this class of circuits to make sense, 
we must allow that the gates have arbitrarily large fan-out and fan-in. Furst et al. 
(1981) and Ajtai (1983) proved independently that every constant-depth circuit 
computing the parity function has superpolynomial size; the parity function maps 
$(x_1,\dots,x_n) \mapsto x_1 + \cdots + x_n$, where here, and throughout this section, addition is the
mod 2 sum. Yao (1985) established a truly exponential lower bound by extending
the techniques of Furst et al. Håstad (1989) has further strengthened the bound
and greatly simplified the proof. All of these proofs are based on probabilistic 
combinatorial arguments; the proof of the following theorem can be found in 
chapter 33. 


Theorem 2.1. If C is a circuit with n input bits and depth d that computes the parity 


function, then C has at least 20/1" ” gates. 


Razborov (1987) gave an exponential lower bound on the size of constant-depth 
circuits that compute another simple Boolean function, the so-called majority func- 
tion, i.e., the function 


$$f(x_1,\dots,x_n) = \begin{cases} 1, & \text{if at least } \lceil n/2 \rceil \text{ of the } x_i \text{ are } 1,\\ 0, & \text{otherwise.} \end{cases}$$


In fact, he proved a stronger result by allowing circuits that may have parity gates, 
in addition to the usual AND, OR and NOT gates, where a parity gate computes the 
parity of the number of 1’s in its input. The proof uses the approximation method, 
which was first used by Razborov (1985a) in his pathbreaking paper on monotone 
circuits. The later application of the method is perhaps the cleanest, and we are 
able to reproduce the full proof. 


Theorem 2.2. If C is a circuit of depth d, with AND, OR, NOT and parity gates 


that computes the majority function of n input bits, then C has at least an /10/n 
gates. 
Proof. Consider a circuit that computes the majority function. We can assume
without loss of generality that the circuit uses only parity and OR gates, since
these can be used to simulate both AND and NOT gates within constant depth. 
The idea of the proof is to introduce “approximations” of the gates used during the 
computation. Using the approximate gates, instead of the real gates, one computes 
an approximation of the majority function. The quality of the approximation will 
be measured in terms of the number of inputs on which the modified circuit differs 
from the original. The main point of the approximation is to keep the computed 
function “simple” in some sense. We will show that every “simple” function, and in
particular the approximation we compute, differs from the majority function on a 
significant fraction of the inputs. Since the approximation of each gate has a limited 
effect on the function computed, we can conclude that many approximations had 
to occur. 

Each Boolean function can be expressed as a polynomial over the two-element 
field GF(2). The measure of simplicity of a Boolean function $f$ for this proof is
the degree of the polynomial representing the function or, for short, the degree of
the function.

In fact, the approximation technique is applied not to the majority function, but 
to a closely related function, the $k$-threshold function $f_k$. This function is 1 when
at least $k$ of the inputs are 1. It is easy to see that if there is a circuit of size $s$ that
computes the majority function of $2n - 1$ elements in depth $d$, then, for each $k$,
$1 \le k \le n$, there is a circuit of depth $d$ and size at most $s$ that computes the $k$-threshold
function on $n$ elements. Therefore, any exponential lower bound for $f_k$ implies a
similar bound for the majority function. We shall consider $k = \lceil (n + h + 1)/2 \rceil$ for
an appropriate $h$.

First we show that any function of low degree has to differ from the k-threshold 
function on a significant fraction of the inputs. 


Lemma 2.3. Let $n/2 < k \le n$. Every polynomial with $n$ variables of degree
$h = 2k - n - 1$ differs from the $k$-threshold function on at least $\binom{n}{k}$ inputs.


Proof. Let g be a polynomial of degree A and let & denote the set of vectors where 
it differs from f,: Let « denote the set of all 0-1 vectors of length n containing 
exactly k 1’s. 

For each Boolean function f, consider the summation function
$\hat{f}(x) = \sum_{y \le x} f(y)$, where the sum is taken over GF(2).
It is trivial to see that the summation function of the monomial $x_{i_1} \cdots x_{i_r}$ is 1 for the
incidence vector of the set $\{i_1, \ldots, i_r\}$ and 0 on all other vectors. Hence it follows
that f has degree at most h if and only if $\hat{f}$ vanishes on all vectors with more than
h 1’s. In contrast to this, the summation function of the k-threshold function $f_k$ is 0 on all
vectors with fewer than k 1’s, but 1 on all vectors with exactly k 1’s.
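
The summation function can be checked by brute force on small examples. The following is a minimal sketch, not part of the original argument; the helper names `summation`, `degree` and `threshold` are illustrative assumptions, and Boolean functions are represented as Python callables on 0-1 tuples.

```python
from itertools import product

def summation(f, n):
    """Table of the summation function: for each 0-1 vector x of length n,
    the GF(2) sum of f(y) over all y <= x (coordinatewise)."""
    table = {}
    for x in product((0, 1), repeat=n):
        total = 0
        for y in product((0, 1), repeat=n):
            if all(yi <= xi for yi, xi in zip(y, x)):
                total ^= f(y)              # addition in GF(2)
        table[x] = total
    return table

def degree(f, n):
    """Degree of the GF(2) polynomial of f: the largest number of 1's in a
    vector on which the summation function is 1 (-1 for the zero function)."""
    return max((sum(x) for x, bit in summation(f, n).items() if bit), default=-1)

def threshold(k):
    """The k-threshold function: 1 when at least k inputs are 1."""
    return lambda x: int(sum(x) >= k)

# The monomial x_1 x_3 on three variables (0-based indices 0 and 2).
monomial = lambda x: x[0] & x[2]
support = [x for x, bit in summation(monomial, 3).items() if bit]
```

Here `support` contains only the incidence vector of the set {1, 3}, and `degree(threshold(2), 3)` recovers the degree of the majority function on three inputs.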

Consider the matrix $M = (m_{ab})$ whose rows are indexed by the members of $\mathcal{A}$,
whose columns are indexed by the members of $\mathcal{B}$, and

$$m_{ab} = \begin{cases} 1, & \text{if } a \ge b, \\ 0, & \text{otherwise.} \end{cases}$$

We want to show that the columns of this matrix generate the whole linear space.
This will imply that $|\mathcal{B}| \ge |\mathcal{A}| = \binom{n}{k}$.

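Let $a_1, a_2 \in \mathcal{A}$ and let $a_1 \wedge a_2$ denote their coordinatewise minimum. Then we
have, by the definition of $\mathcal{B}$,

$$\sum_{\substack{b \le a_1 \\ b \in \mathcal{B}}} m_{a_2 b} = \sum_{\substack{b \le a_1 \wedge a_2 \\ b \in \mathcal{B}}} 1 = \sum_{u \le a_1 \wedge a_2} \bigl( f_k(u) + g(u) \bigr) = \sum_{u \le a_1 \wedge a_2} f_k(u) + \sum_{u \le a_1 \wedge a_2} g(u).$$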


The second term of this last expression is 0, since $a_1 \wedge a_2$ contains at least h + 1 1’s.
The first term is also 0 except if $a_1 = a_2$. The columns of M therefore generate the
unit vector corresponding to the coordinate $a_1$, and so they generate the whole
space.


If $p_1$ and $p_2$ are polynomials representing two functions, then $p_1 + p_2$ is the
polynomial corresponding to the parity of the two functions. The polynomial $p_1 p_2$
corresponds to their AND, which makes it easy to see that $(p_1 + 1)(p_2 + 1) + 1$
corresponds to their OR. Note that the inputs have degree 1, i.e., they are very simple.
Since the degree is not increased by computing the sum, the parity gates do not
have to be approximated. On the other hand, unbounded fan-in OR gates can
greatly increase the degree of the computed functions. We will approximate the
OR gates so that the approximated function will have fairly low degree. The
following lemma will serve as the basis for the approximation.


Lemma 2.4. Let $g_1, \ldots, g_m$ be Boolean functions of degree at most h. If $r \ge 1$ and
$f = \bigvee_{i=1}^{m} g_i$, then there is a function $f'$ of degree at most rh that differs from f on at
most $2^{n-r}$ inputs.


Proof. Randomly select r subsets $I_j \subseteq \{1, \ldots, m\}$ ($1 \le j \le r$), where each i is
independently included in $I_j$ with probability $\tfrac{1}{2}$. Let $f_j$ be the sum of the $g_i$ with $i \in I_j$,
and consider $f' = \bigvee_{j=1}^{r} f_j$. We claim that the probability that $f'$ satisfies the
requirements of the lemma is non-zero. It is clear that the degree of the polynomial for
$f'$ is at most rh. Furthermore, consider an input $\alpha$; we claim that the probability
that $f'(\alpha) \ne f(\alpha)$ is at most $2^{-r}$. To see this, consider two cases. If $g_i(\alpha) = 0$ for
every i, then both $f(\alpha) = 0$ and $f'(\alpha) = 0$. On the other hand, if there exists an
index i for which $g_i(\alpha) = 1$, then $f(\alpha) = 1$ and for each j, $f_j(\alpha) = 0$ independently
with probability at most $\tfrac{1}{2}$. Therefore, $f'(\alpha) = 0$ with probability at most $2^{-r}$, and
the expected number of inputs on which $f' \ne f$ is at most $2^{n-r}$. Hence for at least
one particular choice of the sets $I_j$, the polynomial $f'$ differs from f on at most
$2^{n-r}$ inputs. □
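
The random construction in this proof is easy to experiment with. The sketch below is illustrative only: the function name `approximate_or`, the choice of the input variables as the functions $g_i$, and the parameters n = 6, r = 3 are assumptions made for the example. It builds one random $f'$ and counts the inputs on which it differs from the exact OR; the lemma bounds the expected count by $2^{n-r}$, while a single random choice may do better or worse.

```python
import random
from itertools import product

def approximate_or(gs, r, rng):
    """Build f' = OR over j of f_j, where each f_j is the GF(2) sum of an
    independently chosen random subset of the functions in gs."""
    subsets = [[g for g in gs if rng.random() < 0.5] for _ in range(r)]

    def f_prime(x):
        # f_j(x) is the parity of the chosen g_i(x); f'(x) is the OR of the f_j(x).
        return int(any(sum(g(x) for g in subset) % 2 for subset in subsets))

    return f_prime

n, r = 6, 3
gs = [lambda x, i=i: x[i] for i in range(n)]    # g_i(x) = x_i, each of degree 1
f = lambda x: int(any(g(x) for g in gs))        # the exact OR of the g_i
f_prime = approximate_or(gs, r, random.Random(0))

inputs = list(product((0, 1), repeat=n))
errors = sum(f(x) != f_prime(x) for x in inputs)
```

The error is one-sided, as in the proof: whenever $f'$ outputs 1, some $g_i$ is 1, so f is 1 as well.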


To finish the proof, assume that there is a circuit of size s and depth d to
compute the k-threshold function for inputs of size n. Now apply Lemma 2.4 with
$r = \lfloor n^{1/(2d)} \rfloor$ to approximate the OR gates in this circuit. The functions computed
by the gates at the ith level will be approximated by polynomials of degree at
most $r^i$. Therefore, each resulting approximation $p_k$ of the k-threshold function
will have degree at most $r^d$. Lemma 2.3 implies that for $k = \lceil (n + r^d + 1)/2 \rceil$, the
polynomial $p_k$ differs from the k-threshold function on at least $\binom{n}{k}$ inputs. This
shows that $s\,2^{n-r} \ge \binom{n}{k}$. From this, routine calculations yield that

$$s \ge \binom{n}{k} 2^{r-n} = \Omega\!\left( \frac{2^r}{\sqrt{n}} \right),$$

which establishes the desired exponential lower bound. □


Smolensky (1987) generalized this result to prove that every constant-depth 
circuit that decides whether the sum of the inputs is 0 modulo p using AND, OR 
and modulo-q gates has exponential size, where p and q are powers of different
primes. 

How far can one relax the assumption on bounded depth and still obtain
superpolynomial lower bounds? The methods of Yao, Håstad and Razborov can
be extended to depth near log n/log log n. One cannot hope for much more,
since the parity function can in fact be computed by a linear-size circuit of depth
O(log n/log log n).

Perhaps the deepest result on circuit complexity is contained in the
groundbreaking paper of Razborov (1985a). He gave a superpolynomial lower bound on
the monotone circuit complexity of the clique problem, without any restriction on
the depth. Shortly afterwards, Andreev (1985) used similar techniques to obtain an
exponential lower bound on a less natural NP-complete problem. Alon and
Boppana (1987), by strengthening the combinatorial arguments of Razborov, proved
an exponential lower bound on the monotone circuit complexity of the k-clique
function.


Theorem 2.5. If C is a monotone circuit with $\binom{n}{2}$ input bits that decides whether a
given graph on n nodes contains a clique with at least s nodes, then the number of
gates in C is at least

$$\frac{1}{8} \left( \frac{n}{4 s^{3/2} \log n} \right)^{(\sqrt{s}+1)/2}.$$


The proof uses a much more elaborate application of the approximation
technique. The main combinatorial tool used is the “sunflower theorem” of Erdős and
Rado (see chapter 24).

How different can the monotone and non-monotone circuit complexity of a
monotone function be? Pratt (1975) proved that the monotone circuit complexity
of Boolean matrix multiplication is $\Theta(n^3)$. This, together with the $O(n^{\log_2 7})$
matrix multiplication technique of Strassen (1969), proves that these two notions are
distinct. Razborov (1985b), using techniques similar to those used for the clique
lower bound, showed that the perfect matching problem, which is in P, has
superpolynomial monotone circuit complexity, thereby establishing a superpolynomial
gap. Tardos (1988) showed that this could be increased to an exponential
separation, by combining the arguments of Razborov (1985a), Alon and Boppana (1987)
and results of Grötschel et al. (1981) on the polynomial computability of a graph
function $\vartheta$ that is closely related to the clique function (see chapter 31).

As remarked above, no methods are known to handle general circuits with depth
greater than log n. In the case of monotone circuits with fan-in 2, however, a version
of the random restriction method has been successfully applied by Karchmer and
Wigderson (1990) to prove a lower bound on the depth proportional to $\log^2 n$. It is
clear that a circuit with fan-in 2 and size N must have depth at least log N; hence
Theorem 2.5 implies that every monotone circuit with fan-in 2 computing the
k-clique function must have depth at least $n^{1/3-\varepsilon}$. However, the function that Karchmer and
Wigderson consider is computable by a polynomial-size monotone circuit, and so no
non-trivial bound on the depth is implied by considering only its size.

Karchmer and Wigderson considered the undirected reachability problem: given
a graph G and two nodes s and t, is there an s-t path in G? This problem is
clearly in P, and in fact, it can be decided by a polynomial-size monotone circuit
that has depth $O(\log^2 n)$. Karchmer and Wigderson proved the following result.


Theorem 2.6. There exists a constant c > 0 such that if C is a monotone circuit with 
fan-in 2 that solves the undirected reachability problem for a graph on n nodes, its 
depth is at least $c \log^2 n$.


The proof, which uses a version of the random restriction method, is quite 
involved and is not given here. We describe, however, the starting point, which is 
a new characterization of the depth of circuits with fan-in at most 2 in terms of 
communication complexity, thereby establishing a surprising link with the material 
of the previous section. 

Consider the following game between two players. The game is given by a
Boolean function f in n variables. The first player gets $x \in \{0,1\}^n$ such that f(x) = 0
and the second player gets $y \in \{0,1\}^n$ such that f(y) = 1. The goal of the game
is that the two players should agree on a coordinate i such that $x_i \ne y_i$. Let $\kappa(f)$
denote the minimum number of bits that the two players must communicate to
agree on such a coordinate. (For example, the first player could tell x to the
second player, and then the second player can find an appropriate coordinate to tell
to the first player, so $\kappa(f) \le n + \log_2 n$.) Karchmer and Wigderson proved that the
minimum depth in which a Boolean function f can be computed with a circuit with
fan-in 2 is equal to $\kappa(f)$. A similar characterization can be given for the monotone
circuit complexity of monotone Boolean functions.

In the case of the undirected reachability problem, the corresponding game can
be phrased as follows: the first player is given an [s,t]-path and the second player
is given an [s,t]-cut, and the goal of the game is to find an edge in the intersection
of the path and the cut. Consider the following protocol for this connectivity game:
the first player sends the name of the midpoint on the path, and the second player
responds by telling on which side of the cut this node lies. This protocol requires
O(log n) rounds, and in each round the first player sends log n bits and the second
player sends 1.
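
The midpoint protocol is a binary search along the path. The following sketch simulates both players in one function; the name `find_crossing_edge`, the representation of the path as a list of nodes and of the cut as the set of nodes on the side of s are assumptions made for the illustration.

```python
def find_crossing_edge(path, s_side):
    """Simulate the midpoint protocol for the connectivity game.

    path   -- list of nodes from s to t (known to the first player)
    s_side -- set of nodes on the s side of the cut (known to the second player)

    Returns an edge of the path that crosses the cut, and the number of rounds.
    """
    lo, hi = 0, len(path) - 1      # invariant: path[lo] in s_side, path[hi] not
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        rounds += 1                # player 1 names path[mid]; player 2 answers with 1 bit
        if path[mid] in s_side:
            lo = mid
        else:
            hi = mid
    return (path[lo], path[hi]), rounds

edge, rounds = find_crossing_edge(["s", "a", "b", "c", "t"], {"s", "a"})
```

The invariant holds even if the path crosses the cut several times, so the edge returned always lies in the intersection of the path and the cut.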

Karchmer and Wigderson prove that even if the second player were allowed to
send $O(n^{\varepsilon})$ bits in each round (instead of just 1), the players would still need
$\Omega(\log n)$ rounds. The claimed lower bound on the monotone circuit depth is
a consequence of this fact.


3. Data structures


Imagine a huge science library. It contains a wealth of information, but to make 
this information useful, catalogues, reference and review volumes, indices (and 
librarians) are needed. Similarly, information in the memory of a computer is 
useful only if it is accessible, i.e., it is provided with extra structures that make the
storage, retrieval, and modification of this information possible, and in fact easy. 
This is particularly important when implementing complicated algorithms: the fast 
storage and retrieval of certain partial results of the computation is often the 
main bottleneck in speeding up such algorithms. Such auxiliary structures, called 
data structures, are becoming increasingly important as the amount of information 
stored increases. 

The theory of data structures is very broad and we shall restrict ourselves to
two examples that illustrate the depth of combinatorial ideas used in this field.
For a more thorough treatment, see Aho et al. (1983a), Tarjan (1983) and Gonnet
(1984).


3.1. Shortest paths and Fibonacci heaps 


Let G = (V(G), E(G)) be a graph with n nodes and m edges, and with a specified 
node s. Let every edge e have a non-negative length c(e). We want to find a 
shortest path from s to every other node. We have seen in chapter 2 that using
Dijkstra’s algorithm, this is quite easily done in polynomial ($O(n^2)$) steps. (To be
precise, we use the RAM machine model of computation, and a step means an 
arithmetic operation — addition, multiplication, comparison, storage or retrieval — 
of numbers whose length is at most a constant multiple of the maximum of logn 
and the length of the input parameters.) Let us review this procedure. 

We construct a subtree T of G, one node at a time. We shall only be concerned 
with the nodes of this tree, so we consider T as a set of nodes. Initially, we let
T = {s}. At any given step, for each $v \in T$ we know the length of the shortest
path from s to v, i.e., the distance d(s,v). It would be easy to also obtain the edges
of the tree; then the unique [s,v]-path in this tree realizes this distance. 

The essence of Dijkstra’s algorithm is to find an edge $uv \in E(G)$ with $u \in T$ and
$v \in V(G) \setminus T$ for which d(s,u) + c(uv) is minimal, and then add v to T. As shown
in chapter 2, we then have that d(s,v) = d(s,u) + c(uv). The issue is to find this
edge economically. At first glance, we have to select the smallest member from a
set of size O(m), and we have to repeat this n times, so this rough implementation
of Dijkstra’s algorithm takes O(mn) steps.

We can easily do better, however, by keeping track of some of the partial results
that we have obtained. Let S denote the set of nodes not in T but adjacent to T.
For each node $v \in S$, we maintain the value $\ell(v) = \min\{d(s,u) + c(uv)\}$, where the
minimum is taken over all edges uv with $u \in T$. For $v \notin S \cup T$, we define $\ell(v) = \infty$
for notational convenience. Clearly, $\ell(v)$ is an upper bound on d(s,v). At the
beginning, $\ell(v) = c(sv)$ for each neighbor v of s and $\ell(v) = \infty$ for each other node
v. A step then consists of (a) selecting the node $v \in S$ for which $\ell(v)$ is minimum
and setting $d(s,v) = \ell(v)$, (b) deleting v from S and adding it to T, (c) adding each
neighbor w of v that is in $V(G) \setminus (T \cup S)$ to S, and (d) updating the value $\ell(w)$
for each neighbor w of v that is in S by

$$\ell(w) := \min\{\ell(w),\ d(s,v) + c(vw)\}.$$


This way, it takes O(n) steps to select the node $v \in S$ for which $\ell(v)$ is minimum,
and so the total number of steps spent on selecting these nodes is $O(n^2)$. There is
also the time needed to update the values $\ell(w)$: this is a constant number of steps
per node, and we have at most d(v) nodes to consider, where d(v) is the degree
of v. Updating therefore takes $O(\sum_v d(v)) = O(m)$ steps, which is dominated by
$O(n^2)$; by maintaining the current best path lengths, Dijkstra’s algorithm takes
$O(n^2)$ steps.
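
The quadratic implementation just described can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: nodes are numbered 0, ..., n − 1, the graph is given as adjacency lists of (neighbor, length) pairs, and for brevity the sketch starts with T empty and the label of s set to 0, which leads to the same steps as starting from T = {s}. The names `dijkstra_dense`, `label` and `in_tree` are hypothetical.

```python
import math

def dijkstra_dense(adj, s):
    """adj[v] is a list of (neighbor, length) pairs; nodes are 0..n-1.
    Returns the list of distances d(s, v), with math.inf for unreachable nodes."""
    n = len(adj)
    label = [math.inf] * n       # l(v): best known upper bound on d(s, v)
    in_tree = [False] * n        # membership in T
    dist = [math.inf] * n
    label[s] = 0                 # assumption: start with T empty and l(s) = 0
    for _ in range(n):
        # (a) select the node outside T with minimum label: O(n) steps
        v = min((u for u in range(n) if not in_tree[u]), key=label.__getitem__)
        if label[v] == math.inf:
            break                # the remaining nodes are unreachable from s
        dist[v] = label[v]
        in_tree[v] = True        # (b) move v into T
        for w, c in adj[v]:      # (c), (d) update the labels of neighbors outside T
            if not in_tree[w]:
                label[w] = min(label[w], dist[v] + c)
    return dist

# A small undirected example: edges 0-1 (1), 0-2 (4), 1-2 (2), 2-3 (1).
example = [[(1, 1), (2, 4)], [(0, 1), (2, 2)], [(0, 4), (1, 2), (3, 1)], [(2, 1)]]
distances = dijkstra_dense(example, 0)
```

Each of the n iterations scans all labels once, which accounts for the $O(n^2)$ term; the inner loop over neighbors accounts for the O(m) term.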

Can we do better? It is natural to assume that we have to take at least m steps,
in order to read the data. If m is proportional to $n^2$, the algorithm just described
is best possible. But for most real-life problems, the graph is sparse, i.e., m is
much smaller than $n^2$, and then there is space for improvement. To obtain this
improvement, we shall store the set S, together with the values $\ell(v)$, in a clever
way.

A first idea is to keep the set S sorted in order of the value of $\ell(v)$. This makes
it trivial to select the next node v and delete it from S, but to achieve (c) and (d), 
we insert new items in the sorted list, which can be done in O(log n) steps per
insertion. However, even this is non-trivial. If we simply store the sorted elements 
of S in an array, i.e., in consecutive fields, then to insert a new element, we expect 
to move half of the old elements for each insertion. Another possibility is to store
these elements in a list: each element is stored along with a pointer, which specifies 
the memory location of the next element in the list. In this data structure, insertion 
is trivial, but to find the point of insertion, we must traverse the list, which takes 
a linear number of steps. Advantages of both methods can be combined using a 
data structure called a binary search tree. We shall not discuss these in detail, since 
we will show how to do better with another kind of data structure, called a heap. 
Nonetheless, Dijkstra’s algorithm with the current best path lengths stored in a 
binary search tree takes O(m log n) steps. This may be much better than $O(n^2)$,
but there is still room for improvement.

At this point, it is worth while to formulate the requirements of the desired data 
structure in an axiomatic way. We have some objects, the elements of S, which are 
to be stored. Each object has a value $\ell(v)$ associated with it, which is called its key.
Reviewing the algorithm, we see that we must perform the following operations 
on this collection of data: 

(3.1) DELETEMIN. We might want to find the element of S with smallest key
and delete it from S (steps (a) and (b)). 

(3.2) INSERT. We might want to add new elements to S (step (c)). 

(3.3) DECREASEKEY. We might change the key of an element of S; in fact, 
we only need to decrease it (step (d)). 

Observe that (3.1) and (3.2) are performed O(n) times, since every node is added 
at most once and deleted once, whereas (3.3) is performed O(m) times. 
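
To see how the three operations drive the algorithm, here is a hedged sketch built on Python's standard `heapq` module. Note the substitution: `heapq` offers no DECREASEKEY, so the sketch swaps in a different technique, pushing a fresh entry with the smaller key and skipping stale entries when they are popped. This is not the data structure developed in the text. The function name `dijkstra_heap` and the adjacency-list format are assumptions.

```python
import heapq
import math

def dijkstra_heap(adj, s):
    """adj[v] is a list of (neighbor, length) pairs; nodes are 0..n-1."""
    n = len(adj)
    label = [math.inf] * n
    dist = [math.inf] * n
    done = [False] * n
    label[s] = 0
    heap = [(0, s)]
    while heap:
        key, v = heapq.heappop(heap)        # DELETEMIN
        if done[v]:
            continue                        # stale entry left by a simulated DECREASEKEY
        done[v] = True
        dist[v] = key
        for w, c in adj[v]:
            if not done[w] and key + c < label[w]:
                label[w] = key + c
                # INSERT if w is new, otherwise a simulated DECREASEKEY:
                # push a better entry and let the old one go stale.
                heapq.heappush(heap, (label[w], w))
    return dist

example = [[(1, 1), (2, 4)], [(0, 1), (2, 2)], [(0, 4), (1, 2), (3, 1)], [(2, 1)]]
distances = dijkstra_heap(example, 0)
```

With this workaround the heap may hold up to m entries, so the sketch runs in O(m log n) steps, matching the binary search tree bound rather than improving on it.
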

As mentioned above, a heap is a data structure that can handle these operations
in logarithmic time. Since DECREASEKEY is performed more often than the 
other two operations, we can improve the overall running time by decreasing the 
cost of performing just this operation. Fredman and Tarjan (1987) showed how to 
do this by using a more sophisticated data structure, called a Fibonacci heap. 

A heap is a rooted tree defined on the elements of S, with the property that the 
key of any node is no more than the key of any of its children. In particular, the 
root is an element with the smallest key. If the tree is a single path, then the heap 
is a sorted list, but it will be worth while to consider more compact trees. 

Before deciding about the shape of the heap, let us discuss how to perform the
tasks (3.1)-(3.3). The heap itself can be realized by maintaining a pointer from 
each non-root node to its parent. Moreover, the children of each node are ordered 
in an arbitrary way, and each child contains a pointer to the previous child as well 
as to the next child. Each parent maintains pointers to its first and last child. Each 
node has five pointers, some of which may be undefined: parent, first child, last 
child, previous sibling, next sibling. Changes in the heap are made by manipulating 
these pointers. 

The most common way to implement operations (3.1)-(3.3) in a heap is as
follows. DECREASEKEY is perhaps the easiest. Assume that we decrease the
key of an element $p_0$. Let $p_0 p_1 \cdots p_t$ be the path in the tree connecting $p_0$ to
the root. If $\ell(p_0)$ is still at least $\ell(p_1)$ then we still have a heap; otherwise, we
interchange the elements $p_0$ and $p_1$. Next we compare the key of $p_0$ with the key
of $p_2$; if $\ell(p_0) \ge \ell(p_2)$, we have a heap; otherwise, interchange $p_0$ and $p_2$, and so
on. After at most t interchanges we end up with a heap. Note that the tree has
not changed, only the vertices have been permuted.

INSERT can be reduced to DECREASEKEY: if we want to add a new element 
w then we can assign to it a temporary key $+\infty$, and make it the child of any
preexisting element. Trivially, this produces a heap. We can then decrease the key 
of w to the right value, and reorder the elements as before. 

Finally, DELETEMIN can be performed as follows. Let r be the root element 
of the heap. Select any leaf p of the tree and replace r by p. This interchange 
deletes the root, but we do not necessarily have a heap, since the key of p may 
be larger than the key of one or more of its children. Find the child with smallest 
key, and interchange that child with p. It is easy to see that the resulting tree again 
can violate the heap condition only at the node p. If p has larger key than some 
of its children, then find its child with the smallest key and interchange them; and 
so on. Eventually, we obtain a heap. 
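
The three procedures can be sketched on the simplest tree shape, a binary tree stored in an array, where the parent of position i is position (i − 1) // 2. This is an assumption made to keep the example short: the text works with general rooted trees and explicit pointers, and here INSERT appends the element with its final key rather than a temporary key of $+\infty$. The class name `BinaryHeap` and its methods are hypothetical.

```python
class BinaryHeap:
    """Array-based binary min-heap with a position index, so that
    DECREASEKEY can locate an element in a constant number of steps."""

    def __init__(self):
        self.items = []    # heap order: key of a parent <= key of each child
        self.key = {}      # element -> current key
        self.pos = {}      # element -> index in self.items

    def _swap(self, i, j):
        a, b = self.items[i], self.items[j]
        self.items[i], self.items[j] = b, a
        self.pos[a], self.pos[b] = j, i

    def _sift_up(self, i):
        # Interchange with the parent while the heap condition is violated.
        while i > 0:
            parent = (i - 1) // 2
            if self.key[self.items[parent]] <= self.key[self.items[i]]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        # Interchange with the child of smallest key while the condition is violated.
        n = len(self.items)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.key[self.items[child]] < self.key[self.items[smallest]]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def insert(self, x, k):
        self.key[x] = k
        self.pos[x] = len(self.items)
        self.items.append(x)           # new leaf, then reorder as in DECREASEKEY
        self._sift_up(self.pos[x])

    def decrease_key(self, x, k):
        assert k <= self.key[x]
        self.key[x] = k
        self._sift_up(self.pos[x])

    def delete_min(self):
        root = self.items[0]
        last = self.items.pop()        # a leaf that replaces the root
        del self.pos[root]
        if self.items:
            self.items[0] = last
            self.pos[last] = 0
            self._sift_down(0)
        return root, self.key.pop(root)

heap = BinaryHeap()
for name, k in (("a", 5), ("b", 3), ("c", 8)):
    heap.insert(name, k)
heap.decrease_key("c", 1)
order = [heap.delete_min() for _ in range(3)]
```

In this example `order` lists the elements by increasing key after the decrease, starting with c.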

The height of a node in a rooted tree is the maximum distance to a leaf; the
height of the tree is the height of its root. If the tree has height h then the
operations INSERT and DECREASEKEY take O(h) steps; to make these operations
efficient, we want the tree as short as possible. But DELETEMIN puts limits on
this: it also involves O(h) interchanges, but before each interchange, we must also
find the child with smallest key, and this takes roughly d steps for a node with d
children. If we use balanced k-ary trees in which, with at most one exception, all
internal nodes have k children and each leaf is at distance h or h − 1 from the root,
then $h \le \log_k n$ and so the total number of steps is $O(n \log_k n + m \log_k n + nk \log_k n)$.
The best choice is k = 2 + (m/n), and this shows that Dijkstra’s algorithm with the
current best path lengths stored in a k-ary heap takes O(m log n/log(2 + (m/n)))
steps, which is a slight improvement.
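
To see where this choice of k leads, one can substitute it into the bound. The short calculation below is a sketch that assumes $m \ge n$ (so that n + m = O(m)) and ignores the rounding of k to an integer:

$$(n + m + nk) \log_k n \,\Big|_{k = 2 + m/n} = (3n + 2m)\, \frac{\log n}{\log(2 + m/n)} = O\!\left( \frac{m \log n}{\log(2 + m/n)} \right).$$

The term nk comes from DELETEMIN and the term m from DECREASEKEY; the choice k = 2 + (m/n) makes the two of the same order.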

One can use more sophisticated trees with a more sophisticated implementation 
of the basic operations and with a more sophisticated way to count steps. A rooted 
tree is called a Fibonacci tree if 


    for every node v and natural number k, the number of children     (3.4)
    of v with degree at most k is at most k + 1.


(We use the term degree in the graph-theoretic sense: the degree of a non-root
node is one larger than the number of its children.)

The degree of the tree is the degree of the root. We want to build heaps on 
such trees. For technical reasons, it will be convenient to add an artificial element 
r with key $-\infty$; hence r is always the root. Moreover, for the root we shall impose
the following condition, stronger than (3.4): 


The degrees of the children of r are distinct. (3.5) 


A Fibonacci heap is a heap whose underlying tree is a Fibonacci tree that satisfies
condition (3.5). We are going to show that by using Fibonacci heaps, we can
implement Dijkstra’s algorithm to run in O(m + n log n) steps, which is best possible
for every $m \ge n \log n$. For very sparse graphs, this is, in some sense, an optimal
implementation of Dijkstra’s algorithm, but other algorithms may be better.

Note that the subtree of a Fibonacci tree formed by a node and its descendants 
is also a Fibonacci tree. If we delete a node and its descendants, then the only node 
where condition (3.4) could be violated is the grandparent of the deleted node. In 
particular, if we delete a child of the root and its descendants, we are left with a 
Fibonacci tree. By applying induction to this observation, we obtain the following 
lemma, which explains the name Fibonacci tree. 


Lemma 3.1. Let $F_0 = 0$ and $F_1 = 1$. Then the number of nodes in a Fibonacci tree
of degree k is at least $F_{k+2}$, the (k+2)nd Fibonacci number.


It follows that a Fibonacci tree with n nodes has degree O(log n). More generally,
each node has degree O(log n), which follows by considering the Fibonacci tree
formed by this node and its descendants.
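
The step from Lemma 3.1 to the logarithmic bound uses the exponential growth of the Fibonacci numbers. A brief sketch, where $\varphi = (1 + \sqrt{5})/2$ denotes the golden ratio (a symbol introduced here, not used elsewhere in the text) and the inequality $F_{k+2} \ge \varphi^k$ is the standard one, proved by induction from $\varphi^2 = \varphi + 1$:

$$n \ge F_{k+2} \ge \varphi^{k} \quad \Longrightarrow \quad k \le \log_{\varphi} n = \frac{\log_2 n}{\log_2 \varphi} < 1.45 \log_2 n.$$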

Assume now that we have a Fibonacci heap, and we want to perform INSERT, 
DELETEMIN and DECREASEKEY operations. In each case, we will first produce
a heap that satisfies the Fibonacci property at all non-root nodes; we will
then give a procedure that restores the stronger property (3.5) at the root. 

(a) INSERT. Add the new node x as a child of r. 

(b) DELETEMIN. Of course, we do not want to delete the root, but the minimal 
“real” element. Find the child of the root with the smallest key, delete it, and make 
its children have the root as their new parent. 

(c) DECREASEKEY. Suppose that we want to decrease the key of a node $v_1$.
If $v_1$ is a child of the root, simply decrease the key. Otherwise, let $v_1 v_2 \cdots v_t r$ be the
path connecting $v_1$ to the root. Delete the edge connecting $v_1$ to its parent $v_2$, and
let $v_1$ become a child of the root. The resulting tree satisfies the heap condition
even with the decreased key. Moreover, condition (3.4) is satisfied by all non-root
nodes except possibly by $v_3$, the grandparent of $v_1$. Until condition (3.4) is satisfied
at all non-root nodes, fix the violation at $v_i$ by making $v_{i-1}$ a child of the root.
After at most t steps, we obtain a tree that satisfies (3.4) for all non-root nodes.

(d) For each of the three operations, we finish by restoring property (3.5). To do
this, we look at the degrees of the root’s children, and assume that two children
u and v have the same degree t. Suppose that the key of u is no more than the
key of v; then we delete the edge vr and make v a child of u. We will show that
property (3.4) remains valid at every non-root. This is trivial for all nodes except u.
To see that it holds for u, consider any $s \ge 1$. If s < t then u has the same children
with degree at most s as before, and so their number is at most s + 1. For $s \ge t$, u
now has at most t children altogether. (It will be important that a nearly identical
argument applies even if a child of v were deleted! (*))

If we find two children of the root with the same degree then we can transform
the heap into one where (3.4) is still valid at all non-roots, and the root has lower
degree. We repeat this until all children of the root have different degrees; then
(3.4) is trivially satisfied at the root.

There are two points to clarify. 

• How do we know in (c) how many edges $v_i v_{i+1}$ must be deleted? To check
condition (3.4) for each v; would take too much time. Instead, we will classify 
each node as either safe or unsafe. If a node is safe, then deleting one of its 
children will not violate (3.4) at a non-root node. In contrast, classifying a node 
as unsafe implies only that we are not sure if such a violation would occur. Thus, 
we can always classify the root and its children as safe. In particular, each newly 
inserted node is safe. We shall reclassify a node in only two cases: 

(i) If an unsafe node becomes a child of the root (in a DELETEMIN or 
DECREASEKEY operation), then it is reclassified as safe. 

(ii) If a safe node different from the root loses a child (in a DECREASEKEY 
operation), then it is reclassified as unsafe. 

It follows that in the DECREASEKEY operation, we delete all edges of the path
$v_1 v_2 \cdots v_t r$ up to the first safe node $v_j$, and make $v_1, \ldots, v_{j-1}$ children of the root.
We reclassify $v_j$ as unsafe, and $v_1, \ldots, v_{j-1}$ as safe. Note that in the parenthetical
remark (*), we have already indicated that when r gets a new grandchild v in step 
(d), it is correct to still classify v as safe. 

• In performing (d), how do we find those children of the root with the same
degree? To sort the degrees would be too time-consuming. We wish to perform 
(d) in O(d) steps, where d is the degree of the root after performing steps (a)-(c). 
For each node, we can always store its current degree in an array. We will also 
maintain an array a, where a[k], k = 1,...,d, indicates the name of a child of the
root of degree k, if one exists. This array is not changed during steps (a)—(c), but 
is updated during step (d) instead. It is trivial to update a to reflect the deletion 
of a child of the root. To update a for the new children of the root added during 
steps (a)-(c), we consider them one at a time. To update a to reflect the next 
child u, if u has degree k then we check if a[k] contains the name of a node. If
not, we let a[k] := u. If it already contains the name of a child v, then we make
one of u and v the child of the other, and update a[k] accordingly. This creates
a child of degree k + 1, which must then be checked with a[k + 1], and so forth.
Since each “collision” of two children with the same degree causes the degree of r
to decrease, there are fewer than d collisions overall, and so the new children are
added in O(d) time. In fact, if adding t children causes c collisions, the total work
of step (d) is proportional to t + c.
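
The collision handling of step (d) can be sketched as follows. This is a simplified illustration, not the full Fibonacci heap: the class `Node` and the function `restore_distinct_degrees` are hypothetical names, the table `a` is a dictionary rebuilt from scratch on every call rather than an array maintained between operations, and the safe/unsafe classification is omitted.

```python
class Node:
    def __init__(self, name, key):
        self.name, self.key, self.children = name, key, []

    @property
    def degree(self):
        # Graph-theoretic degree of a non-root node: its children plus the parent edge.
        return len(self.children) + 1

def restore_distinct_degrees(root_children):
    """Merge children of the root until their degrees are distinct.
    Returns the new list of children of the root and the number of collisions."""
    a = {}                  # a[k] = the child of the root that has degree k
    collisions = 0
    for u in root_children:
        while u.degree in a:
            v = a.pop(u.degree)          # collision: u and v have the same degree
            if v.key < u.key:
                u, v = v, u              # keep the smaller key on top (heap condition)
            u.children.append(v)         # v becomes a child of u; the degree of u grows by 1
            collisions += 1
        a[u.degree] = u
    return list(a.values()), collisions

nodes = [Node("n1", 4), Node("n2", 2), Node("n3", 3), Node("n4", 1)]
children, collisions = restore_distinct_degrees(nodes)
```

Starting from four childless nodes, three collisions leave a single child of the root, the node with the smallest key.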

We must still bound the time needed to perform these operations. It is easy to
see from Lemma 3.1 that INSERT and DELETEMIN operations take O(log n)
steps. But a DECREASEKEY operation may take enormous time! For example,
if the (Fibonacci) tree is a single path from the root, with all nodes but the root
and its child unsafe, then it takes about n steps to decrease the key of the bottom
node. Furthermore, if the root has roughly log n children, then adding just a single
child of the root could take roughly log n steps.

The remedy is a book-keeping trick called amortization of costs. Imagine that 
we maintain the Fibonacci heap as a service. The customer may order any of the 
INSERT, DELETEMIN and DECREASEKEY operations. For each operation, 
we ought to charge him the actual cost, say a cent for each step. But suppose 
that we also require that he pay a deposit of one dollar for each unsafe node that 
is created, and a deposit of 25 cents for each child of the root that is created. 
If either the number of unsafe nodes or the number of children of the root de- 
creases, the appropriate deposit is refunded. With this billing system, an INSERT 
or DELETEMIN operation still costs only O(log n) cents, but now we can bound
the cost (to him) of a DECREASEKEY operation. Let t be the number of nodes
to be made children of the root; this is at most one larger than the number of
unsafe nodes becoming safe. Since in step (c), at most one node becomes unsafe,
the customer then gets a net refund of at least 75t − 200 cents. The actual cost of
step (c) is proportional to t, certainly at most 50t. The actual cost of step (d) is
proportional to t + c, certainly at most 25(t + c). However, in step (d), the customer
also gets a refund of 25c cents. The total cost to the customer of steps (c) and (d) 
is at most 2 dollars: with this billing system, a DECREASEKEY operation costs 
only a constant amount. 
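
The arithmetic behind the two-dollar bound can be written out in one line. This merely restates the paragraph above, with t the number of nodes made children of the root, c the number of collisions in step (d), and all amounts in cents. The net refund in step (c) is at least $100(t-1) - 100 - 25t = 75t - 200$: at least t − 1 unsafe nodes become safe, at most one node becomes unsafe, and t new children of the root are created. Hence the cost to the customer is at most

$$\underbrace{50t}_{\text{step (c)}} + \underbrace{25(t + c)}_{\text{step (d)}} - \underbrace{25c}_{\text{refund in (d)}} - \underbrace{(75t - 200)}_{\text{net refund in (c)}} = 200.$$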

Summarizing, we have shown the following theorem. 


Theorem 3.2. In a Fibonacci heap, performing n INSERT, n DELETEMIN and 
m DECREASEKEY operations takes O(nlogn + m) steps. 


For our original problem we get the following result.


Theorem 3.3. Dijkstra’s algorithm can be implemented, using Fibonacci heaps, in
O(n log n + m) steps.


3.2. Shortest spanning trees and the UNION-FIND problem


Many of the data structures discussed in the previous subsection can also be used 
in computing a shortest spanning subtree of a graph. In particular, Fibonacci heaps 
can be used to implement Prim’s algorithm to run on a connected graph G with n
nodes and m edges in O(m + nlogn) time. However, in some cases, we may already 
know the sorted order of the lengths of the edges, or can find this sorted order 
extremely quickly, such as when the lengths are known to be small integers. In these 
cases, we can obtain an even more efficient algorithm by using Kruskal’s algorithm 
implemented with a data structure with surprising combinatorial complications. For 
the following discussion, assume that the sorted order of the edge lengths is known 
in advance. 

Kruskal’s algorithm is very simple: it takes the edges one-by-one in the given 
sorted order, and it adds the next edge to a set T if it does not form a circuit with 
the edges already in T; otherwise, it disposes of the edge. This seems to take
m steps, except that we must check whether the new edge forms a circuit with T.
Let uv be the edge considered. To search T for a [u,v]-path would be too time- 
consuming; it would lead to an O(mn) implementation of Kruskal’s algorithm. 

We can do better by maintaining the partition $\{V_1, \ldots, V_k\}$ of V(G) induced
by the connected components of the graph (V(G), T). Each iteration amounts
to checking whether u and v belong to the same class; if not, we add uv to T.
Furthermore, updating the partition is simple: adding uv to T will cause the two
classes containing u and v to merge.

To implement Kruskal’s algorithm efficiently, we must therefore find a good way 
to store a partition of V(G) so that the following two operations can be performed 
efficiently: 

(3.9) FIND. Given an element u, return the partition class containing u.

(3.10) UNION. Given two partition classes, replace them by their union. 

We assume that each partition class has a name, and for our purposes, it will be 
convenient to use an arbitrary element of the class to name the partition class. We 
shall call this element the boss. In a UNION operation, we can keep either one 
of the old bosses as the new boss. Kruskal’s algorithm uses 2m FIND operations
and n − 1 UNION operations.

A trivial way to implement a UNION-FIND structure is to maintain a pointer
for each element, pointing to the boss of its class. A FIND operation is then trivial;
it takes only one step. On the other hand, to do a UNION operation we may have
to re-direct almost n pointers, which yields an $O(n^2)$ implementation of Kruskal’s
algorithm. This is unsatisfactory for sparse graphs, and so we must do the UNION
operation more economically.

An almost trivial observation already yields a lot: when merging two classes, we 
re-direct the pointers in the smaller class. We call this rule selection by size. To 
estimate the number of steps needed when using this rule, observe that whenever 
a pointer is re-directed, the size of the class containing it gets at least doubled. 
Hence each pointer is redirected at most log n times. The total number of steps
spent on UNION operations is therefore O(n log n), and we get an O(m + n log n)
implementation of Kruskal’s algorithm. For m = Ω(n log n), this is optimal.
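
The pointer-to-boss structure with selection by size can be sketched as follows. This is only an illustration under our own assumptions: the class name BossPartition, the member lists and the redirects counter are not part of the text, and the elements are taken to be 0, ..., n−1.

```python
class BossPartition:
    """Partition of {0, ..., n-1} in which every element points to its boss."""

    def __init__(self, n):
        self.boss = list(range(n))              # boss[u] names the class of u
        self.members = [[u] for u in range(n)]  # members[b]: class named by boss b
        self.redirects = 0                      # pointers re-directed so far

    def find(self, u):
        """FIND (3.9): a single pointer look-up."""
        return self.boss[u]

    def union(self, a, b):
        """UNION (3.10) of the classes with bosses a and b, selection by size."""
        if a == b:
            return a
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                         # a is now the boss of the larger class
        for u in self.members[b]:               # re-direct the smaller class only
            self.boss[u] = a
            self.redirects += 1
        self.members[a].extend(self.members[b])
        self.members[b] = []
        return a


p = BossPartition(8)
for u, v in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
    p.union(p.find(u), p.find(v))
print(p.redirects)  # never more than n log n = 24 for n = 8
```

The counter makes the doubling argument visible: an element is re-directed only when its class at least doubles in size.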

To be able to do better for really sparse graphs (e.g., with a linear number
of edges), we use a more sophisticated way to keep track of the boss. We shall
maintain a rooted tree on each class, with the boss as its root. Each UNION
operation then takes only constant time: we make one boss the child of the other.
But this makes the FIND operation more expensive: we have to walk up in the
tree to the root, so it may take as much time as the height of the tree. This suggests
that we should keep the trees short. We can use an analogue to the selection by
size rule, called selection by height: when merging the two trees, if r_1 is the root
of greater height, then it is made the parent of r_2, the root of the other tree. This
increases the height only if the two trees had the same height.

Of course, it is time-consuming to compute the height of the trees at each
UNION operation, but we can maintain the height h[v] for each node v. This
is easily updated: it changes only if the union of two trees is performed and the
roots have h[r_1] = h[r_2]; then 1 is added at the new root. It is easy to verify by
induction that the number of elements in a tree with root r is at least 2^{h[r]}. Hence
h[r] ≤ log n for every r, and the cost of a FIND operation is O(log n). This does
not yet yield any improvement in the implementation of Kruskal’s algorithm.

But we can use another idea, called path compression. Suppose that we perform
a FIND operation which involves traversing a fairly long path v_1 ⋯ v_k r. Then we
can traverse this path again, and make each v_i the child of the root. This doubles
the number of steps, but the tree becomes shorter.

We combine this idea with a variant of selection by height, called selection by
rank. For each element v, we maintain a number ρ[v] called its rank. The rank
of each node is initially 0; if two trees with roots r_1 and r_2 are merged, where
ρ[r_1] ≥ ρ[r_2], we make r_2 a child of r_1, and update ρ by setting


ρ[r_1] := ρ[r_1] + 1 if ρ[r_1] = ρ[r_2], and leaving ρ[r_1] unchanged otherwise.


A path compression does not change ρ. The number ρ[v] is no longer the height of
v, but it will be an upper bound on the height. Moreover, the number of elements
in a tree with root v is still at least 2^{ρ[v]}. We shall need the following generalization
of this fact, which also follows by induction.


Lemma 3.4. The number of elements v with ρ[v] = t is at most n/2^t.


For each leaf v, ρ[v] = 0, and ρ is strictly increasing along any path to the root. Note
that this guarantees that the height of the tree is at most log n.
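
A minimal sketch of the rooted-forest structure with selection by rank and path compression may help to fix the ideas. The class name RankedForest and the convention that a root is its own parent are our assumptions; the union method expects two roots, as in (3.10).

```python
class RankedForest:
    """Rooted forest on {0, ..., n-1}; a root is stored as its own parent."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n                    # the ranks, initially 0

    def find(self, v):
        root = v
        while self.parent[root] != root:       # first pass: walk up to the root
            root = self.parent[root]
        while self.parent[v] != root:          # second pass: path compression
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, r1, r2):
        """Merge the trees with roots r1 and r2 (selection by rank)."""
        if r1 == r2:
            return r1
        if self.rank[r1] < self.rank[r2]:
            r1, r2 = r2, r1                    # now rank[r1] >= rank[r2]
        self.parent[r2] = r1
        if self.rank[r1] == self.rank[r2]:
            self.rank[r1] += 1                 # the only case in which a rank changes
        return r1


forest = RankedForest(16)
for step in (1, 2, 4, 8):                      # merge equal trees: ranks grow to 4
    for u in range(0, 16, 2 * step):
        forest.union(forest.find(u), forest.find(u + step))
print(forest.find(15), forest.parent[14])      # both are the root after compression
```

In this run the path 15, 14, 12, 8, 0 is compressed by the last FIND, while the ranks stay as they were.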

Tarjan (1975) showed that using selection by rank with path compression reduces 
the cost substantially: the average cost of a FIND operation grows only very, very 
slowly. 

Let us recall Ackermann’s function from chapter 25. First, we define a series of
functions f_i : N → N by the double recurrence


f_1(n) = 2n,    f_i(0) = 2   (i ≥ 2),
f_{i+1}(n) = f_i(f_{i+1}(n − 1)).


Thus, f_2(n) = 2^{n+1}, f_3(n) is roughly a tower of n+1 2’s, and so on. Ackermann’s
function is the diagonal A(n) = f_n(n). This grows faster than any f_i, and in fact,
faster than any primitive recursive function. The inverse Ackermann function α(n)
is defined by


α(n) = min{k : A(k) ≥ n}.


This function grows extremely slowly (e.g., slower than log log n or log* n). We
shall also need to introduce inverse functions φ_i of the f_i, defined by


φ_i(n) = max{k : f_i(k) ≤ n + 2}.


The definition is made so that for fixed n ≥ 3, φ_i(n) is strictly decreasing as a
function of i until it reaches 0. Tarjan’s result, in a slightly weakened form, is as
follows.


Theorem 3.5. In a rooted forest with n elements, using selection by rank and
path compression, m FIND operations and at most n − 1 UNION operations take
O((n + m)α(n)) steps.
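
To get a feeling for how small the factor α(n) is, the functions f_i, φ_i and α can be evaluated for modest arguments. The sketch below is ours: the helper names and the capping trick (values are cut off at a bound so that the towers never have to be written out) are assumptions made only for the illustration, and A(k) is taken for k ≥ 1.

```python
def f_capped(i, n, cap):
    """Return min(f_i(n), cap); the cap keeps the huge values manageable."""
    if i == 1:
        return min(2 * n, cap)                 # f_1(n) = 2n
    value = 2                                  # f_i(0) = 2 for i >= 2
    for _ in range(n):                         # f_i(k) = f_{i-1}(f_i(k - 1))
        if value >= cap:
            return cap
        value = f_capped(i - 1, value, cap)
    return min(value, cap)


def phi(i, n):
    """phi_i(n) = max{k : f_i(k) <= n + 2}."""
    k = 0
    while f_capped(i, k + 1, n + 3) <= n + 2:
        k += 1
    return k


def alpha(n):
    """alpha(n) = min{k >= 1 : A(k) >= n}, where A(k) = f_k(k)."""
    k = 1
    while f_capped(k, k, n) < n:
        k += 1
    return k


print([phi(i, 10) for i in range(1, 5)])       # decreasing in i until it reaches 0
print(alpha(8), alpha(9), alpha(10 ** 100))
```

Since A(3) = f_3(3) = 2^513, every n up to that size has α(n) ≤ 3.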


Proof. A UNION operation takes only a constant number of steps. To analyze
the FIND operation, we first develop a few tools. For each non-root node v, let
v' denote its parent. We define the level of a node v as the least i for which
φ_i(ρ[v]) = φ_i(ρ[v']); a root is on level 0. (Note that if φ_i(ρ[v]) = φ_i(ρ[v']) then
φ_{i+1}(ρ[v]) = φ_{i+1}(ρ[v']).) Initially, each node is therefore on level 0. When a node
becomes a non-root, it then reaches a positive level. Note that from this step on,
ρ[v] does not change. The value ρ[v'] can change in only two ways: v may get a new
parent (from a path compression) or v has the same parent and yet ρ[v'] changes
(from a UNION operation). In either case, ρ[v'] increases. Hence the level of the
node v can only increase with time. On the other hand, the level of a node remains
very small: it is bounded by the least i for which φ_i(n) = φ_i(0), which is easily seen
to be at most 2α(n + 3), which is O(α(n)).

Let v_1 ⋯ v_k r be a path that is being compressed, where r is a root. The cost of
this compression is proportional to k; using the charging analogy again, let us say
at most k dollars. Consider a node v_j (1 ≤ j ≤ k) on the path, and let i be its level.
If φ_{i−1}(ρ[r]) > φ_{i−1}(ρ[v_{j+1}]) (where v_{k+1} = r), then charge one dollar to the
node. Otherwise, charge one dollar to the customer.

How much do we charge to the customer per path compression? If he is charged
for a node v_j at level i, then by the monotonicity of φ_{i−1} and ρ, we see that
φ_{i−1}(ρ[v_{j+1}]) = φ_{i−1}(ρ[v_{j+2}]) = ⋯ = φ_{i−1}(ρ[r]). This implies that v_{j+1}, ..., v_k are
at level i − 1 or lower, i.e., v_j is the last node at level i on the path. So the customer
gets charged for at most one node at each level, which is a total charge of O(α(n)).
For m path compressions, this is a total of O(mα(n)).

How much charge is left on the nodes? While a node v stays on level i, the
value φ_{i−1}(ρ[v']) increases whenever v is charged a dollar, and so the total charge
it accumulates is bounded by the maximum value φ_{i−1}(ρ[v']) can attain. Note that
φ_i(ρ[v']) = φ_i(ρ[v]) by the definition of level i, and so from the definition of φ_i we
have


ρ[v'] + 3 ≤ f_i(φ_i(ρ[v']) + 1) = f_{i−1}(f_i(φ_i(ρ[v']))) = f_{i−1}(f_i(φ_i(ρ[v])))
          ≤ f_{i−1}(ρ[v] + 2),


and hence,

φ_{i−1}(ρ[v']) ≤ ρ[v] + 1.


So the charge accumulated by a node v while at level i is at most ρ[v] + 1, and
since ρ[v] never decreases, we can use the final value in this bound. Adding this
over all nodes, we can use Lemma 3.4 to show that the total contribution of level
i to the charges to the nodes is at most


Σ_v (ρ[v] + 1) ≤ n + Σ_t t · n/2^t ≤ 3n.


Since there are O(α(n)) levels, the total charge to the nodes is O(nα(n)).


Corollary 3.6. Kruskal’s algorithm, with pre-sorted edges, can be implemented in
O(mα(n)) steps.
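
Putting the pieces together, Kruskal’s algorithm on pre-sorted edges might look as follows. This is a sketch under our own conventions: nodes are 0, ..., n−1, each edge is a triple (weight, u, v), and the function name kruskal is ours.

```python
def kruskal(n, sorted_edges):
    """Spanning forest chosen from edges (weight, u, v) given in sorted order."""
    parent = list(range(n))                    # a root is its own parent
    rank = [0] * n

    def find(v):
        root = v
        while parent[root] != root:
            root = parent[root]
        while parent[v] != root:               # path compression
            parent[v], v = root, parent[v]
        return root

    tree = []
    for weight, u, v in sorted_edges:
        ru, rv = find(u), find(v)              # two FIND operations per edge
        if ru == rv:
            continue                           # uv would close a circuit with T
        if rank[ru] < rank[rv]:
            ru, rv = rv, ru
        parent[rv] = ru                        # UNION with selection by rank
        if rank[ru] == rank[rv]:
            rank[ru] += 1
        tree.append((weight, u, v))
    return tree


edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
print(kruskal(4, edges))
```

The edge (3, 0, 2) is rejected because nodes 0 and 2 are already in the same class when it is considered.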


For a discussion of other implementations of Kruskal’s algorithm and its rela- 
tives, see Tarjan (1983). 


4. Cryptography and pseudorandom numbers 


Let f: {0,1}* → {0,1}* be a one-to-one function, and assume that this function
can be evaluated in polynomial time. Can the inverse function be evaluated in 
polynomial time? We do not know the answer, but it is quite likely that the answer 
is in the negative; there are some suspected examples (and some constructions 
that have been disproved). It turns out that such one-way functions, and their 
extensions, could be used in various important applications of complexity theory. 
We examine some of these applications and constructions. 


4.1. Cryptography 


The basic goal of cryptography is to provide a means for private communication in 
the presence of adversaries. Until fairly recently, cryptography has been more of 
an art than a science. No guarantees were given for the security level of the codes, 
and in fact, all the classical codes were eventually broken. Modern cryptography 
has suggested ways to use complexity theory for designing schemes with provable 
levels of security. We will not be able to discuss in detail, or even mention, most of 
the results here. We intend rather to give a flavor of the problems considered and 
the results proved. The interested reader is referred to the surveys by Rivest (1990),
Goldreich (1988), Goldwasser (1989), or to the special issue of SIAM Journal on
Computing (April, 1988) on cryptography.

A traditional solution uses secret-key crypto-systems. Suppose that two parties
wish to communicate a message M, a string of 0’s and 1’s of length n, and they agree
in advance on a secret key K, a string of 0’s and 1’s of the same length as M. They
send the encrypted message C = M ⊕ K instead of the real message M, where ⊕
denotes the componentwise addition over GF(2). This scheme is provably secure,
in the sense that, assuming the key K is kept secret, an adversary can learn nothing
about the message by intercepting it: every string of length n is equally likely to
be the message encoded by the string C. Unfortunately, it is very inconvenient to
use this system since before each transmission the two parties must agree on a new
secret key.
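
The scheme is easily tried out. In the sketch below the messages are byte strings rather than bit strings, and the function name xor_bits and the sample messages are ours; the key comes from Python’s secrets module.

```python
import secrets


def xor_bits(a: bytes, b: bytes) -> bytes:
    """Componentwise addition over GF(2) of two strings of equal length."""
    if len(a) != len(b):
        raise ValueError("message and key must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))        # fresh secret key K
cipher = xor_bits(message, key)                # C = M + K over GF(2)
assert xor_bits(cipher, key) == message        # the receiver adds K again

# Any other message of the same length is consistent with C under some key,
# which is why intercepting C alone reveals nothing about M.
other = b"RETREAT AT TEN"
other_key = xor_bits(cipher, other)
assert xor_bits(other, other_key) == cipher
```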

In the paper that founded the area of modern cryptography, Diffie and Hell- 
man (1976) suggested the idea of public-key crypto-systems. They proposed that 
a transmission might be sufficiently secure if the task of decryption is computa- 
tionally infeasible, rather than impossible in an information theoretic sense. Such 
a weaker assumption might make it possible to securely communicate without 
agreeing on a new secret key every time, and also to achieve a variety of tasks 
previously considered impossible. 

We illustrate the idea of security through intractability by a simple example. 
Assume that you have a bank account that can be accessed electronically by a 
password. If the password is long enough, and you keep it safely, then this is secure 
enough. Except that the programmer of the computer may inspect the memory, 
learn your password, and then access your account. It would seem that there is no 
protection against this: the computer has to remember the password, and a good 
hacker can learn anything stored in the computer. 

But the computer does not have to know your password! When you open your 
account you first generate a Hamiltonian circuit H on, say 10000 nodes, and then 
add edges arbitrarily to obtain a graph G. The computer of the bank will only 
store the graph G while your password is the code of a Hamiltonian circuit H. 
When you punch in your password the computer checks whether it is the code of 
a Hamiltonian circuit in G, and lets you access the account if it is. For you, the 
Hamiltonian circuit functions like a password in the usual sense. 

But what about the programmer? He can learn the graph G; but to access your 
account, he should specify a Hamiltonian circuit of G, which is a computationally 
intractable task. This intractability protects your account!
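
The bank’s side of this scheme is only a verification, which can be sketched as follows. The function names, the representation of G as a set of two-element frozensets, and the small sizes are our assumptions for illustration.

```python
import random


def open_account(n, extra_edges, rng):
    """Pick a random Hamiltonian circuit on n nodes, then add edges arbitrarily."""
    circuit = list(range(n))
    rng.shuffle(circuit)
    edges = {frozenset((circuit[i], circuit[(i + 1) % n])) for i in range(n)}
    target = len(edges) + extra_edges
    while len(edges) < target:
        u, v = rng.sample(range(n), 2)
        edges.add(frozenset((u, v)))
    return edges, circuit                      # the bank stores only the edges


def is_hamiltonian_circuit(n, edges, circuit):
    """The check made by the computer when a password is punched in."""
    if len(circuit) != n or set(circuit) != set(range(n)):
        return False
    return all(frozenset((circuit[i], circuit[(i + 1) % n])) in edges
               for i in range(n))


rng = random.Random(0)
graph, password = open_account(20, 30, rng)
print(is_hamiltonian_circuit(20, graph, password))        # the owner is let in
print(is_hamiltonian_circuit(20, graph, password[:-1]))   # a wrong code is rejected
```

Checking a proposed circuit takes time linear in n; nothing stored by the bank reveals the circuit itself.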

The crucial point here is that given a Hamiltonian circuit, it is easy to construct
a graph containing it, but given a graph, it can be quite hard to find a Hamiltonian
circuit. Unfortunately, it is easy to find a Hamiltonian circuit in most graphs,
and there is no way known to produce graphs for which finding the Hamiltonian
circuit is expected to be difficult.

These considerations motivate the following definition. A one-way function is a
one-to-one function f: {0,1}* → {0,1}* such that f can be computed in polynomial
time, but no polynomial-time algorithm can invert f on even a polynomial fraction
of the inputs. Given a one-way function f, we can choose an x ∈
{0,1}* as our password and store the value f(x) in the computer. The length of
the word chosen is the security parameter of the scheme. This is indeed done 
in practice. (The above scheme with Hamiltonian circuits leads to a somewhat 
weaker notion: a “one-way relation”, but for most applications, we need a one- 
way function.) 

The password scheme above is just a simple example of what can be achieved 
by a public-key crypto-system. This can have any number of participants. The 
participants agree on an encryption function Φ, a decryption function Ψ, and a
security parameter n. Messages to be sent are divided into pieces of length n. The 
system functions as follows: 

• Each participant should randomly choose a public encryption key E and a
corresponding secret decryption key D depending on the security parameter n.
A directory of the public keys is published.

• There must be a deterministic polynomial-time algorithm that, given a message
M of length n and an encryption key E, produces the encrypted message
Φ(E, M).

• Similarly, there must exist a deterministic polynomial-time algorithm that, given
a message M of length n and a decryption key D, produces the decrypted
message Ψ(D, M).

• It is required that for every message M, Ψ(D, Φ(E, M)) = M.

And the crucial security requirement:

• One cannot efficiently compute Ψ(D, M), without knowing the secret key D.
More precisely, for every constant c and sufficiently large n, the probability that
a (randomized) polynomial-time algorithm using the public key E but not the
private key D can decrypt a randomly chosen message of length n is less than
n^{−c}.

When a user named Bob wants to send a message M to another user named Alice,
he looks up Alice’s public key E_A in this directory, and sends her the encrypted
message Φ(E_A, M). By the assumptions, only Alice can decrypt this message using
her private key D_A.

The question is: how do we find such encryption and decryption functions? The
basic ingredient of such a system is a “trapdoor” function. The encryption function
Φ(E, ·) for a fixed participant must be easy to compute but hard to invert, i.e., a
one-way function; but in order for a one-way function to be useful in the above
scheme, we need a further feature: it has to be a trapdoor function, which is a
2-variable function like Φ in the scheme above: Φ: {0,1}^n × {0,1}^n → {0,1}^n such
that for every E ∈ {0,1}^n, Φ(E, ·) is one-to-one and easily computable, but its
inverse is difficult to compute, unless one knows the “key” D belonging to E.

Given the present state of complexity theory, there is little hope to prove that
any particular function is one-way or trapdoor (or even that one-way functions
exist), or that a public-key crypto-system satisfies the last requirement above. Note
that the existence of a one-way function would imply that P ≠ NP. Therefore,
a more realistic hope would be to prove the security of a crypto-system based
on the assumption that P ≠ NP. However, this seems to be quite difficult for
two reasons. Complexity theory is mainly concerned with worst-case analysis. For
the purpose of cryptography, average-case analysis, and a corresponding notion
of completeness (such as the one suggested by Levin 1986) would be more
appropriate. Furthermore, one-way functions lead to languages in NP which have
a unique “witness”, whereas the nondeterministic algorithms for NP-complete
problems have several accepting paths for most instances.

Over the last ten years, several public-key crypto-systems have been suggested 
and analyzed. Some have been proven secure, based on assumptions about the 
computational difficulty of a particular problem (e.g., factoring integers), or on 
the more general assumption that one-way functions exist. 

Rivest et al. (1978) were the first to suggest such a scheme. Their scheme is
based on the assumed difficulty of factoring integers. Each user of this scheme has
to select a number n that is the product of two random primes. Random primes can
be selected using a randomized primality testing algorithm, since every (log n)th
number is expected to be prime. The public key consists of a pair of integers (n, e)
such that e is relatively prime to φ(n), where φ(n) denotes the number of integers
less than n relatively prime to n. A message M is encrypted by C = M^e (mod n).
The private key consists of integers (n, d), where d·e ≡ 1 (mod φ(n)), and
decryption is done by computing C^d ≡ M (mod n). Given the prime factorization
of n, one can find the required private key d in polynomial time, but the task of
finding the appropriate d given only n and e is equivalent to factoring.
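
A toy instance shows the arithmetic. The primes 61 and 53, the exponent e = 17 and the message 65 are arbitrary small values chosen by us and offer no security; the three-argument pow with exponent −1 computes a modular inverse (Python 3.8 or later).

```python
from math import gcd

p, q = 61, 53                  # toy primes; real keys use primes of hundreds of digits
n = p * q
phi_n = (p - 1) * (q - 1)      # number of integers below n that are coprime to n
e = 17
assert gcd(e, phi_n) == 1      # e must be relatively prime to phi(n)
d = pow(e, -1, phi_n)          # private exponent: d * e = 1 (mod phi(n))

M = 65
C = pow(M, e, n)               # encryption: C = M^e (mod n)
assert pow(C, d, n) == M       # decryption: C^d = M (mod n)
print(n, e, d, C)
```

Computing d used the factorization of n through φ(n) = (p − 1)(q − 1), which is exactly the information an eavesdropper lacks.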

Rabin (1979) suggested a variation on this scheme in which any algorithm for 
decryption, rather than any algorithm for finding the decryption key, can be used 
to factor an integer. The idea is to encrypt a message M by C = M^2 (mod n).
(A slight technical problem that has to be overcome before turning this into an
encryption scheme is that squaring modulo a product of two primes is not a one-to-one
function, but rather is four-to-one.) An algorithm that can extract square roots
modulo n can be used to factor: choose a random integer x and let the algorithm
find a square root y of the integer (x^2 mod n); with probability 1/2, the greatest
common divisor of x − y and n is one of the two primes used to create n.
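
The reduction can be simulated on a small modulus. In the sketch below the square-root algorithm is played by an oracle that secretly knows the factors; to keep it short we assume both primes are congruent to 3 modulo 4 (so that a^{(p+1)/4} is a square root of a residue a mod p). All names and the primes 103 and 107 are ours.

```python
import random
from math import gcd


def make_sqrt_oracle(p, q):
    """Square-root extractor modulo n = p*q (assumes p, q = 3 mod 4)."""
    n = p * q

    def sqrt_mod_n(a):
        rp = pow(a, (p + 1) // 4, p)           # a square root of a modulo p
        rq = pow(a, (q + 1) // 4, q)           # a square root of a modulo q
        # Chinese remaindering glues the two roots together.
        return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n

    return sqrt_mod_n


def factor_with_oracle(n, sqrt_mod_n, rng, tries=64):
    """Find a prime factor of n using only n and the square-root oracle."""
    for _ in range(tries):
        x = rng.randrange(1, n)
        if gcd(x, n) > 1:
            return gcd(x, n)                   # lucky guess, already a factor
        y = sqrt_mod_n(x * x % n)
        g = gcd(x - y, n)
        if 1 < g < n:                          # happens with probability 1/2
            return g
    return None


oracle = make_sqrt_oracle(103, 107)
print(factor_with_oracle(103 * 107, oracle, random.Random(3)))
```

The oracle’s answer depends only on x^2, so the hidden x is equally likely to be any of the four square roots; two of them give a non-trivial divisor.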

Unfortunately, the problem of factoring an integer is not known to be NP-hard;
it would be conceptually appealing to suggest schemes that are based on NP-complete
problems. Merkle and Hellman (1978) and since then several others,
have suggested schemes based on the subset sum problem. The public key of such
systems is an integer vector a = (a_1, ..., a_n). A message M, that is a 0-1 vector of
length n, is encrypted by the inner product C = a · M. One problem with this
scheme is that there does not appear to be a private key that can make the task of
decryption easier for the intended receiver. Crypto-systems based on this idea have
built some additional trapdoor information into the structure of the vector a, and so
the decryption problem is based on a restricted variant of the subset sum problem,
which is not known to be NP-hard. Furthermore, randomly chosen subset sum
problems can be easy to solve. In an innovative paper, Shamir (1982) used Lenstra’s
integer programming algorithm to break the Merkle-Hellman scheme; that is, he
gave a polynomial-time decryption algorithm that does not need the secret key.
Since then, several other subset sum-based schemes have been broken by clever
use of the LLL basis reduction algorithm (see chapter 19).
The schemes mentioned so far have encrypted messages of length n, for some
given security parameter n. An apparent problem with this approach is that even 
if we prove that no polynomial-time algorithm can decrypt the messages, it still 
might be possible to deduce some relevant information about the message in poly- 
nomial time. To formalize what it means to exclude this, Goldwasser and Micali 
(1984) suggested the following framework for a randomized encryption procedure 
and its security. Suppose that there is a function B: {0,1}^n → {0,1}^m and a
randomized polynomial-time algorithm that, for any image B(x), produces a value
y such that B(y) = B(x), and y is randomly distributed among all such values; 
an m-bit message can be encrypted by running this algorithm. Intuitively, this
encoding is secure if, for any two messages M_1 and M_2 and some pair of codes for
them, E_1 and E_2 (i.e., B(E_i) = M_i, i = 1, 2), it is “impossible” to gain any
advantage in guessing which message corresponds to each encryption. More precisely,
the randomized encryption based on B is secure if, for each pair (M_1, M_2) and any
randomized polynomial-time algorithm that is used to guess, given (M_1, M_2) and
an unknown random ordering of E_1 and E_2, which value comes from which
argument, the probability that the algorithm answers correctly is 1/2 + O(n^{−c}) for all
but an n^{−c} fraction of the encryptions, for every c > 0. Note that if the encryption
algorithm is a deterministic polynomial-time algorithm, then it is not secure, since
one could simply apply the encryption algorithm to M_1 and M_2 to determine the
correspondence.

Goldwasser and Micali proposed a way to achieve this level of security by en- 
crypting messages bit by bit. How does one encode a single bit? A natural so- 
lution is to append a number of random bits to the one bit of information, and 
encrypt the resulting longer string. While this was later shown to be an effective 
approach, Goldwasser and Micali proposed a different randomized bit-encryption 
scheme based on a function B: {0,1}^n → {0,1} and proved its security, based on
a number-theoretic assumption, namely on the assumed difficulty of the quadratic 
residuosity problem. 

An integer x is a quadratic residue modulo n if x ≡ y^2 (mod n) for some integer
y. The quadratic residuosity problem is to decide for given integers n and x whether
x is a quadratic residue modulo n. If n is a prime then it is easy to recognize if x
is a quadratic residue modulo n, e.g., by computing x^{(n−1)/2} modulo n: this is 0 or
1 if and only if x is a quadratic residue.
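
For an odd prime modulus this test is a one-line modular exponentiation. The function name legendre and the convention of returning −1 for a non-residue are ours; the prime 11 is an arbitrary small example.

```python
def legendre(x, p):
    """x^((p-1)/2) mod p for an odd prime p, reported as 1, -1 or 0."""
    t = pow(x, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


p = 11
squares = {y * y % p for y in range(1, p)}     # the non-zero quadratic residues
for x in range(1, p):
    assert (legendre(x, p) == 1) == (x in squares)
print(sorted(squares), legendre(22, p))        # a multiple of p gives 0
```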

The quadratic residuosity problem for composite moduli is more difficult. One
has to be careful, however, since there is a polynomially checkable necessary
condition that could help in certain cases. To formulate this, define the Legendre symbol
for any prime n by


(x/n) =  1, if n ∤ x and x is a quadratic residue mod n,
(x/n) = −1, if n ∤ x and x is not a quadratic residue mod n,
(x/n) =  0, if n | x.


We have seen above that it is easy to compute the Legendre symbol (where n
is a prime). The Jacobi symbol is a generalization of the Legendre symbol to
composite n, but not in the obvious way: if n = p_1 ⋯ p_k (where the p_i are not
necessarily distinct primes), then


(x/n) = ∏_{i=1}^{k} (x/p_i).


It cannot be seen from this definition, but the Jacobi symbol can be computed in
polynomial time for any n.
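
One standard way to do this, sketched below for odd positive n, avoids factoring n by using quadratic reciprocity together with the rule for the factor 2, much like the Euclidean algorithm. The text does not say which method is meant, so this particular algorithm and the function names are our choice.

```python
def jacobi(x, n):
    """Jacobi symbol (x/n) for odd n > 0, computed without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    x %= n
    result = 1
    while x != 0:
        while x % 2 == 0:                      # (2/n) = -1 iff n = 3 or 5 (mod 8)
            x //= 2
            if n % 8 in (3, 5):
                result = -result
        x, n = n, x                            # quadratic reciprocity
        if x % 4 == 3 and n % 4 == 3:
            result = -result
        x %= n
    return result if n == 1 else 0


def legendre(x, p):
    """Legendre symbol for an odd prime p, used only to check the definition."""
    t = pow(x, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


for x in range(15):                            # n = 15 = 3 * 5
    assert jacobi(x, 15) == legendre(x, 3) * legendre(x, 5)
print(jacobi(2, 15), 2 in {y * y % 15 for y in range(15)})
```

The last line shows that the symbol can equal 1 although x = 2 is not a square modulo 15.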

Now if n and x are coprime integers then (x/n) = 1 is a necessary condition for
x to be a quadratic residue modulo n, but it is not sufficient: if n is a product of
two primes, then exactly half of the residue classes x with (x/n) = 1 are quadratic
residues. The Goldwasser-Micali encryption scheme is based on the assumption
that there is no efficient way to obtain further information on the quadratic
residuosity of x (mod n). It works as follows: a public key consists of an integer n that
is the product of two large primes, and a quadratic non-residue y with (y/n) = 1.
The bit 0 is encrypted by a random quadratic residue r^2 (mod n), the bit 1 is
encrypted by a random quadratic non-residue of the form r^2 y (mod n). The task
of distinguishing encryptions of 0’s from encryptions of 1’s is exactly the quadratic
residuosity problem. Decryption is easy for the intended receiver, who knows the
prime factorization of n.
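
A toy version of the scheme, with primes far too small to be secure, may be sketched as follows. The function names, the way y is found (by search for a non-residue modulo both primes) and the primes 103 and 107 are our assumptions.

```python
import random
from math import gcd


def legendre(x, p):
    t = pow(x, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def keygen(p, q):
    """Public key (n, y) with y a non-residue mod p and mod q, so (y/n) = 1."""
    n = p * q
    y = next(y for y in range(2, n)
             if legendre(y, p) == -1 and legendre(y, q) == -1)
    return (n, y), (p, q)


def encrypt_bit(public, bit, rng):
    n, y = public
    r = rng.randrange(1, n)
    while gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return r * r * (y if bit else 1) % n       # r^2 for bit 0, r^2 * y for bit 1


def decrypt_bit(private, c):
    p, _ = private                             # the receiver knows the factorization
    return 0 if legendre(c, p) == 1 else 1


public, private = keygen(103, 107)
rng = random.Random(1)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
cipher = [encrypt_bit(public, b, rng) for b in bits]
print([decrypt_bit(private, c) for c in cipher] == bits)
```

Each call to encrypt_bit uses fresh randomness, so the same bit is usually encrypted differently every time.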

Yao (1982) has extended this result by proving that a secure randomized
bit-encryption scheme exists if a one-way function exists; in fact, from every
one-way function one can extract a “secure bit”. The following simple construction
was given by Goldreich and Levin (1989). Let f: {0,1}* → {0,1}* be a length-preserving
one-way function, where a function is length-preserving if it maps n-bit
strings into n-bit strings, for all n. Define a Boolean function by B(x) = f^{−1}(x') · x'',
where, if n = |x|, then x' and x'' are the first and last ⌊n/2⌋ bits of x, and “·”
denotes inner product. (If we also want to decode this bit, we take a trapdoor
function for f.)

Diffie and Hellman noticed that under a further assumption, a public-key crypto-system
can also be used to solve the signature problem, where each participant
wants a way to electronically sign its messages so that no one else can forge
them, in the sense that each recipient can verify that the message must have been
signed by the claimed sender. The assumption, which is not very restrictive, is
that Ψ(D, ·) is one-to-one, i.e., Φ(E, Ψ(D, M)) = M for each message M. In this
case, the system can be used for signatures in the following way. When Bob sends
a message M to Alice he can use his private decryption key D_B to append the
“signature” Ψ(D_B, M) after the message M. Given such a signature, Alice can use
Bob’s public key E_B to convince herself that the message came from Bob, exactly
as she received it. The Rivest, Shamir and Adleman scheme has this additional
property, and therefore can be used for signatures as well.


4.2. Pseudo-random numbers 


When random numbers are used in algorithms and crypto-systems, it is essential
that the random bits used are unbiased and independent. The speed or the reliability
of the algorithm, and the security of the crypto-system depend on the quality
of the random numbers used. Natural sources of randomness, such as coins or
noise diodes, are fairly slow in generating random bits. On the other hand, for
both of these applications, truly random bits could be replaced by any sequence of
bits that no polynomial-time algorithm can distinguish from truly random bits. A
pseudo-random number generator takes a seed x, a truly random string of length
n, and expands it to a pseudo-random string y of length n^k for some constant k.
A pseudo-random number generator can be subjected to certain statistical tests. It
passes the next bit test if after seeing the first i bits of its output y for some i < n^k
no polynomial-time algorithm can predict the next bit with probability more than
1/2 + n^{−c} for any constant c.

Most computers have built-in pseudo-random number generators; one of the 
simplest ones is the linear congruential generator (where the seed consists of
integers a, b, m and x_0, and the pseudo-random numbers are generated by the
recurrence x_{i+1} = a x_i + b (mod m)). This is easily shown to fail the next bit test.
Other, more sophisticated pseudo-random number generators can also be shown 
to output inappropriate sequences, by clever use of the LLL basis reduction algo- 
rithm. An example is the binary expansion of algebraic numbers (where the seed 
is the polynomial defining the algebraic number). 
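
To see how predictable a linear congruential generator is, consider the simplified situation in which the modulus m is known and prime while a, b and x_0 are secret; this simplification, the function names and the particular constants are our assumptions. Three consecutive outputs then determine everything that follows.

```python
def lcg(a, b, m, x0, count):
    """First `count` values of x_{i+1} = a*x_i + b (mod m), starting with x_0."""
    xs = [x0]
    for _ in range(count - 1):
        xs.append((a * xs[-1] + b) % m)
    return xs


def predict_next(x0, x1, x2, m):
    """Recover a and b from three outputs (x1 - x0 must be invertible mod m)."""
    a = (x2 - x1) * pow(x1 - x0, -1, m) % m    # since x2 - x1 = a*(x1 - x0) (mod m)
    b = (x1 - a * x0) % m
    return (a * x2 + b) % m


m = 2 ** 31 - 1                                # a prime modulus
xs = lcg(16807, 12345, m, 42, 4)
print(predict_next(xs[0], xs[1], xs[2], m) == xs[3])
```

When m is unknown as well, more outputs and some number-theoretic work are needed, but the conclusion is the same.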

The first provably secure pseudo-random bit generator was developed by Blum 
and Micali (1984). They proved that pseudo-random bit generators exist, based on 
the following paradigm: there exist a polynomial-time computable permutation F 
of the set {0,1}^n and a function B: {0,1}^n → {0,1}, such that B(x) yields a secure
bit (as discussed above), but given F^{−1}(x), B(x) can be computed in polynomial
time. Such an F is necessarily a one-way function; B is called the “hard core” of
F. The Goldreich-Levin construction of a secure bit can be used to show that such
a pair of functions exists if a one-way function exists, by taking F(x) = f(x') · x''
for some length-preserving one-way function f. 

The Blum-Micali pseudo-random number generator produces the sequence
b_0, ..., b_k defined by b_i = B(x_{k−i}), and x_{i+1} = F(x_i), where the random seed is used
to select the functions used and the initial vector x_0. The defined pseudo-random
number generator can be proved to pass the next bit test. (Informally, suppose
that we have an algorithm that can predict b_{i+1} = B(x_{k−i−1}), given b_0, ..., b_i; we
will use this to contradict the fact that B yields a secure bit. Note that given just
x_{k−i−1}, we can compute x_{k−i}, ..., x_{k−1}, and use these values, F^{−1}(x_{k−i}), ..., F^{−1}(x_k),
to compute the bits b_0, ..., b_i in polynomial time; hence we can use the assumed
procedure to predict b_{i+1}, which is impossible.)
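
The shape of the generator can be seen in a toy instance. The concrete choices below are our assumptions and are not taken from the text: F(x) = g^x mod p on {1, ..., p−1} with a primitive root g, and B(x) = 1 exactly when F^{−1}(x) ≤ (p−1)/2. With p = 23 nothing here is secure; the point is only the order in which the bits are produced.

```python
P, G = 23, 5                   # toy prime and a primitive root modulo it


def F(x):
    """A permutation of {1, ..., P-1}: modular exponentiation."""
    return pow(G, x, P)


def blum_micali_bits(x0, k):
    """Bits b_0, ..., b_{k-1}, where b_i = B(x_{k-i}) and x_{i+1} = F(x_i)."""
    xs = [x0]
    for _ in range(k):
        xs.append(F(xs[-1]))
    # B(x_{k-i}) needs F^{-1}(x_{k-i}), which the generator knows: it is x_{k-i-1}.
    return [int(xs[k - i - 1] <= (P - 1) // 2) for i in range(k)]


print(blum_micali_bits(7, 10))
```

The bits are output in the reverse of the order in which the x_i are computed, which is what the informal argument above relies on. The bit b_k is omitted because it would require F^{−1}(x_0).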

One might wonder whether certain pseudo-random number generators pass
statistical tests other than the next bit test. However, Yao (1982) proved that if a
pseudo-random number generator passes the next bit test then it passes any
statistical test, i.e., no randomized polynomial-time algorithm can distinguish the
generated pseudo-random numbers from truly random numbers.


4.3. Zero-knowledge proofs 


Let us return to our example with the bank account access. The programmer of 
the computer of the bank may be tricky and store your password after you have 
used it once. Can you avoid using it at all and only prove to your bank that you
have a password (i.e., know a Hamiltonian circuit in the graph G) without giving
any help to find it (or to mimic you in any other way)?

This question also comes up in some of the above cryptographic applications: it 
might be useful to be able to convince someone that a number is the product of 
two primes, without telling the two primes themselves. This is impossible in the 
classical sense of proofs, but interactive proof systems, discussed in chapter 29, 
make it possible. In fact, this was one of the motivating examples for Goldwasser 
et al. (1989) when developing their notion of interactive proof systems. Informally, 
an interactive proof of a statement is said to be a zero-knowledge proof if the
verifier cannot learn anything from the proof except the validity of the statement. 

Before formalizing the notion of zero-knowledge proofs, let us describe a
solution to the bank problem (due essentially to M. Blum). To make it more
transparent, we imagine another setup: suppose that you are giving a talk on Hamiltonian
graphs, and you show your audience a Hamiltonian graph G. For didactical
purposes, you want to convince them that the graph G has a Hamiltonian circuit
without showing them the circuit itself. This seemingly impossible task can be
accomplished using an overhead projector. You prepare two transparencies: both
show the same set V(G) of nodes, in some random position; the first shows the
edges of the Hamiltonian circuit C in G, the second, the remaining edges of G.
On this second transparency, the labels of the nodes are also shown, but not on
the first! You place both transparencies on the projector and cover them with a
piece of paper, then switch on the projector and let the audience choose whether
the top sheet or the top two sheets should be removed.

If only the top sheet is removed, the audience sees the graph G, politely labelled 
so that the audience can verify that it is indeed the graph G shown at the beginning. 
If both upper sheets are removed, the audience sees a Hamiltonian circuit on 
|V(G)| randomly placed nodes in the plane. In either case, no information is given
on how the Hamiltonian circuit lies in the graph G. (The only information the 
audience gets is that they see what they expect.) 

On the other hand, if you want to cheat and show a graph G that is not
Hamiltonian, then either your bottom transparency does not show a Hamiltonian circuit
with the right number of nodes, or the two transparencies together do not show
the right graph. So there is a chance of 1/2 that you get caught! If you repeat this
100 times (a bit boring for a talk, but easily done on paper), and you do not get
caught then the audience can be reasonably certain that G is Hamiltonian: your
chance of getting away with a non-Hamiltonian graph is one in 2^100.

To make the above protocol precise, we have to get rid of the physical devices 
like projectors and transparencies; but this can be done using the methods of cryp- 
tography discussed above. The basic cryptographic tool needed for this is a secure 
bit-encryption scheme. You must be able to encrypt a bit, so that the audience 
has no chance to figure out on his own what the bit is, but later you can prove 
which bit was encrypted. To convince the audience that G has a Hamiltonian cir- 
cuit, you choose a random permutation P, and use this permutation to obtain a 
random isomorphic copy G’ of the graph G. You encode the permutation, and the 
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n(n — 1)/2 bits representing the adjacency matrix of the graph G’. The audience 
can choose either to ask for a proof that the encoded graph G’ is isomorphic to 
the original, or to ask for a proof that G’ has a Hamiltonian circuit. In the first 
case, you decrypt every encrypted bit, thereby providing the permutation P and 
the graph G’. In the second case, you decrypt only the bits corresponding to edges 
participating in the Hamiltonian circuit C. 
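
The commit-and-reveal structure of one round can be sketched in code. This is a minimal illustration under stated assumptions, not a secure implementation: a salted SHA-256 hash stands in for the secure bit-encryption scheme, the permutation is revealed directly rather than encrypted, and the names `commit`, `opens_to`, `prover_commit` and `run_round` are hypothetical and do not come from the text.

```python
import hashlib
import random
import secrets


def commit(bit):
    """Commit to one bit; returns (digest, opening). Hash commitment is an assumption."""
    nonce = secrets.token_bytes(16)
    digest = hashlib.sha256(nonce + bytes([bit])).hexdigest()
    return digest, (nonce, bit)


def opens_to(digest, opening):
    """Check that an opening (nonce, bit) matches a previously sent digest."""
    nonce, bit = opening
    return hashlib.sha256(nonce + bytes([bit])).hexdigest() == digest


def prover_commit(n, edges, rng):
    """Permute the graph and commit to every adjacency bit of the copy G'."""
    perm = list(range(n))
    rng.shuffle(perm)
    permuted = {frozenset((perm[u], perm[v])) for u, v in edges}
    commitments, openings = {}, {}
    for i in range(n):
        for j in range(i + 1, n):
            pair = frozenset((i, j))
            commitments[pair], openings[pair] = commit(int(pair in permuted))
    return perm, commitments, openings


def run_round(n, edges, cycle, rng):
    """One round: the verifier picks a challenge and checks the prover's answer."""
    perm, commitments, openings = prover_commit(n, edges, rng)
    challenge = rng.choice(["isomorphism", "circuit"])
    if challenge == "isomorphism":
        # Prover reveals perm and every bit; verifier recomputes the copy of G.
        expected = {frozenset((perm[u], perm[v])) for u, v in edges}
        return all(
            opens_to(commitments[p], openings[p])
            and openings[p][1] == int(p in expected)
            for p in commitments
        )
    # Prover reveals only the n bits lying on the permuted circuit.
    image = [perm[v] for v in cycle]
    revealed = [frozenset((image[i], image[(i + 1) % n])) for i in range(n)]
    visits_all = sorted(image) == list(range(n))
    return visits_all and all(
        opens_to(commitments[p], openings[p]) and openings[p][1] == 1
        for p in revealed
    )


if __name__ == "__main__":
    demo_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
    print(run_round(5, demo_edges, [0, 1, 2, 3, 4], random.Random()))
```

An honest prover passes every round, while a prover holding a sequence of nodes that is not a circuit of G fails whenever the "circuit" challenge is drawn; repeating the round drives the cheating probability down as described above.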

There are several ways to formalize the notion of zero-knowledge proofs. The 
one we shall use is computational zero knowledge. We say that an interactive 
protocol is a zero-knowledge protocol if the verifier can generate, in randomized 
polynomial time, a sequence of communication whose distribution is indistinguishable 
in polynomial time from the distribution of the true transcript of the conversation. 

In our example above, the audience could predict that if it chooses to see both 
transparencies together, it will see the given graph with nodes randomly drawn in 
the plane; while if it chooses to see the bottom transparency, then it will see a 
circuit with the right number of (unlabelled) nodes, again randomly drawn in the 
plane. 

In contrast, consider the example of the interactive proof for the graph non- 
isomorphism problem (chapter 29, section 2). At first sight, this seems to be a 
zero-knowledge protocol, since, so long as the verifier does not deviate from 
the protocol, he always knows the prover’s next move, and hence could gener- 
ate the conversation himself. There are zero-knowledge protocols for the graph 
non-isomorphism problem, but this protocol is not, in fact, zero-knowledge, since 
the verifier can use it to test if a third graph is isomorphic to one of the two in the 
input (i.e., the verifier can gain extra information by deviating from the protocol). 
Goldreich et al. (1986) and, subsequently but independently (under a somewhat 
stronger assumption), Brassard et al. (1988) proved the following result. 


Theorem 4.1. If one-way functions exist, then every language in NP has a 
zero-knowledge interactive proof. 


Proof. To prove that all languages in NP have zero-knowledge proofs, one merely 
has to provide a zero-knowledge proof for a single NP-complete problem. We 
have sketched such a protocol for the Hamiltonian circuit problem. 

It is a beautiful feature of mathematics that quite often results and methods from 
one branch can be applied to solve problems in a seemingly distant branch in a 
successful and often surprising way. In fact, techniques from classical fields like 
analysis and linear algebra belong to the toolbox of every mathematician. In virtu- 
ally no field would one be surprised to see a generating function or the computation 
of the eigenvalues of a matrix. This Handbook contains several chapters showing 
how algebra, topology, probability theory etc. can be applied to combinatorial 
problems in a deep way. 

The application of combinatorial methods in other areas is not so common (this 
may be due to the relative youth of the subject). This chapter contains a collection 
of examples showing the application of a variety of combinatorial ideas to other 
areas. 

In some cases, it is a specific combinatorial theorem that is used. For example, 
in section 1, theorems from extremal combinatorics are applied in the theory of 
finite dimensional Banach spaces, to prove embeddability and near-embeddability 
results. In section 3, the Marriage theorem is applied to give a very simple con- 
structive proof for the existence of the Haar measure on compact topological 
groups. 

Elsewhere, it is a combinatorial structure whose presence should be recognized 
to shed light on a seemingly hopelessly complicated situation. Section 4 illustrates 
the use of graphs and coherent configurations in the study of permutation groups. 
Finding the combinatorial structure which retains just enough of the group struc- 
ture is the key point. Section 5 describes a method of analysing a commutative 
ring (like, for example, the coordinate ring of a Grassmann or Schubert variety) by 
writing it as an algebra with straightening law over a finite poset, and then utilizing 
poset combinatorics to identify algebraic structure of the ring and to prove funda- 
mental algebraic properties of some classical projective varieties. Section 6 shows 
how the algebraic and topological structure of a complex hyperplane arrangement 
is determined by the intersection lattice of the arrangement (i.e., by a matroid). 

And in some cases, problems with a very classical flavour turn out to be so 
closely related to combinatorial problems that even the direction of the interac- 
tion is difficult to tell. In section 2 we see how embeddings of finite metric spaces 
in classical Banach spaces are related to “hypermetric inequalities”, to 
multicommodity flows, and to lattice-point-free ellipsoids. In section 7 we sketch how a 
recent topological invariant of knots, the Jones polynomial, can be derived from 
the Tutte polynomial of an appropriate graph, and how this connection can be 
used to settle a long-standing conjecture in the theory of knots. 


1. Sections of finite dimensional Banach spaces 


It is well known that if $N_1$ and $N_2$ are two norms on a finite dimensional real linear 
space $\mathbb{R}^n$ then there exist constants $c_1, c_2 > 0$ such that 

\[ c_1 N_1(x) \le N_2(x) \le c_2 N_1(x) \]

for all vectors $x$. These constants depend on the norms; but if we allow a linear 
transformation, then any given norm $N_1$ can be transformed into one which is more 
similar to another given norm $N_2$. In fact, let $K = K(N_1) = \{x \in \mathbb{R}^n : N_1(x) \le 1\}$ be 
the unit ball of the first norm; this is a convex body centrally symmetric with 
respect to the origin. It can be shown that there exists an ellipsoid $E$ inscribed in 
$K$ with maximum volume and it is unique; $E$ is called the inscribed Löwner-John 
ellipsoid of $K$ (cf. chapters 19 and 30, and also Grötschel et al. 1988). Perhaps 
the most important property of $E$ is that if we blow up $E$ by a factor of $\sqrt{n}$ then 
the resulting ellipsoid will contain $K$ (for this result, the central symmetry of $K$ is 
needed; for general convex bodies, $\sqrt{n}$ has to be replaced by $n$). 

If we apply a linear transformation that maps $E$ onto the unit ball, then $N_1$ will 
be transformed into a norm $N_1'$ satisfying 

\[ \frac{1}{\sqrt{n}} \|x\| \le N_1'(x) \le \|x\| \]

(here $\|x\|$ denotes the Euclidean norm). Applying this argument to the other norm 
as well, we get that $N_1$ can be transformed by a linear transformation into a norm 
$N_1''$ such that 

\[ N_1''(x) \le N_2(x) \le n N_1''(x) \]

for all $x \in \mathbb{R}^n$. 
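
The factor $n$ arises by chaining two Euclidean comparisons. As a sketch, suppose both norms have been brought into the normalized position above by their own linear maps; the symbol $N_2'$ for the normalized second norm is introduced here only for illustration.

```latex
\frac{1}{\sqrt{n}}\,\|x\| \le N_1'(x) \le \|x\|
\quad\text{and}\quad
\frac{1}{\sqrt{n}}\,\|x\| \le N_2'(x) \le \|x\|
\;\Longrightarrow\;
\frac{1}{\sqrt{n}}\,N_1'(x) \le \frac{1}{\sqrt{n}}\,\|x\| \le N_2'(x) \le \|x\| \le \sqrt{n}\,N_1'(x).
```

Rescaling by $1/\sqrt{n}$ turns the outer inequalities into a lower factor 1 and an upper factor $n$, and undoing the linear map applied to the second norm carries the comparison back to $N_2$ itself.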

One can achieve a better approximation if one is allowed to restrict the norms to 
appropriate subspaces; this way one obtains theorems with the flavor of Ramsey’s 
theorem (see chapter 25). A classical result of Dvoretzky (1959) asserts that for 
every positive integer $k$ and every $\varepsilon > 0$ there exists an $n = n(k, \varepsilon) > 0$ such that for 
every norm $N$ on $\mathbb{R}^n$ there exists a constant $C > 0$ and a subspace $V$ of $\mathbb{R}^n$ with 
dimension $k$ such that for every $x \in V$, 

\[ C \|x\| \le N(x) \le (1 + \varepsilon) C \|x\|. \]


The following result of Figiel et al. (1977) illustrates how to obtain finer measures 
of the approximation of an arbitrary norm by the Euclidean norm on a subspace. 
Let $M_2 = M_2(N)$ denote the square root of the average of $N(x)^2$ over all vectors 
with unit Euclidean length. Then for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for 
every norm $N$ on $\mathbb{R}^n$ such that $N(x) \le \|x\|$ there exists a subspace $V$ with dimension 
at least $\delta n M_2^2$ such that $N(x) \ge (1 - \varepsilon)\|x\|$ for every $x \in V$. 

The proof of such results on the $\ell_2$ norm is geometric. Our main topic will be 
two results with a similar flavour concerning the $\ell_1$-norm $\|x\|_1 = \sum_i |x_i|$ and the 
$\ell_\infty$-norm $\|x\|_\infty = \max_i |x_i|$. Both proofs use a number of combinatorial tools. 

We introduce some quantities analogous to the quantity $M_2(N)$ defined above. 
Let $N$ be an arbitrary norm on $\mathbb{R}^n$. Let $M_1 = M_1(N)$ denote the average of $N(x)$ 
over all $\pm 1$-vectors, and $M_\infty = M_\infty(N)$, the maximum of $N(x)$ over all $\pm 1$-vectors. 
If we have $N(e_i) = 1$ for all $i$ then $1 \le M_1(N) \le n$, with equality in the first 
inequality iff $N = \|\cdot\|_\infty$ and equality in the second inequality iff $N = \|\cdot\|_1$. 
Note that $N(x)$ is a convex function and hence we could take the maximum over 
the whole cube spanned by $\{-1,1\}^n$ (the unit ball of $\|\cdot\|_\infty$) instead. So we have 

\[ N(x) \le M_\infty \|x\|_\infty \]

for all $x$. 

We can also formulate an opposite inequality. Assume that $N(e_i) = 1$ for 
$i = 1, \dots, n$. Let $b_i^T x = 1$ be the equation of a supporting hyperplane to $K$ at the 
point $e_i$ (so $b_i$ is in the polar body). Set $b_i = (b_{i1}, \dots, b_{in})^T$; then $b_{ii} = 1$. Note that 
$M_\infty \ge \max_i \sum_j |b_{ij}|$. Let 

\[ M_0 = \min_i \min_{\substack{\|x\|_\infty = 1 \\ x_i = 1}} b_i^T x = \min_i \Bigl\{ 1 - \sum_{j \ne i} |b_{ij}| \Bigr\} \]

(this may be negative, which will be a trivial case). For any vector $x \in \mathbb{R}^n$, we have 

\[ N(x) = \max\{ b^T x : b \in K^* \}. \]

Let $i$ be the index with $|x_i|$ maximum, and assume that, say, $x_i > 0$. Then 

\[ N(x) \ge b_i^T x = \sum_j x_j b_{ij} \ge x_i - \sum_{j \ne i} |x_j| \, |b_{ij}| \ge x_i \Bigl( 1 - \sum_{j \ne i} |b_{ij}| \Bigr) \ge M_0 \|x\|_\infty. \]

Our first topic is a theorem of Milman (1982). Let $N$ be a norm in $\mathbb{R}^n$ whose unit 
ball is a polyhedron (this is so far not a very severe restriction since any norm can 
be approximated arbitrarily well by a polyhedral norm). Then $N$ can be written as 
$N(x) = \|Ax\|_\infty$ with an appropriate matrix $A: \mathbb{R}^n \to \mathbb{R}^m$. Now assume that every 
entry of $A$ is $1$ or $-1$. Then $A$ has at most $2^n$ distinct rows. Observe that if the 
number of rows is exactly $2^n$ then $N(x) = \|x\|_1$. 


Theorem 1.1. If $N(x) = \|Ax\|_\infty$ where $A$ is a $\pm 1$ matrix with $m$ (distinct) rows and 
$n$ columns, then there exists a linear subspace $V \subseteq \mathbb{R}^n$ with $\dim V \ge \lfloor \ln m / \ln n \rfloor$ 
such that the restriction of $N$ to $V$ is isometric to an $\ell_1$ norm. 


Proof. Let $k = \lfloor \ln m / \ln n \rfloor$; we may assume that $k \ge 1$. Then 
$m \ge n^k > 1 + n + \binom{n}{2} + \dots + \binom{n}{k-1}$, and hence by the Sauer-Shelah theorem 
(chapter 24, Theorem 4.8), we 
can choose $k$ columns of $A$ (say, the first $k$ columns) such that every $\pm 1$-vector 
arises as the restriction of some row of $A$ to these columns. Let $V$ be the subspace 
of $\mathbb{R}^n$ consisting of those vectors whose last $n - k$ coordinates are 0. Then for every 
$v \in V$, 

\[ N(v) = \|Av\|_\infty = \max_i \Bigl| \sum_{j=1}^k a_{ij} v_j \Bigr| = \sum_{j=1}^k |v_j| = \|v\|_1, \]

since the maximum is attained for the row of $A$ for which $a_{ij} = \operatorname{sign} v_j$. □ 
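
The mechanism of this proof can be checked by brute force on a small example. The sketch below is only an illustration: the helper names `shattered_columns` and `norm` are hypothetical, the search is exhaustive (so suitable for tiny matrices only), and the example matrix, whose rows are the 26 sign vectors of length 5 with at most three entries equal to −1, is chosen here and does not come from the text.

```python
import math
from itertools import combinations, product


def shattered_columns(rows, k):
    """Return k column indices on which the rows show all 2**k sign patterns, or None."""
    n = len(rows[0])
    for cols in combinations(range(n), k):
        patterns = {tuple(r[c] for c in cols) for r in rows}
        if len(patterns) == 2 ** k:
            return cols
    return None


def norm(rows, x):
    """N(x) = ||Ax||_inf for the +-1 matrix A with the given rows."""
    return max(abs(sum(a * xi for a, xi in zip(r, x))) for r in rows)


# Example matrix (an assumption for illustration): n = 5 columns, m = 26 rows.
rows = [r for r in product((-1, 1), repeat=5) if r.count(-1) <= 3]
k = math.floor(math.log(len(rows)) / math.log(5))
cols = shattered_columns(rows, k)
```

On the coordinate subspace spanned by the columns in `cols`, `norm` agrees with the $\ell_1$ norm, because for every sign pattern on those columns some row of the matrix realizes it.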


A corollary of this theorem uses the quantity $M_1$ to bound the dimension of $V$: 

Corollary 1.2. If $N(x) = \|Ax\|_\infty$, where $A$ is a $\pm 1$-matrix, then there exists a linear 
subspace $V \subseteq \mathbb{R}^n$ with $\dim V \ge M_1^2 / (2n \ln n)$ such that the restriction of $N$ to $V$ is 
isometric to an $\ell_1$ norm. 


Proof. Assume that $A$ has $m$ distinct rows; trivially, $m \ge n$. We show that 
$M_1 \le \sqrt{2n \ln m}$; this then implies that the subspace $V$ in the assertion of Theorem 1.1 
satisfies $\dim V \ge M_1^2 / (2n \ln n)$. 

We want to bound, for a random $\pm 1$ vector $w$, the expectation of $\|Aw\|_\infty$. Let 
$a_1, \dots, a_m$ be the rows of $A$. Then $a_i w$ is the sum of $n$ independent random variables 
assuming $1$ and $-1$ with equal probability, and hence by Chernoff’s inequality (see 
chapter 33), 


Prob (Iam > Vain <ertinm + 


m 


Hence 


’ 


m 


Prob (max la;w| > vain <m-e 2iem a 


and so the expectation of ||Aw/||.. is at most (1 —1/m)/nInm+n/m < V/2nInm. 
(This argument is essentially the same as the probabilistic upper bound on the 
discrepancy of a hypergraph with n vertices and m edges; cf. chapter 26.) 


The next result whose proof uses combinatorial tools is due to Alon and Milman 
(1983). 


Theorem 1.3. Let $N$ be any norm on $\mathbb{R}^n$ such that $N(e_i) = 1$ for all $i$. 

(i) There exists a subspace $V$ spanned by $\lfloor \sqrt{n} \rfloor$ basis vectors $e_i$ such that 
$M_\infty(N|_V) \le 8 M_1(N)$; in other words we have, for all $x \in V$, 

\[ N(x) \le 8 M_1 \|x\|_\infty. \]

(ii) For every $\varepsilon > 0$ there exists a subspace $V$ spanned by $\lfloor \varepsilon n / (8 M_\infty) \rfloor$ basis 
vectors $e_i$ such that $M_0(N|_V) \ge 1 - \varepsilon$, and hence we have for all $x \in V$, 

\[ (1 - \varepsilon) \|x\|_\infty \le N(x). \]

Combining (i) with (ii) we obtain that there exists a subspace $V$ spanned by 
$\lfloor \sqrt{n} / (128 M_1) \rfloor$ basis vectors $e_i$ such that for all $x \in V$, 

\[ \tfrac{1}{2} \|x\|_\infty \le N(x) \le 8 M_1 \|x\|_\infty. \]


Proof. (i) This part of the proof uses a combinatorial lemma which is analogous 
to the Sauer-Shelah Theorem. Let $h(n,k)$ denote the smallest integer for which 
the following holds: whenever $R \subseteq \{-1,1\}^n$ with $|R| > h(n,k)$, then there exists a 
subset $I \subseteq \{1, \dots, n\}$ with $|I| = k$ such that for every $q \in \{-1,1\}^I$ there exist two 
vectors $u, w \in R$ such that $u_i = w_i = q_i$ if $i \in I$ and $u_i = -w_i$ if $i \notin I$. 
(itk--1 )/2 + 

S; Os ifn +k is odd, 
Ps i=0 = 

h(n, k) = (a+k-~2)/2 


n—} n ; ; 
cane S: ("). ifn+k is even. 


i=0 


For a proof of this lemma, which uses the “down-shift” technique (cf. chapter 
24), we refer to the original paper. 

Now let $R$ denote the set of vectors $w \in \{-1,1\}^n$ such that $N(w) \le 8M_1$. Let $k$ 
be the least integer such that $k \ge \sqrt{n}$ and $k \equiv n - 1 \pmod 2$. Then 

\[ M_1 = 2^{-n} \sum_{w \in \{-1,1\}^n} N(w) \ge 2^{-n} (2^n - |R|) \, 8M_1, \]

whence (if $n$ is large enough) 

\[ |R| \ge \tfrac{7}{8} 2^n > \sum_{i=0}^{(n+k-1)/2} \binom{n}{i}. \]

Hence by Lemma 1.4, there exists a subset $I \subseteq \{1, \dots, n\}$ such that $|I| = k$ and for 
every $q \in \{-1,1\}^I$ there exist two vectors $u, w \in R$ such that $u_i = w_i = q_i$ if $i \in I$ 
and $u_i = -w_i$ if $i \notin I$. We claim that the subspace $V$ spanned by $\{e_i : i \in I\}$ satisfies 
the requirement in (i). 

By our discussion of $M_\infty$, it suffices to show that for every $\pm 1$-vector $v$ spanned 
by the $e_i$ ($i \in I$) we have $N(v) \le 8M_1$. To this end, let $q$ be the restriction of $v$ to the 
coordinates in $I$ and consider the vectors $u, w \in R$ as above. Then $v = \tfrac{1}{2}(u + w)$ 
and hence 

\[ N(v) \le \tfrac{1}{2} \bigl( N(u) + N(w) \bigr) \le 8M_1. \]

(ii) The proof of this second part also uses a lemma with a combinatorial flavor 
(Johnson and Schechtman 1981): 


Lemma 1.5. Let $A$ be an $n \times n$ matrix with non-negative entries and with 0’s in the 
diagonal. Assume that each row-sum of $A$ is at most $D$. Then for every $k \ge 1$, $A$ 
has a $k \times k$ principal submatrix $A'$ such that each row-sum of $A'$ is at most $8kD/n$. 


This lemma can be cast in graph-theoretic terms: if $G$ is a directed graph with 
maximum outdegree $D$ and $k \ge 1$, then $G$ contains an induced subgraph on $k$ nodes 
with maximum outdegree at most $8kD/n$. For undirected graphs, this follows from 
a theorem of Lovász asserting that if $G$ is an undirected graph with maximum 
degree $D$ and $t \ge 1$ then the nodes of $G$ can be partitioned into $[(D - 1)/t]$ classes 
such that each class induces a subgraph with maximum degree at most $t$ (cf. chapter 
4, Theorem 4.2). The directed version follows by deleting the nodes with indegree 
larger than $2D$ (this means at most half of the nodes), and then applying the 
undirected version to the remaining graph, disregarding the orientation. 
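Return to the proof of Theorem 1.3. Let $k = \lfloor \varepsilon n / (8 M_\infty) \rfloor$. With the notation 
introduced at the beginning of the section, consider the matrix $P = (p_{ij})$ defined 
by 

\[ p_{ij} = \begin{cases} 0, & \text{if } i = j, \\ |b_{ij}|, & \text{if } i \ne j. \end{cases} \]

Then every row-sum of $P$ is bounded by $M_\infty$: 

\[ \sum_j p_{ij} \le \sum_j |b_{ij}| \le M_\infty. \]

Hence by Lemma 1.5, $P$ has a $k \times k$ principal submatrix in which every row-sum 
is at most $8kM_\infty/n \le \varepsilon$. Let, say, the upper left $k \times k$ submatrix of $P$ have this property, and 
let $V$ be the subspace spanned by $e_1, \dots, e_k$. Then 

\[ M_0(N|_V) = \min_{i \le k} \Bigl( 1 - \sum_{j \le k,\ j \ne i} |b_{ij}| \Bigr) \ge 1 - \varepsilon, \]

and hence for every $x \in V$, 

\[ N(x) \ge (1 - \varepsilon) \|x\|_\infty. \qquad \Box \]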

We remark that by a rather standard “partitioning” argument one can improve the 
upper bound in (i) at the cost of decreasing the dimension of the subspace, and 
prove the following: 


Corollary 1.6. For every norm N on R" and every & > 0 there exists a linear em- 
bedding A: R* — R" where k = n®!4™)) such that for every x € RR, 


\[ \|x\|_\infty \le N(Ax) \le (1 + \varepsilon) \|x\|_\infty. \]


2. Embeddings of finite metric spaces and hypermetric inequalities 

Let us start by recalling the following classical result of Cayley: 


Theorem 2.1. A symmetric matrix $D = (d_{ij})_{i,j=1}^n$ is the matrix of mutual distances 
of $n$ (not necessarily distinct) points in $\mathbb{R}^n$ (or in the Hilbert space) if and only if 
$D \ge 0$, $d_{ii} = 0$ for all $i$, and the matrix $(a_{ij})_{i,j=1}^{n-1}$ is positive semidefinite, where 

\[ a_{ij} = d_{in}^2 + d_{jn}^2 - d_{ij}^2. \]


The proof of this theorem is rather straightforward linear algebra. Another way 
to state this condition is that the matrix $D^{(2)}$ obtained by squaring each entry of 
$D$ is of negative type, which means that for every vector $x \in \mathbb{R}^n$ with $\sum_i x_i = 0$, we 
have 

\[ \sum_{i,j} d_{ij}^2 x_i x_j \le 0. \]


We may ask analogous questions about embeddability in other important Banach 
spaces, such as the $L_1$ space. The answer to such questions often leads to 
combinatorial considerations which tie these issues to polyhedral combinatorics, 
lattice geometry, and flow theory. 

So let $D$ be an $n \times n$ matrix and assume that $D$ is the matrix of mutual distances 
of $n$ points in $L_1$. There are some obvious conditions that $D$ has to satisfy: 

\[ d_{ij} = d_{ji}, \tag{2.1} \]
\[ d_{ij} \ge 0, \tag{2.2} \]
\[ d_{ii} = 0, \tag{2.3} \]

and of course the triangle inequality: 

\[ d_{ij} + d_{jk} \ge d_{ik}. \tag{2.4} \]


A matrix satisfying these conditions is called a metric. (To be precise, it should 
be called a semimetric, since in the definition of metric it is usually assumed that 
distinct points have positive distance; but we allow that two rows of the matrix be 
represented by the same point, and so it is more convenient to allow $d_{ij} = 0$. Also 
note that (2.1), (2.3) and (2.4) imply (2.2).) All metrics for a fixed $n$ form a convex 
cone $M_n$, which we call the metric cone. Since $M_n$ is defined by a finite number of 
linear inequalities, it is a polyhedral cone. 

Now consider the fact that $D$ is a metric that is $L_1$-embeddable, i.e., it can be 
represented by a measurable space $(\Omega, \mathcal{A}, \mu)$ with finite measure and by integrable 
functions $f_1, \dots, f_n : \Omega \to \mathbb{R}$ so that 

\[ d_{ij} = \int_\Omega |f_i - f_j| \, d\mu. \]

It is not difficult to see that these functions can be chosen 0-1-valued and $(\Omega, \mathcal{A}, \mu)$ 
can be chosen so that it consists of a finite number of atoms. So if $D$ is 
$L_1$-embeddable then there always exists a representation in terms of a finite set $\Omega$, a 
weighting $\mu: \Omega \to \mathbb{R}_+$, and subsets $A_i \subseteq \Omega$ such that 

\[ d_{ij} = \mu(A_i \,\triangle\, A_j). \]

It is easy to verify from this interpretation that for each fixed $n$, $L_1$-embeddable 
metrics form a convex cone: if $D$ is represented by $(\Omega, \mu, A_1, \dots, A_n)$ and $D'$ is 
represented by $(\Omega', \mu', A_1', \dots, A_n')$ (where we may assume that $\Omega \cap \Omega' = \emptyset$), then 
$\lambda D$ is represented by $(\Omega, \lambda\mu, A_1, \dots, A_n)$ and $D + D'$ is represented by 
$(\Omega \cup \Omega', \mu \cup \mu', A_1 \cup A_1', \dots, A_n \cup A_n')$. This cone is called the Hamming cone and is denoted 
by $H_n$. (The fact that $L_1$-embeddable metrics form a convex cone is not entirely 
obvious: this is not true, e.g., for $L_2$-embeddable metrics.) 

It should also be noted that every $L_2$-embeddable matrix is also $L_1$-embeddable. 
In fact, assume that $D$ can be represented by the Euclidean distances in a set 
$\{v_1, \dots, v_n\} \subseteq \mathbb{R}^d$. Let $\Omega$ be the set of halfspaces in $\mathbb{R}^d$ separating at least one 
pair $\{v_i, v_j\}$. There exists a translation- and rotation-invariant measure $\mu$ on the 
halfspaces in $\mathbb{R}^d$, and it is easy to see that $\mu(\Omega)$ is finite. Let $A_i$ denote the set 
of halfspaces in $\Omega$ containing $v_i$; then $\mu(A_i \,\triangle\, A_j)$ is proportional to the Euclidean 
distance of $v_i$ and $v_j$. 

Returning to $L_1$-embeddability, we want to describe the Hamming cone $H_n$. 
The construction showing that it is a cone can be used “backwards” to show that 
every Hamming metric is the sum of Hamming metrics on single atoms. A Hamming 
metric on a single atom $\Omega = \{a\}$ is quite simple; normalizing by $\mu(a) = 1$, every row 
must be represented by either $\emptyset$ or $\{a\}$, and so for an appropriate $S \subseteq \{1, \dots, n\}$, 
the matrix $D$ looks like 

\[ d_{ij} = \begin{cases} 1, & \text{if } i \in S \text{ and } j \notin S \text{ or vice versa}, \\ 0, & \text{otherwise}. \end{cases} \]

Such a matrix will be called a cut matrix (or cut metric). Our argument shows that 
every Hamming metric is a non-negative linear combination of cut metrics. Hence 
we have: 


Proposition 2.2. The Hamming cone is polyhedral and its extreme rays are spanned 
by cut metrics. 
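
The decomposition into cut metrics is easy to see on binary words, where every coordinate plays the role of one atom of weight 1. The following sketch is an illustration only; the function names `cut_metric`, `hamming_matrix` and `sum_of_cuts` and the sample words in the usage note are hypothetical choices, not taken from the text.

```python
def cut_metric(n, S):
    """Cut matrix of S: entry 1 exactly when S separates i from j."""
    S = set(S)
    return [[int((i in S) != (j in S)) for j in range(n)] for i in range(n)]


def hamming_matrix(words):
    """Matrix of pairwise Hamming distances between equal-length 0/1 strings."""
    return [[sum(a != b for a, b in zip(u, v)) for v in words] for u in words]


def sum_of_cuts(words):
    """Add up one cut metric per coordinate (one atom per coordinate)."""
    n, length = len(words), len(words[0])
    total = [[0] * n for _ in range(n)]
    for pos in range(length):
        # S collects the points whose word has a 1 at this coordinate.
        S = [i for i, w in enumerate(words) if w[pos] == "1"]
        cut = cut_metric(n, S)
        for i in range(n):
            for j in range(n):
                total[i][j] += cut[i][j]
    return total
```

For instance, with the words `"0000"`, `"0110"`, `"1011"`, `"1110"` the two matrices returned by `hamming_matrix` and `sum_of_cuts` coincide, which is the statement that this Hamming metric is a sum of four cut metrics.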


In the spirit of polyhedral combinatorics, a next task would be to describe the 
facets of $H_n$. Unfortunately, no complete description is available; membership in 
$H_n$ is NP-hard to decide (Karzanov 1985). The Hamming cone can be viewed as 
the cut cone of the complete graph, and general results on the cut polyhedron yield 
many facets. Nevertheless, much of the development of the two topics has been 
independent; see Barahona and Mahjoub (1986), Deza and Laurent (1992a,b,c). 

An important class of inequalities with geometric flavour is that $D$ is of negative 
type, i.e., for every $x \in \mathbb{R}^n$ we have 

\[ \sum_i x_i = 0 \;\Longrightarrow\; \sum_i \sum_j d_{ij} x_i x_j \le 0 \tag{2.5} \]

(for $L_2$-embeddable metrics, we had the squared distances in a similar inequality). 
To see that (2.5) holds for every Hamming metric, it suffices to consider the 
extreme rays of $H_n$, i.e., the cut metrics; and for such a metric, we have 

\[ \sum_{i,j} d_{ij} x_i x_j = 2 \sum_{i \in S} \sum_{j \notin S} x_i x_j = 2 \Bigl( \sum_{i \in S} x_i \Bigr) \Bigl( \sum_{j \notin S} x_j \Bigr) = -2 \Bigl( \sum_{i \in S} x_i \Bigr)^2 \le 0. \]

Another way to state this inequality is the following. Let $i_1, \dots, i_k, j_1, \dots, j_k$ be (not 
necessarily distinct) indices from $\{1, \dots, n\}$. Then 

\[ \sum_{1 \le p < r \le k} d(i_p, i_r) + \sum_{1 \le p < r \le k} d(j_p, j_r) \le \sum_{\substack{1 \le p \le k \\ 1 \le r \le k}} d(i_p, j_r) \tag{2.6} \]

(distances between the same kind of points sum up to not more than distances 
between different kinds of points). To derive (2.6) from (2.5), let $x_i$ be the number 
of times $i$ occurs among the indices $j_r$, less the number of times $i$ occurs among 
the indices $i_r$; then $\sum_i x_i = 0$ and (2.5) implies (2.6). Conversely, (2.6) implies (2.5) 
directly for integral vectors $x$, from which the rational case follows by homogeneity 
and the general case follows by continuity. 

There is a way to sharpen this inequality, which leads to perhaps the most 
important class of inequalities valid for $H_n$ (Deza 1960, Kelly 1970). Let 
$i_1, \dots, i_k, j_1, \dots, j_{k+1}$ be (not necessarily distinct) indices from $\{1, \dots, n\}$. Then 

\[ \sum_{1 \le p < r \le k} d(i_p, i_r) + \sum_{1 \le p < r \le k+1} d(j_p, j_r) \le \sum_{\substack{1 \le p \le k \\ 1 \le r \le k+1}} d(i_p, j_r) \tag{2.7} \]

is valid for every Hamming metric. The triangle inequality is just the special case 
$k = 1$. This is why these inequalities are called hypermetric inequalities. 


We can give a linear algebraic formulation of these inequalities analogous to 
(2.5): 

\[ \sum_i x_i = 1, \ x_i \text{ integer} \;\Longrightarrow\; \sum_{i,j} d_{ij} x_i x_j \le 0 \tag{2.8} \]

(the condition that the $x_i$ are integral cannot be dropped; else, the inequality 
would hold for every vector $x$, and so $D$ would be negative semidefinite, which is 
impossible for $D \ne 0$ as $\operatorname{trace}(D) = 0$). From this formulation, the inequality can 
be proved for cut metrics (and hence for all Hamming metrics) similarly to the 
proof of (2.6) above. 
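
For a single cut metric the verification takes one line. As a sketch, write $s = \sum_{i \in S} x_i$ (a symbol introduced here for illustration); since $\sum_i x_i = 1$, the coordinates outside $S$ sum to $1 - s$, and integrality of $x$ makes $s$ an integer.

```latex
\sum_{i,j} d_{ij} x_i x_j
  = 2 \Bigl( \sum_{i \in S} x_i \Bigr) \Bigl( \sum_{j \notin S} x_j \Bigr)
  = 2 s (1 - s) \le 0
  \qquad \text{for every integer } s .
```

The last step is where integrality enters: $s(1-s) > 0$ only for $0 < s < 1$, which no integer satisfies.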

The hypermetric inequalities hold, in particular, for the Euclidean metric (this 
is not quite obvious to see directly), and have interesting applications in discrete 
geometry (see, e.g., Kelly 1975). 

The convex cone defined by the inequalities (2.5) (or (2.6)) is called the negative 
type cone, while that defined by the inequalities (2.7) (or (2.8)) is called the 
hypermetric cone. The negative type cone is non-polyhedral for n > 3; but Deza et 
al. (1993) prove the following: 


Theorem 2.3. The hypermetric cone is polyhedral for every n > 2. 


This fact is non-trivial, since seemingly there are infinitely many defining 
inequalities. To analyze the structure of hypermetric inequalities, let us substitute 
$x_n = 1 - \sum_{i \le n-1} x_i$ in (2.8): then we get that for every $x \in \mathbb{Z}^{n-1}$, 

\[ \sum_{i,j \le n-1} (d_{in} + d_{jn} - d_{ij}) x_i x_j - 2 \sum_{i \le n-1} d_{in} x_i \ge 0. \tag{2.9} \]

It follows easily that the matrix $A = (d_{in} + d_{jn} - d_{ij})_{i,j=1}^{n-1}$ is positive semidefinite 
(in fact, this is equivalent to (2.5)). Let $E$ denote the solution set of 

\[ \sum_{i,j \le n-1} (d_{in} + d_{jn} - d_{ij}) x_i x_j - 2 \sum_{i \le n-1} d_{in} x_i \le 0. \]
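
The substitution leading to (2.9) can be spelled out as follows. This is a sketch using only $d_{nn} = 0$ and the symmetry of $D$; the abbreviation $s = \sum_{i \le n-1} x_i$ is introduced here for illustration, so that $x_n = 1 - s$.

```latex
\sum_{i,j \le n} d_{ij} x_i x_j
  = \sum_{i,j \le n-1} d_{ij} x_i x_j + 2 (1 - s) \sum_{i \le n-1} d_{in} x_i ,
\qquad
2 s \sum_{i \le n-1} d_{in} x_i = \sum_{i,j \le n-1} (d_{in} + d_{jn}) x_i x_j ,
```

```latex
\sum_{i,j \le n} d_{ij} x_i x_j
  = - \sum_{i,j \le n-1} (d_{in} + d_{jn} - d_{ij}) x_i x_j + 2 \sum_{i \le n-1} d_{in} x_i .
```

Requiring the left-hand side to be at most 0 for all integral choices of the first $n-1$ coordinates is therefore the same as requiring the quadratic expression in (2.9) to be non-negative on the integer lattice.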


Property (2.9) implies that $E$ contains no lattice points in its interior. On the 
other hand, it follows by substitution that the zero vector and the unit vectors 
are on the boundary of $E$. If $A$ is positive definite, then $E$ is an ellipsoid; in 
general, it is a direct product of a linear subspace with an ellipsoid, which we 
call a generalized ellipsoid. So every hypermetric corresponds to a lattice-point-free 
generalized ellipsoid in $\mathbb{R}^{n-1}$, having 0 and the unit vectors on its boundary. 
Conversely, every such generalized ellipsoid yields a hypermetric (uniquely up to 
a scalar factor). Such a generalized ellipsoid corresponds to an extreme ray of the 
hypermetric cone if and only if the system of hypermetric equalities satisfied by $d$ 
admits $d$ as a unique solution (up to a scaling). 

Deza et al. use this description of the extreme rays, together with a theorem 
of Voronoi on the finiteness of affine types of Delaunay polytopes of a lattice of 
a given dimension, to show that the number of extreme rays of the hypermetric 
cones is finite, i.e., this cone is polyhedral. 

An interesting connection between the metric cone and multicommodity flows 
was pointed out by Avis and Deza (1991). We formulate the multicommodity flow 
problem as follows (cf. chapter 2). Consider the complete graph $K_n$ on nodes 
$\{1, \dots, n\}$. For each unordered pair $i, j$ of nodes, we are given a capacity $c_{ij} \ge 0$, 
and a demand $d_{ij} \ge 0$. So $(c_{ij})$ and $(d_{ij})$ are symmetric matrices, and we assume 
that $c_{ii} = d_{ii} = 0$ for each node $i$. We want to find a flow $f_{ij}$ from $i$ to $j$ of value 
$d_{ij}$ for every $1 \le i < j \le n$ such that 

\[ \sum_{i,j} |f_{ij}(uv)| \le c_{uv} \]

for every pair $u, v$. We say that the pair $(C, D)$ is feasible if such a system of flows 
exists. (If we want to work on a graph different from the complete graph, we can 
set $c_{ij} = 0$ for the non-adjacent pairs. Similarly, we set $d_{ij} = 0$ if we do not want a 
flow from $i$ to $j$.) Now this is a system of linear inequalities and a characterization 
of feasibility can be obtained from the Farkas lemma. However, this condition 
is not very transparent; Iri (1970) and Onaga and Kakusho (1971) managed to 
replace it by the following elegant criterion (chapter 2, Theorem 8.1): a 
capacity-demand pair $(C, D)$ is feasible if and only if $C - D$ is contained in the polar of the 
metric cone. 

A special necessary condition for feasibility, also formulated in chapter 2, is the 
cut condition: for every partition $\{S, V \setminus S\}$, we must have 

\[ \sum_{i \in S,\ j \in V \setminus S} c_{ij} \ge \sum_{i \in S,\ j \in V \setminus S} d_{ij}. \]

It is easy to see that this is equivalent to saying that $C - D$ gives a non-negative 
inner product with every cut metric. Since cut metrics are just the extreme rays 
of the Hamming cone $H_n$, this means that $C - D$ is contained in the polar of the 
Hamming cone. (Note that $H_n \subseteq M_n$ and hence $H_n^* \supseteq M_n^*$.) 
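
The cut condition can be tested directly on small instances by enumerating the partitions. The sketch below is a brute-force illustration (exponential in the number of nodes); the function name `cut_condition_holds` and the three-node example in the usage note are hypothetical and not part of the text.

```python
from itertools import combinations


def cut_condition_holds(cap, dem):
    """Check capacity >= demand across every partition {S, V \\ S}.

    cap and dem are symmetric n x n matrices with zero diagonal.
    Only sets S with |S| <= n // 2 are enumerated, since S and its
    complement define the same partition.
    """
    n = len(cap)
    for size in range(1, n // 2 + 1):
        for S in combinations(range(n), size):
            inside = set(S)
            across = [(i, j) for i in inside for j in range(n) if j not in inside]
            if sum(cap[i][j] for i, j in across) < sum(dem[i][j] for i, j in across):
                return False
    return True
```

As a usage example, take the path 0-1-2 with unit capacities and a single demand between nodes 0 and 2: a demand of 1 satisfies the cut condition, while a demand of 2 violates it at the cut separating node 0 from the rest.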

Various results on multicommodity flows discussed in chapter 2 are worth 
rephrasing in terms of these cones (see in particular Theorem 8.6). The 
Max-Flow-Min-Cut theorem says that if a symmetric matrix $A$ has 0’s in its diagonal and has 
only one pair of negative entries, then $A \in H_n^*$ implies $A \in M_n^*$. Hu’s theorem on 
2-commodity flows implies that this holds also when $A$ has two pairs of negative 
entries. Papernov’s work can be viewed as a characterization of all patterns of 
negative entries of a matrix in $H_n^* \setminus M_n^*$. Other results on multicommodity flows 
exclude certain supports for a matrix in $H_n^* \setminus M_n^*$; e.g., Seymour’s theorem 8.6(g) in 
chapter 2 implies that such a support cannot be the adjacency matrix of a planar 
graph. 


3. Matchings and the Haar measure 


A fundamental notion in measure theory is the Haar measure on locally compact 
topological groups. We shall show that matching theory can be applied to give a 
simple and constructive proof of the existence of the Haar measure in the compact 
case (Rota and Harper 1971). We in fact prove the existence of invariant integra- 
tion; this result, proved by von Neumann, is equivalent to the existence of the 
Haar measure. We refer to Halmos (1950) for measure-theoretic background. For 
further applications of matching theory to measure theory, see also Lovász and 
Plummer (1986). 

Let $G$ be a compact topological group (i.e., a group $G$ endowed with a compact 
topology such that the group operations of multiplication and inverse are continuous). 
Let $C(G)$ denote the space of real valued continuous functions defined on 
$G$. An invariant integration is a functional $L$ defined on $C(G)$, having the following 
properties: 

(1) $L(\alpha f + \beta g) = \alpha L(f) + \beta L(g)$ (linearity); 

(2) if $f \ge 0$ then $L(f) \ge 0$ (positivity); 

(3) if $1$ denotes the identically 1 function then $L(1) = 1$ (normalization); 

(4) if $s, t \in G$ and $f \in C(G)$, and $g$ is defined by $g(x) = f(sxt)$ then $L(g) = L(f)$ 
(double translation invariance). 


Theorem 3.1. Every compact topological group admits an invariant integration. 


For the proof, we need some notation. If $A$ is a finite set and $f: A \to \mathbb{R}$ then we 
set 

\[ f(A) = \frac{1}{|A|} \sum_{a \in A} f(a). \]

If $\mathcal{H} = (V, E)$ is a (finite or infinite) hypergraph and $f: V \to \mathbb{R}$ then we set 

\[ \delta(f, \mathcal{H}) = \sup\{ |f(x) - f(y)| : x, y \in U \text{ for some } U \in \mathcal{H} \}. \]

The combinatorial lemma we use is the following. 


Lemma 3.2. Let $\mathcal{H}$ be a hypergraph and let $A$, $B$ be two minimum-cardinality 
blocking sets of $\mathcal{H}$. Assume that $A$ (and $B$) are finite. Then 

\[ |f(A) - f(B)| \le \delta(f, \mathcal{H}). \]


Proof. Consider the bipartite graph whose color classes are $A$ and $B$, where $a \in A$ 
is connected to $b \in B$ iff there exists an edge $U \in \mathcal{H}$ containing both $a$ and $b$. We 
claim that this bipartite graph has a perfect matching. We verify the conditions in 
the Marriage theorem (see chapter 3, Corollary 2.5). It is clear that $|A| = |B|$. Let 
$T \subseteq A$ and let $N(T)$ denote the set of neighbors of $T$ in $B$. 

We show that the set $A' = (A \cup N(T)) \setminus T$ is also a blocking set for $\mathcal{H}$. In fact, 
every edge $U$ of $\mathcal{H}$ meets $A$ as well as $B$, since $A$ and $B$ are blocking. Let $a \in U \cap A$ 
and $b \in U \cap B$. If $a \notin T$ then we are done. If $a \in T$ then $b \in N(T)$ and so again 
$A'$ meets $U$. 

So $A'$ is a blocking set and hence $|A'| \ge |A|$, which implies that $|N(T)| \ge |T|$. 
Thus the Marriage theorem applies and the bipartite graph has a perfect matching, say 
$\{a_1 b_1, \dots, a_n b_n\}$. Then 

\[ |f(A) - f(B)| = \Bigl| \frac{1}{n} \sum_{i=1}^n f(a_i) - \frac{1}{n} \sum_{i=1}^n f(b_i) \Bigr| \le \frac{1}{n} \sum_{i=1}^n |f(a_i) - f(b_i)| \le \frac{1}{n} \, n \, \delta(f, \mathcal{H}) = \delta(f, \mathcal{H}). \qquad \Box \]
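
The argument can be replayed on a small finite hypergraph. The following sketch is illustrative only: it finds minimum blocking sets by exhaustive search and the matching by a simple augmenting-path routine, and the names `min_blocking_sets`, `perfect_matching`, `delta` and `mean`, as well as the example hypergraph in the usage note, are assumptions made here rather than anything from the text.

```python
from itertools import combinations


def min_blocking_sets(vertices, edges):
    """All sets of minimum cardinality that meet every edge (brute force)."""
    for size in range(1, len(vertices) + 1):
        found = [set(c) for c in combinations(vertices, size)
                 if all(set(c) & e for e in edges)]
        if found:
            return found
    return []


def perfect_matching(A, B, edges):
    """Match every a in A to a distinct b in B lying in a common edge, or return None."""
    partner = {}  # maps b to the a it is currently matched with

    def augment(a, seen):
        for b in B:
            if b not in seen and any(a in e and b in e for e in edges):
                seen.add(b)
                if b not in partner or augment(partner[b], seen):
                    partner[b] = a
                    return True
        return False

    for a in A:
        if not augment(a, set()):
            return None
    return {a: b for b, a in partner.items()}


def delta(f, edges):
    """Largest variation of f inside a single edge."""
    return max(abs(f[x] - f[y]) for e in edges for x in e for y in e)


def mean(f, S):
    """Average of f over the finite set S."""
    return sum(f[v] for v in S) / len(S)
```

For example, on the vertex set {0, ..., 5} with edges {0,1,2}, {2,3}, {3,4,5}, {5,0}, the minimum blocking sets are {0,3} and {2,5}; `perfect_matching` pairs them up along common edges, and the difference of the two averages of any function on the vertices stays within `delta`.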


Proof of Theorem 3.1. Let $U$ be a non-empty open subset of $G$. We denote by 
$\mathcal{H}_U$ the hypergraph with underlying set $G$, whose edges are the translates of $U$, 
i.e., the sets $sUt = \{sut : u \in U\}$ ($s, t \in G$). A blocking set of $\mathcal{H}_U$ is called a $U$-net. 
It follows from the compactness of the group $G$ that there exists a finite $U$-net. 
Let $f \in C(G)$; we want to define the value $L(f)$. 


Claim. Let $U$ and $V$ be non-empty open subsets of $G$. Let $A$ be a minimum-cardinality 
$U$-net and $B$, a minimum-cardinality $V$-net. Then 

\[ |f(A) - f(B)| \le \delta(f, \mathcal{H}_U) + \delta(f, \mathcal{H}_V). \]

To prove this claim, observe that $Ab$ is also a minimum $U$-net for any $b \in B$, and 
hence by Lemma 3.2, 

\[ |f(A) - f(Ab)| \le \delta(f, \mathcal{H}_U). \]

Hence 

\[ |f(A) - f(AB)| = \Bigl| f(A) - \frac{1}{|B|} \sum_{b \in B} f(Ab) \Bigr| \le \frac{1}{|B|} \sum_{b \in B} |f(A) - f(Ab)| \le \delta(f, \mathcal{H}_U). \]

Similarly, 

\[ |f(B) - f(AB)| \le \delta(f, \mathcal{H}_V), \]

and hence $|f(A) - f(B)| \le \delta(f, \mathcal{H}_U) + \delta(f, \mathcal{H}_V)$. 

Now we construct the integration. Let $f \in C(G)$. Let $U_n$ be a sequence of open 
sets such that $\delta(f, \mathcal{H}_{U_n}) \to 0$ (such a sequence exists by the continuity of $f$ and 
the compactness of $G$). Choose, for each $n$, a $U_n$-net $A_n$ with minimum cardinality. 
Then by the Claim, the sequence $\{f(A_n)\}$ satisfies the Cauchy convergence 
criterion and hence it has a limit $L(f)$. The Claim also implies that this limit is 
independent of the choice of the sets $U_n$ and $A_n$. 

We verify conditions (1)-(4). For (1), note that we can choose a sequence $V_n$ 
such that simultaneously 

\[ \delta(\alpha f + \beta g, \mathcal{H}_{V_n}) \to 0, \quad \delta(f, \mathcal{H}_{V_n}) \to 0, \quad \delta(g, \mathcal{H}_{V_n}) \to 0. \]

Let $A_n$ be a minimum $V_n$-net. Then 

\[ L(\alpha f + \beta g) = \lim_{n} \bigl( \alpha f(A_n) + \beta g(A_n) \bigr) = \alpha \lim_{n} f(A_n) + \beta \lim_{n} g(A_n) = \alpha L(f) + \beta L(g). \]

Conditions (2) and (3) are trivial; also, (4) is clear since the whole construction is 
invariant under translations. □ 


4. The minimal degree of primitive permutation groups 


The minimal degree of a permutation group G is the smallest number of points 
moved by any non-identity element of G. This parameter has been subject to 
considerable study since the 19th century [see Wielandt (1964) or the forthcoming 
book by J.D. Dixon and B. Mortimer, Permutation Groups]. 

The minimal degree of the symmetric group is 2 and that of the alternating 
group is 3. In contrast, there are only finitely many primitive permutation groups 
with minimal degree yx for » > 3, and none for 4 = 9,25 or 49. These results were 
proved by Jordan in 1871 and 1874 (see Jordan 1961). He showed (essentially) 
that if G is a primitive permutation group of degree n not containing A,, then 


w> (1400) oor 


Much more is true for doubly transitive groups. Bochert (1889) proved that if $G$
is a doubly transitive permutation group of degree $n$ not containing $A_n$, then

$$\mu \ge \frac{n}{4} - 1.$$

The question remains, can Jordan's bound be improved for uniprimitive (primitive
but not doubly transitive) permutation groups? Consider the line graph $L(K_{v,v})$
of the complete bipartite graph. The automorphism group of this graph is primitive
of degree $n = v^2$, and has minimal degree $2v = 2\sqrt{n}$ (consider the automorphism
of $L(K_{v,v})$ induced by the transposition of two non-adjacent nodes of $K_{v,v}$).
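
The count $2v$ for this particular automorphism can be checked directly for small $v$. The following Python sketch is an illustration only (the function name `moved_edges` is ours, not from the text): it lists the edges of $K_{v,v}$, applies the transposition of two nodes on the same side, and counts how many vertices of the line graph are moved. It confirms that this one automorphism moves $2v$ points; it does not show that no other automorphism moves fewer.

```python
from itertools import product


def moved_edges(v: int) -> int:
    """Count the vertices of L(K_{v,v}) moved by swapping two nodes on one side."""
    # Vertices of the line graph are the edges (a, b) of K_{v,v},
    # with a a node of the left side and b a node of the right side.
    edges = list(product(range(v), range(v)))
    # Transposition of the left nodes 0 and 1 (they are non-adjacent in K_{v,v}).
    swap = {0: 1, 1: 0}
    image = {(a, b): (swap.get(a, a), b) for (a, b) in edges}
    return sum(1 for e in edges if image[e] != e)


for v in (3, 4, 5):
    n = v * v  # degree of the permutation group
    print(f"v={v}, n={n}, points moved={moved_edges(v)}")
```

Only the edges incident to one of the two swapped nodes change, which is where the value $2v$ comes from.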


This example shows that the following result of Babai (1981) is best possible up 
to the constant. 


Theorem 4.1. If $G$ is a uniprimitive group of degree $n$ then
$\mu(G) > \tfrac{1}{2}\sqrt{n} - 1$.


Below we prove the somewhat weaker inequality $\mu(G) \ge \sqrt{(n-1)/6}$. A slight
refinement of the proof would yield $\mu(G) \ge (1 + o(1))\sqrt{2n/3}$.

Babai proved the theorem by reducing the group-theoretical problem to a
combinatorial question on coherent configurations. We shall give this reduction and
then a rather simple proof of the result on coherent configurations, incorporating
some ideas of Zemlyachenko (see Zemlyachenko et al. 1985).

We need some notions from the theory of coherent configurations (see also
chapter 15). A coherent configuration $\mathfrak{X} = (\Omega; R_1, \ldots, R_r)$ is a finite set $\Omega$ of vertices
and a family $R_1, \ldots, R_r$ of non-empty binary relations on $\Omega$ such that

(i) $\{R_1, \ldots, R_r\}$ forms a partition of $\Omega \times \Omega$;

(ii) the diagonal $\Delta = \{(\omega, \omega) : \omega \in \Omega\}$ is the union of some of the $R_i$;

(iii) $R_i^{-1}$ (the inverse of $R_i$) is one of the $R_j$;

(iv) there is a collection of $r^3$ integers $p^k_{ij}$ such that for every $(\alpha, \beta) \in R_k$, one has

$$|\{\gamma : (\alpha, \gamma) \in R_i, \ (\gamma, \beta) \in R_j\}| = p^k_{ij}$$

(independently of the particular choice of $\alpha$ and $\beta$).

For $(\alpha, \beta) \in R_i$ we set $i = c(\alpha, \beta)$ and call $(\alpha, \beta)$ an "edge of color $i$". We call
$r$ the rank of $\mathfrak{X}$. The digraphs $(\Omega, R_i)$ are the classes of $\mathfrak{X}$. A coherent configuration
is homogeneous if $R_1 = \Delta$. It is primitive if it is homogeneous and each of the
classes $(\Omega, R_i)$ ($i \ge 2$) is connected as a digraph (here we do not have to worry
which definition of connectedness to choose, since for the classes of a homogeneous
coherent configuration connectivity and strong connectivity are equivalent).
A primitive coherent configuration is uniprimitive if its rank is at least 3.

The classical examples of coherent configurations arise from permutation groups.
If $G$ is a permutation group acting on $\Omega$, we let $R_1, \ldots, R_r$ be the orbits of the
induced action on $\Omega \times \Omega$, to get a coherent configuration $\mathfrak{X}(G)$ associated with
the permutation group $G$.

Choosing the indices appropriately, $\mathfrak{X}(G)$ is homogeneous iff $G$ is transitive.
Connected components of any non-diagonal class correspond to blocks of
imprimitivity of $G$ (i.e., classes of $G$-invariant partitions); hence it is easy to see that
$G$ is primitive (uniprimitive) iff the associated coherent configuration is primitive
(uniprimitive).
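
As a small illustration of this construction, the following Python sketch computes the orbits of a permutation group on $\Omega \times \Omega$ from a list of generators and then checks condition (iv) by brute force. The function names and the choice of example (the dihedral group of order 10 acting on 5 points) are ours and serve only as an assumption-laden demonstration; colors are numbered from 0 here, not from 1 as in the text.

```python
from itertools import product


def orbitals(n, generators):
    """Partition Omega x Omega (Omega = range(n)) into orbits of the generated group.

    Each generator is a list g with g[x] the image of the point x.
    Returns a dict mapping every pair (a, b) to the index of its orbit.
    """
    color = {}
    next_index = 0
    for start in product(range(n), repeat=2):
        if start in color:
            continue
        color[start] = next_index
        stack = [start]
        while stack:  # depth-first search over images under the generators
            a, b = stack.pop()
            for g in generators:
                img = (g[a], g[b])
                if img not in color:
                    color[img] = next_index
                    stack.append(img)
        next_index += 1
    return color


def intersection_numbers_constant(n, color):
    """Check condition (iv): the counts p^k_{ij} depend only on the color k of the pair."""
    r = max(color.values()) + 1
    table_for_color = {}
    for a, b in product(range(n), repeat=2):
        counts = [[0] * r for _ in range(r)]
        for c in range(n):
            counts[color[(a, c)]][color[(c, b)]] += 1
        key = tuple(map(tuple, counts))
        if table_for_color.setdefault(color[(a, b)], key) != key:
            return False
    return True


n = 5
rotation = [(x + 1) % n for x in range(n)]
reflection = [(-x) % n for x in range(n)]
color = orbitals(n, [rotation, reflection])
rank = max(color.values()) + 1
print("rank:", rank, "coherent:", intersection_numbers_constant(n, color))
```

Because the group is finite, closing a pair under the generators alone (without their inverses) already yields the full orbit.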

The key result of Babai is the following. We say that a vertex $\omega$ distinguishes
vertices $\alpha$ and $\beta$ if $c(\alpha, \omega) \ne c(\beta, \omega)$.


Theorem 4.2. Let $\mathfrak{X}$ be a uniprimitive coherent configuration on $n$ vertices. Then each pair of
vertices $\alpha, \beta$ is distinguished by at least $\sqrt{(n-1)/6}$ vertices.


To see that this result implies Theorem 4.1, assume that $\mathfrak{X} = \mathfrak{X}(G)$ and let
$g \in G$ be a permutation moving only a set $M$ of $\mu(G)$ elements. Let $\alpha \in M$ and
$\beta = \alpha^g$. Then for $\omega \notin M$ we have $c(\alpha, \omega) = c(\beta, \omega)$, as $g$ maps $(\alpha, \omega)$ onto $(\beta, \omega)$.
So $\alpha$ and $\beta$ are distinguished by at most $\mu$ elements. So Theorem 4.2 implies that
$\mu \ge \sqrt{(n-1)/6}$.

The main idea in proving Theorem 4.1 was to find the right combinatorial relax- 
ation of the group-theoretic notion of minimum degree. As we shall see, the rest 
of the proof (i.e., the proof of Theorem 4.2) uses only elementary combinatorics. 

To prove Theorem 4.2, we need some notation and a series of simple lemmas. For
$\omega \in \Omega$, consider the number of edges of color $i$ leaving $\omega$. As $\mathfrak{X}$ is homogeneous,
this number does not depend on $\omega$, and we denote it by $d_i$. We write $i^{-1} = j$ if
$R_i^{-1} = R_j$. In this case $d_i = d_j$. We denote by $X_i$ the digraph $(\Omega, R_i)$ and by $X_i'$ the
undirected graph $(\Omega, R_i \cup R_i^{-1})$. Let $\mathrm{diam}(i)$ denote the diameter of $X_i'$.

For two vertices $\alpha, \beta$, let $D(\alpha, \beta)$ denote the set of vertices distinguishing $\alpha$ and
$\beta$. Note that the cardinality of $D(\alpha, \beta)$ depends only on the color of the pair $(\alpha, \beta)$
(by the definition of coherent configurations); we set $f(i) = |D(\alpha, \beta)|$ if $i = c(\alpha, \beta)$.
Clearly $f(i) = f(i^{-1})$. We want to prove that $f(i) \ge \sqrt{(n-1)/6}$.
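
To make the quantities $D(\alpha, \beta)$ and $f(i)$ concrete, here is a small Python sketch that computes them by brute force for the rank-3 configuration of the line graph $L(K_{v,v})$ discussed earlier. The coloring function and all names are our own illustrative choices: color 1 is the diagonal, color 2 means the two edges of $K_{v,v}$ share a node, color 3 means they are disjoint.

```python
from itertools import product
from math import sqrt


def pair_color(a, b):
    """Color c(a, b) of a pair of vertices of L(K_{v,v}); vertices are edges (left, right)."""
    if a == b:
        return 1  # diagonal
    if a[0] == b[0] or a[1] == b[1]:
        return 2  # the two edges of K_{v,v} share a node
    return 3      # the two edges are disjoint


def distinguishing_counts(v):
    """Return {color i: set of all values |D(a, b)| over pairs (a, b) of color i}."""
    omega = list(product(range(v), repeat=2))
    result = {}
    for a, b in product(omega, repeat=2):
        if a == b:
            continue
        size = sum(1 for w in omega if pair_color(a, w) != pair_color(b, w))
        result.setdefault(pair_color(a, b), set()).add(size)
    return result


for v in (3, 4):
    n = v * v
    counts = distinguishing_counts(v)
    bound = sqrt((n - 1) / 6)
    print(f"v={v}, n={n}, f values={counts}, bound={bound:.3f}")
```

Each color class yields a single value, as the text asserts, and in these small cases the values lie well above $\sqrt{(n-1)/6}$.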


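Lemma 4.3. Let $t$ denote the distance of vertices $\alpha$ and $\beta$ in the (undirected) graph
$X_i'$. Then $f(i) \ge |D(\alpha, \beta)|/t$.


Proof. Let $\alpha = \alpha_0, \alpha_1, \ldots, \alpha_t = \beta$ be a path of length $t$ in $X_i'$. Clearly
$D(\alpha, \beta) \subseteq \bigcup_{j=1}^{t} D(\alpha_{j-1}, \alpha_j)$. Using $f(i) = f(i^{-1})$ we obtain $|D(\alpha, \beta)| \le t \cdot f(i)$. □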


This observation leads to the following important inequality:

Lemma 4.4. If $\mathrm{diam}(i) \ge 3$ then $f(i) \ge 2d_i/3$.


Proof. Let $\alpha$ and $\beta$ be vertices at distance 3 in $X_i'$. The sets of neighbors (in $X_i'$)
of $\alpha$ and $\beta$ are disjoint, therefore these sets are contained in $D(\alpha, \beta)$. This yields
$|D(\alpha, \beta)| \ge 2d_i$. So Lemma 4.3 implies that $f(i) \ge 2d_i/3$. □


This lemma settles the case when $X_i'$ has diameter at least 3 and $d_i$ is "large".


For the case when $d_i$ is "small" we need another simple observation. Let $\Gamma_i(\alpha)$
denote the set of vertices $\beta$ with $c(\alpha, \beta) = i$.


Lemma 4.5. If there is an edge with color $h$ from $\Gamma_i(\alpha)$ to $\Gamma_j(\alpha)$ then there are at
least $\max\{d_i, d_j\}$ such edges.


Proof. The assumption says that for some vertices $\beta$ and $\gamma$ we have $c(\alpha, \beta) = i$,
$c(\alpha, \gamma) = j$ and $c(\beta, \gamma) = h$. By the definition of coherent configurations, this
implies that for every $\beta \in \Gamma_i(\alpha)$ there exists at least one vertex $\gamma$ such that
$c(\alpha, \gamma) = j$ and $c(\beta, \gamma) = h$. Hence there are at least $d_i$ edges of color $h$ from $\Gamma_i(\alpha)$ to $\Gamma_j(\alpha)$.
Applying the argument to $h^{-1}$, the assertion follows. □


Lemma 4.6. Any vertex distinguishes the endpoints of at least $n - 1$ edges of each
color.


Proof. Consider a vertex $\omega$ and a color $h$, and define a graph $W$ on vertex set
$\{1, \ldots, r\}$ by joining $i$ to $j$ if there is an edge of color $h$ from $\Gamma_i(\omega)$ to $\Gamma_j(\omega)$. This
graph is connected since $X_h$ is connected. Let $T$ be a spanning tree of $W$. Orient
the edges of $T$ away from 1. By Lemma 4.5, every edge $ij$ of $T$ represents at least
$d_j$ edges of color $h$ distinguished by $\omega$. Hence the total number of edges of color
$h$ distinguished by $\omega$ is at least

$$\sum_{ij \in T} d_j = \sum_{j=2}^{r} d_j = n - 1. \qquad \Box$$


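Counting the triples $(\alpha, \beta, \omega)$ with $c(\alpha, \beta) = i$ and $\omega \in D(\alpha, \beta)$ in two ways, and
using Lemma 4.6 we obtain

$$n\, d_i\, f(i) \ge n(n-1),$$

which yields one of our key inequalities:

Lemma 4.7. $f(i) \ge (n-1)/d_i$.

Combining with Lemma 4.4 (multiply the two bounds: $f(i)^2 \ge \frac{2d_i}{3} \cdot \frac{n-1}{d_i} = \frac{2(n-1)}{3}$), we obtain:


Lemma 4.8. If $\mathrm{diam}(i) \ge 3$ then $f(i) \ge \sqrt{2(n-1)/3}$.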
What remains is to find a good lower bound on f(i) in the case when diam(i) = 2. 


Lemma 4.9. There exists a pair distinguished by at least $\sqrt{2(n-1)/3}$ vertices.


Proof. Suppose that $d_2 \le d_3 \le \cdots$. If $\mathrm{diam}(2) \ge 3$ we are done by Lemma 4.8. If
$\mathrm{diam}(2) = 2$ then obviously $1 + d_2 + (d_2 - 1)d_2 \ge n$ and hence $d_2 \ge \sqrt{n-1}$. Further,
trivially $d_2 \le (n-1)/2$.

Any given vertex distinguishes at least $2d_2(n - 1 - d_2) \ge 2(n-1)(\sqrt{n-1} - 1)$
ordered pairs. Hence there must be an ordered pair distinguished by at least
$2(\sqrt{n-1} - 1)$ vertices, i.e., $f_{\max} \ge 2(\sqrt{n-1} - 1) \ge \sqrt{2(n-1)/3}$. □


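Proof of Theorem 4.2. Choose a pair $\alpha, \beta$ of vertices with $|D(\alpha, \beta)| \ge \sqrt{2(n-1)/3}$.


Let $i \ge 2$. If $\mathrm{diam}(i) > 2$ then we have $f(i) \ge \sqrt{2(n-1)/3}$ by Lemma 4.8. If
$\mathrm{diam}(i) = 2$ then we have

$$f(i) \ge \frac{1}{2} |D(\alpha, \beta)| \ge \sqrt{\frac{n-1}{6}}$$

by Lemma 4.3. □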


Invoking the classification theorem of finite simple groups, Liebeck and Saxl
(1991) obtained a classification of all primitive permutation groups with $\mu < n/3$.
From this, Theorem 4.1 can be read off. Still, it is good to have a short solution
for a classical problem, which helps us understand why the result holds.

The situation is similar for a related classical problem. Using Theorem 4.2 and
a simple probabilistic argument, Babai (1981) proved that every uniprimitive
permutation group of degree $n$ has a base of size $b < 4\sqrt{n}\log n$. (A base is a subset
$\Phi$ of $\Omega$ such that the only group element fixing every vertex in $\Phi$ is the identity.)
This immediately gives an upper bound $n^b < \exp(4\sqrt{n}\log^2 n)$ on the order of
uniprimitive permutation groups, another remarkable result.

This bound has been supplemented (Babai 1982) by an even stronger upper
bound of $\exp(\exp(1.08\sqrt{\log n}))$ on the order of doubly transitive permutation
groups not containing $A_n$. Recently Pyber (1993) found a simple combinatorial
proof of the bound $n^{c\log^2 n}$ using Babai's ideas. Once more the classification
theorem of finite simple groups gives a stronger bound of $n^{(1+o(1))\log n}$, but perhaps less
insight.


5. Algebras with straightening law 


In this section, we give a brief glimpse into applications of combinatorial techniques 
to some questions of algebraic geometry and commutative algebra. 

Graded commutative rings (that is, quotients $A = k[x_1, \ldots, x_m]/(f_1, \ldots, f_n)$,
where the $f_i$ are homogeneous polynomials) are a central object of study for
commutative algebra. A main motivation for this comes from algebraic geometry, which
studies projective varieties (that is, solution sets of systems $f_1 = 0, \ldots, f_n = 0$ of
homogeneous polynomial equations). The geometry of such a variety is encoded in
its coordinate ring $A = k[x_1, \ldots, x_m]/I$, where $I$ is the ideal of all polynomial functions
that vanish on the variety, and $A$ can be interpreted as the ring of functions
on the variety (see for example Shafarevich 1977, Kunz 1985).

For the study of projective varieties and their coordinate rings, algebraic geom- 
etry has a variety of tools available: algebraic, homological, analytic, etc. Combi- 
natorics comes into play in the case of “classical” varieties, such as Grassmann 
and flag manifolds, determinantal and Schubert varieties, and many more. These 
varieties and their rings have additional structure: for example, the varieties are 
very symmetric, and they satisfy nice smoothness, completeness and regularity
conditions. The coordinate rings can be given by generators and relations in a very
explicit way, and so it is not too surprising if combinatorial techniques can be 
applied. 


Cohen-Macaulay rings 


Let $A$ be a graded commutative ring, that is, the quotient of a polynomial ring
modulo a homogeneous ideal,

$$A = k[x_1, \ldots, x_m]/I.$$

In particular $A$ has a direct sum decomposition $A = \bigoplus_{k \ge 0} A_k$, where $A_k$ is the
finite-dimensional $k$-vector space of residue classes of homogeneous polynomials
of degree $k$ in $A$.

In this exposition we will, as an example, consider the Cohen-Macaulay property
of $A$, which expresses a quite subtle regularity property of the associated variety.
The usual definition is in homological algebra terms: $A$ is Cohen-Macaulay if
and only if its depth is equal to its dimension. Here the dimension $d = \dim(A)$ is
the maximal number of algebraically independent elements $\{\theta_1, \ldots, \theta_d\}$ in $A$, and
the depth is the maximal length $p = \operatorname{depth}(A)$ of a regular sequence: a sequence
$(\theta_1, \ldots, \theta_p)$ such that $\theta_i$ is not a zero-divisor in $A/(\theta_1, \ldots, \theta_{i-1})$ for $1 \le i \le p$. This
in particular means that $p \le d$: the depth never exceeds the dimension.

There are many reformulations of the Cohen-Macaulay condition, the most
explicit perhaps being the following. $A$ is Cohen-Macaulay if and only if it has
a Hironaka decomposition: for some set $\{\theta_1, \ldots, \theta_d\}$ of algebraically independent,
homogeneous elements of $A$ there exist homogeneous elements $\eta_1, \ldots, \eta_t$ such
that $A$ has a direct sum decomposition

$$A = \bigoplus_{i=1}^{t} \eta_i\, k[\theta_1, \ldots, \theta_d].$$

This means that $A$ is a free $k[\theta_1, \ldots, \theta_d]$-module, and the set of separators
$\{\eta_1, \ldots, \eta_t\}$ forms a module basis for $A$. The maximal regular sequence $(\theta_1, \ldots, \theta_d)$
is called a system of parameters in this case. (It turns out that if $A$ is
Cohen-Macaulay, then any sequence $(\theta_1, \ldots, \theta_d)$ with $\dim_k A/(\theta_1, \ldots, \theta_d) < \infty$ can be used
as the system of parameters of a Hironaka decomposition.)

To show that the coordinate rings of classical varieties are Cohen-Macaulay,
this offers two alternative approaches. One can compute their dimensions and
depths, or one can try to construct explicit Hironaka decompositions. The second
approach is more far-reaching: a Hironaka decomposition of $A$ carries a lot of
extra structure, such that for example the Hilbert function of the ring can be read
off. For this we denote the Hilbert function of $A$ by

$$H(A, k) := \dim_k A_k,$$

and its Poincaré series (or Hilbert series) by

$$\mathrm{Poin}(A, t) := \sum_{k \ge 0} H(A, k)\, t^k,$$

so the Poincaré series of $A$ is the ordinary generating function (see chapter 21) of
the Hilbert function.
A basic result (that can for example be derived from the existence of a finite
free resolution of $A$) now states that the Poincaré series is a rational function of
the form

$$\mathrm{Poin}(A, t) = \frac{P_A(t)}{(1 - t)^d},$$

where $P_A(t)$ is a polynomial in $t$ with integer coefficients, $P_A(0) = 1$ and $P_A(1) \ne 0$,
such that the order of the pole of $\mathrm{Poin}(A, t)$ at $t = 1$ is $d = \dim(A)$.
From this in turn we get a basic property of the Hilbert function: for large $k$,
$H(A, k)$ is a polynomial in $k$ of degree $d - 1$, with rational coefficients.

Now, if $A$ is Cohen-Macaulay with $A = \bigoplus_{i=1}^{s} \eta_i\, k[\theta_1, \ldots, \theta_d]$, then it is easy to
read off the Poincaré series

$$\mathrm{Poin}(A, t) = \frac{\sum_{i=1}^{s} t^{\deg \eta_i}}{(1 - t)^d}.$$

Note that the numerator polynomial $P_A(t)$ has non-negative coefficients in this
Cohen-Macaulay case; this is not true in general. We can also compute the
Hilbert function

$$H(A, k) = \sum_{i=1}^{s} \binom{k - \deg \eta_i + d - 1}{d - 1}$$

(with the convention that the summands are zero if $k < \deg \eta_i$), which is a
polynomial in $k$ for $k \ge \max(\deg \eta_i)$.
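
The two formulas can be compared numerically. The Python sketch below is an illustration under the assumption, implicit in the denominator $(1 - t)^d$, that all parameters $\theta_i$ have degree 1; the function names and the toy input (two separators of degrees 0 and 1, with $d = 2$) are our own. It expands the rational function as a power series and compares the coefficients with the binomial formula.

```python
from math import comb


def hilbert_from_decomposition(k, d, separator_degrees):
    """H(A, k) as a sum of binomials; summands with k < deg(eta_i) are zero."""
    return sum(comb(k - e + d - 1, d - 1) for e in separator_degrees if k >= e)


def series_coefficients(d, separator_degrees, n_terms):
    """First n_terms coefficients of (sum_i t^deg(eta_i)) / (1 - t)^d."""
    coeffs = [0] * n_terms
    for e in separator_degrees:
        if e < n_terms:
            coeffs[e] += 1
    # Dividing a power series by (1 - t) replaces its coefficients by prefix sums.
    for _ in range(d):
        for j in range(1, n_terms):
            coeffs[j] += coeffs[j - 1]
    return coeffs


d, degrees = 2, [0, 1]
print(series_coefficients(d, degrees, 6))
print([hilbert_from_decomposition(k, d, degrees) for k in range(6)])
```

For this toy input the numerator is $1 + t$, and both lines list the values $2k + 1$.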

Let us mention that more general decompositions allow us to treat quite general
classes of rings in a similar way, as is described in Baclawski and Garsia
(1981). Computational aspects of this framework are treated in Sturmfels and
White (1991).


The straightening law 


A common structural feature of many of the important coordinate rings of “clas- 
sical varieties” (and all those listed above) is that they have the structure of an 
“algebra with straightening law” over a finite poset — such that the combinatorics 
of the poset allows us to construct decompositions of the algebra A. A reasonable 
level of generality is given by the following definition. It describes what is called 
an ordinal Hodge algebra by De Concini et al. (1982), and an algebra with strongly 
lexicographic straightening law based on a poset by Baclawski (1981). See also De 
Concini and Procesi (1981) and Eisenbud (1980) for introductions, and De Concini 
and Lakshmibai (1981) for a more general set-up. 


Definition 5.1. Let $A$ be a $k$-algebra and $P = \{x_1, \ldots, x_m\}$ a finite poset. Then $A$
is an algebra with straightening law over $P$ if $A$ has a presentation of the form
$A = k[x_1, \ldots, x_m]/I$ (identifying the elements of $P$ with generators of $A$) such that

(1) the products of variables $x_{i_1} x_{i_2} \cdots x_{i_l}$ that correspond to multichains
$x_{i_1} \le \cdots \le x_{i_l}$ of $P$, called standard monomials, form a $k$-basis of $A$, and

(2) the non-standard monomials can be straightened: if for two incomparable
elements $x_i$ and $x_j$ of $P$ the monomial $x_i x_j$ is written as a linear combination of
standard monomials, then every standard monomial in the expansion contains a
variable that is smaller than $x_i$ and a variable that is smaller than $x_j$.


For every poset $P = \{x_1, \ldots, x_m\}$ we get a canonical algebra with straightening
law, called the Stanley-Reisner ring of $P$, as

$$k[P] := k[x_1, \ldots, x_m]/I_P,$$

where $I_P$ is the ideal generated by the non-standard monomials $x_i x_j$, corresponding
to incomparable pairs in $P$. In this ring the non-standard monomials are all zero,
so that the straightening law is trivial.

If $A$ is an algebra with straightening law over $P$, then $A$ and $k[P]$ are isomorphic
as $k$-vector spaces. However, $k[P]$ is endowed with a rather trivial multiplication:
the product of two standard monomials is either again standard or it is
zero (depending on whether the corresponding union of two multichains is again
a multichain or not).
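
This multiplication rule is simple enough to spell out in a few lines. In the Python sketch below (an illustration only; the representation of monomials as `Counter` objects and the choice of poset are ours), the poset consists of the 2-element subsets of $\{1, 2, 3, 4\}$, written as increasing pairs and ordered componentwise. A product is returned as the union of the two multichains, or as `None` when the union is not a multichain and the product is therefore zero in $k[P]$.

```python
from collections import Counter
from itertools import combinations


def leq(x, y):
    """Componentwise order on increasing index tuples (the poset P of this example)."""
    return all(a <= b for a, b in zip(x, y))


def is_multichain(monomial):
    """A monomial (Counter of poset elements) is standard iff its support is a chain."""
    return all(leq(x, y) or leq(y, x) for x, y in combinations(monomial, 2))


def multiply(m1, m2):
    """Product of two standard monomials in k[P]: their union, or None if it is zero."""
    union = m1 + m2
    return union if is_multichain(union) else None


# (1, 4) and (2, 3) are incomparable, so their product vanishes in k[P].
print(multiply(Counter({(1, 4): 1}), Counter({(2, 3): 1})))
# (1, 3) <= (2, 4), so this product is again a standard monomial.
print(multiply(Counter({(1, 3): 2}), Counter({(2, 4): 1})))
```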

Thus the whole algebra structure of $k[P]$ is determined by the combinatorial
data of $P$. This is the case which allows a complete combinatorial analysis. The
main result now reduces the general case of an algebra with straightening law to
the corresponding Stanley-Reisner ring.


Theorem 5.2 (Baclawski and Garsia 1981; De Concini et al. 1982). Let $A$ be an
algebra with straightening law over $P$, and $k[P]$ the corresponding Stanley-Reisner
ring. Then every Hironaka decomposition $\bigoplus_{i=1}^{t} \eta_i\, k[\theta_1, \ldots, \theta_d]$ of $k[P]$ induces a
Hironaka decomposition of $A$ via the canonical vector space isomorphism $A \cong k[P]$.


In particular, if the Stanley-Reisner ring $k[P]$ is Cohen-Macaulay, then $A$ is also
Cohen-Macaulay.


Thus in order to prove that an algebra $A$ with straightening law over a poset
$P$ is Cohen-Macaulay, it suffices to establish that the "combinatorial" algebra
$k[P]$ is Cohen-Macaulay. If parameters $\theta_i$ and separators $\eta_j$ for $k[P]$ have been
constructed, then the same parameters and separators also yield a Hironaka
decomposition of $A$.

The work by De Concini et al. contains a careful algebraic analysis of the
transition from $A$ to $k[P]$. It shows in particular that in fact both algebras have the
same dimension, whereas only $\operatorname{depth}(A) \ge \operatorname{depth}(k[P])$ holds in general.


Cohen—Macaulay posets 


At this stage of the argument, combinatorial methods take over to decide whether
$k[P]$ is Cohen-Macaulay, and possibly to construct decompositions.


Definition 5.3. $P$ is a Cohen-Macaulay poset (with respect to $k$) if $k[P]$ is a
Cohen-Macaulay algebra.


Cohen-Macaulay posets were introduced independently by Baclawski and Stanley,
around 1975. Subsequent intensive research has produced a wealth of powerful
results and techniques. For surveys, we refer to Björner et al. (1982) and to
[...] "Topological Methods" chapter 34 by Björner, where Cohen-Macaulay posets
[...] some detail (emphasizing the topological approach). Historical de[...]
facts needed in the following are also discussed there. We will here
[...] facts.

[...] very strong conditions on the combinatorics of a poset to allow
[...] Cohen-Macaulay property. In fact, $P$ has to be ranked (of rank $r(P) =
r = \dim k[P]$): this means that every maximal chain $x_1 < x_2 < \cdots < x_r$ in $P$ has
the same length $r - 1$. Given such a maximal chain, we will write $r(x_i) = i$ for
$1 \le i \le r$. This defines the rank $r(x)$ uniquely for every $x \in P$. A rank selection of
$P$ is a subposet $P_S := \{x \in P : r(x) \in S\}$ for some $S \subseteq [r]$. Also, denote by $\hat{P}$ the
poset obtained by adding a new maximal element $\hat{1}$ and a new minimal element $\hat{0}$
to $P$.


Proposition 5.4. If $P$ is a Cohen-Macaulay poset, then so is every rank selection of
every interval of $P$.


Now a strong numerical test has to be satisfied by $P$ (and all its rank selections
of intervals). For this, say that the Möbius function (see chapter 21, and Stanley
1986) alternates on $P$ if for any $x \le y$ in $P$ one gets $\mu_P(x, y) \cdot (-1)^{r(y) - r(x)} \ge 0$.


Proposition 5.5. If $P$ is Cohen-Macaulay, then the Möbius function alternates on
$P$.


Secondly, there is a complete characterization of Cohen-Macaulay posets in
terms of the topology of the simplicial complex $\Delta(P)$ of chains in $P$, given in
section 10.8 of chapter 34. In particular, Cohen-Macaulayness is a topological
invariant of $|\Delta(P)|$.

And finally, the techniques of shellability and lexicographical shellability allow
us to explicitly prove Cohen-Macaulayness and to construct Hironaka
decompositions for all major classes of Cohen-Macaulay posets. We will use the following
formulation of shellability.


Definition 5.6. A poset $P$ is shellable if it is ranked and its set $\mathcal{M}$ of maximal chains
admits a linear ordering $\mathcal{M} = (C_1, C_2, \ldots, C_t)$ such that every chain $C_i$ contains a
unique minimal subset $F_i$ that is not contained in any previous chain $C_j$ ($j < i$).


The definition immediately implies that for the first chain $C_1$ one gets $F_1 = \emptyset$,
whereas every chain $C_i$ with $i > 1$ satisfies $F_i \ne \emptyset$ and contains at most one "new"
vertex. It is not quite trivial to see that the Möbius function alternates on $P$.
However, the following result implies this.
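
Definition 5.6 can be tested mechanically on small posets. The Python sketch below is a brute-force illustration (the function name `restriction_sets` and the example are ours): for each chain in a given order it enumerates the subsets not contained in any earlier chain and checks that exactly one of them is minimal. The example uses the two maximal chains of the poset of 2-element subsets of $\{1, 2, 3, 4\}$ under the componentwise order.

```python
from itertools import combinations


def restriction_sets(chains):
    """Return the unique minimal new subset F_i of each chain C_i, in order.

    Raises ValueError if some chain has no unique minimal new subset, i.e. if
    the given order is not a shelling in the sense of Definition 5.6.
    """
    result = []
    for i, chain in enumerate(chains):
        earlier = [set(c) for c in chains[:i]]
        # Subsets of C_i that are not contained in any earlier chain.
        new_faces = [
            frozenset(face)
            for size in range(len(chain) + 1)
            for face in combinations(chain, size)
            if not any(set(face) <= c for c in earlier)
        ]
        minimal = [f for f in new_faces if not any(g < f for g in new_faces)]
        if len(minimal) != 1:
            raise ValueError(f"chain {i + 1} has {len(minimal)} minimal new subsets")
        result.append(set(minimal[0]))
    return result


chains = [
    ("12", "13", "14", "24", "34"),
    ("12", "13", "23", "24", "34"),
]
print(restriction_sets(chains))
```

The enumeration is exponential in the chain length, so this is only meant for checking small examples by hand.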


Theorem 5.7 (Hochster 1972; Stanley 1975; Garsia 1980; Kind and Kleinschmidt
1979). If $P$ is shellable, then it is Cohen-Macaulay with respect to every field $k$.
In this case we can derive parameters $\theta_i := \sum_{r(x) = i} x$ for $1 \le i \le r$ and separators
$\eta_j := \prod_{x \in F_j} x$ for $1 \le j \le t$ from a shelling of $P$ as above.


This yields, for any shellable poset $P$, an explicit Hironaka decomposition of
$k[P]$ and thus (with Theorem 5.2) of any algebra with straightening law over $P$.

For shellability techniques we again refer to chapter 34 and the references given
there. It turns out that very general classes of posets can be shown to be shellable,
see section 11.10 in chapter 34. A very powerful result that covers the posets arising
from many classical varieties was achieved by Björner and Wachs.


Theorem 5.8 (Björner and Wachs 1982). The Bruhat order of any finite quotient of
a Coxeter group by a parabolic subgroup is (lexicographically) shellable.


Example: Grassmann varieties


We will illustrate this approach by the most "classical" case of the coordinate rings
of Grassmann varieties.

For this let $k$ be a field of characteristic zero, and consider the $p$th exterior
product of $k^n$, denoted $\bigwedge_p k^n$. It is a $k$-vector space of dimension $\binom{n}{p}$, with an explicit
basis given by $\{e_{i_1} \wedge \cdots \wedge e_{i_p} : 1 \le i_1 < \cdots < i_p \le n\}$. The corresponding coordinate
functions for $\bigwedge_p k^n$ are denoted by $[i_1 \cdots i_p]$. Thus a general (antisymmetric) tensor
in $\bigwedge_p k^n$ has the form $\omega = \sum_{1 \le i_1 < \cdots < i_p \le n} [i_1 \cdots i_p]\, e_{i_1} \wedge \cdots \wedge e_{i_p}$.

An extensor is a non-zero, decomposable tensor in $\bigwedge_p k^n$, that is, a tensor of
the form $\omega = v_1 \wedge \cdots \wedge v_p$ for $p$ linearly independent vectors $v_i \in k^n$. Two $p$-tuples
$(v_1, \ldots, v_p)$ and $(v_1', \ldots, v_p')$ determine up to a scalar the same extensor (that is,
$v_1 \wedge \cdots \wedge v_p = c \cdot v_1' \wedge \cdots \wedge v_p'$ for some $c \ne 0$) if and only if they span the same
$p$-dimensional subspace of $k^n$.

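In coordinates, for $v_i = \sum_{j=1}^{n} v_{ij} e_j$, one gets

$$v_1 \wedge \cdots \wedge v_p = \sum_{1 \le i_1 < \cdots < i_p \le n} [i_1 \cdots i_p]\, e_{i_1} \wedge \cdots \wedge e_{i_p},
\qquad \text{where } [i_1 \cdots i_p] = \det \begin{pmatrix} v_{1 i_1} & v_{1 i_2} & \cdots & v_{1 i_p} \\ v_{2 i_1} & v_{2 i_2} & \cdots & v_{2 i_p} \\ \vdots & \vdots & & \vdots \\ v_{p i_1} & v_{p i_2} & \cdots & v_{p i_p} \end{pmatrix}.$$

The vector of $\binom{n}{p}$ coordinates $([i_1 \cdots i_p] : 1 \le i_1 < \cdots < i_p \le n)$ is called a vector of
Plücker coordinates for $V := \operatorname{span}(v_1, \ldots, v_p)$.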
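
For small cases the Plücker coordinates are easy to compute by hand or by machine. The following Python sketch is an illustration with integer entries; the helper names `det` and `pluecker` and the sample matrices are our own choices. It computes all maximal minors of a $2 \times 4$ matrix and shows that replacing the rows by another basis of the same row space only rescales the whole vector by the determinant of the change of basis.

```python
from itertools import combinations, permutations


def det(m):
    """Determinant by Leibniz expansion (adequate for the tiny matrices used here)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def pluecker(rows, p, n):
    """Map each increasing index tuple (1-based) to the corresponding p x p minor."""
    return {
        idx: det([[rows[a][j - 1] for j in idx] for a in range(p)])
        for idx in combinations(range(1, n + 1), p)
    }


rows = [[1, 0, 2, 3], [0, 1, 4, 5]]
# Another basis of the same row space: (v1 + v2, 2 * v2); the base change has determinant 2.
other = [[1, 1, 6, 8], [0, 2, 8, 10]]
coords = pluecker(rows, 2, 4)
coords_other = pluecker(other, 2, 4)
print(coords)
print(coords_other)
```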
Passing to the projective space of $\bigwedge_p k^n$, we find that the set

$$G_{p,n} = \{[v_1 \wedge \cdots \wedge v_p] \in P(\textstyle\bigwedge_p k^n)\}$$

is in natural bijection to the set of $p$-subspaces of $k^n$. As the image of a
homogeneous polynomial map (which sends the matrix $(v_{ij})$ to the vector $([i_1 \cdots i_p])$ of
all its maximal minors) this $G_{p,n}$ is an irreducible projective variety, called the
Grassmann variety of $p$-subspaces in $k^n$. This variety turns out to have dimension
$p(n - p)$, in an ambient space of dimension $\binom{n}{p} - 1$. Note that $G_{1,n} = P(k^n)$ is
projective space.

The coordinate ring of the projective variety $G_{p,n}$ is

$$A(G_{p,n}) = k\bigl[\,[i_1 \cdots i_p] : 1 \le i_1 < \cdots < i_p \le n\,\bigr] / I_{p,n},$$

where $I_{p,n}$ is the ideal of relations between the $(p \times p)$-minors of a generic $(p \times n)$-matrix,
that is, between the variables $[i_1 \cdots i_p]$ if they are given by

$$[i_1 \cdots i_p] = \det \begin{pmatrix} v_{1 i_1} & \cdots & v_{1 i_p} \\ \vdots & & \vdots \\ v_{p i_1} & \cdots & v_{p i_p} \end{pmatrix}.$$

Monomials of degree $l$ in $A(G_{p,n})$ are customarily denoted by $(l \times p)$-arrays, called
tableaux, whose rows denote the names of the variables. So $[i_1 \cdots i_p] \cdot [j_1 \cdots j_p]$ will
be denoted as $\begin{bmatrix} i_1 \cdots i_p \\ j_1 \cdots j_p \end{bmatrix}$, etc.


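The basic theorem now is that the ideal $I_{p,n}$ is generated by the straightening
syzygies:

$$\begin{bmatrix} i_1 \cdots i_p \\ j_1 \cdots j_p \end{bmatrix} = -\sum_{\sigma} \operatorname{sign}(\sigma) \begin{bmatrix} i_1 \cdots i_{l-1}\, \sigma(i_l) \cdots \sigma(i_p) \\ \sigma(j_1) \cdots \sigma(j_l)\, j_{l+1} \cdots j_p \end{bmatrix} \qquad (*)$$

where $l$ is an arbitrary fixed position ($1 \le l \le p$), and the sum is over all shuffles
$\sigma$ of $(j_1 \cdots j_l \mid i_l \cdots i_p)$ except for the identity, that is, over all permutations $\sigma \ne \mathrm{id}$
of the (multi)set $\{j_1, \ldots, j_l, i_l, \ldots, i_p\}$ that satisfy $\sigma(j_1) < \cdots < \sigma(j_l)$ and
$\sigma(i_l) < \cdots < \sigma(i_p)$. Here we define $[i_{\sigma(1)} \cdots i_{\sigma(p)}] := \operatorname{sign}(\sigma)[i_1 \cdots i_p]$ for $p$-tuples that are
not increasing, and $[i_1 \cdots i_p] = 0$ whenever two entries $i_j$ are equal.

[In fact, the ideals $I_{p,n}$ are already generated by the Grassmann-Plücker relations
that are obtained from (*) in the special case $l = 1$. However, these are not
sufficient for straightening; see Sturmfels and White (1989).]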

We want to conclude that $A(G_{p,n})$ is an algebra with straightening law. For this
we define a partial order on the set of variables, by putting $[i_1 \cdots i_p] \le [j_1 \cdots j_p]$
whenever $i_1 \le j_1, \ldots, i_p \le j_p$. This makes

$$\Lambda_{p,n} = \bigl(\{[i_1 \cdots i_p] : 1 \le i_1 < \cdots < i_p \le n\}, \le\bigr)$$

into a partial order; in fact, $\Lambda_{p,n}$ turns out to be a distributive lattice. In particular,
$\Lambda_{p,n}$ is ranked: its rank function is given by $r([i_1 \cdots i_p]) = 1 + (i_1 - 1) + \cdots + (i_p - p)$,
and $\Lambda_{p,n}$ has length $r([n - p + 1, \ldots, n]) = 1 + p(n - p)$.
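
The rank function can be verified on a small instance. The Python sketch below (illustrative only; the function names are ours) builds the poset of increasing $p$-tuples under the componentwise order for $p = 3$, $n = 5$, computes its cover relations by brute force, and checks that every cover raises the stated rank by exactly one and that the top element has rank $1 + p(n - p)$.

```python
from itertools import combinations


def leq(x, y):
    """Componentwise order on increasing index tuples."""
    return all(a <= b for a, b in zip(x, y))


def strictly_less(x, y):
    return x != y and leq(x, y)


def rank(x):
    """r([i_1 ... i_p]) = 1 + (i_1 - 1) + ... + (i_p - p)."""
    return 1 + sum(i - j for j, i in enumerate(x, start=1))


def covers(elements):
    """All pairs (x, y) with x < y and no element strictly between them."""
    return [
        (x, y)
        for x in elements
        for y in elements
        if strictly_less(x, y)
        and not any(strictly_less(x, z) and strictly_less(z, y) for z in elements)
    ]


p, n = 3, 5
elements = list(combinations(range(1, n + 1), p))
cover_pairs = covers(elements)
print(len(elements), "elements,", len(cover_pairs), "cover relations")
print("top rank:", rank(elements[-1]), "expected:", 1 + p * (n - p))
print("ranked:", all(rank(y) == rank(x) + 1 for x, y in cover_pairs))
```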

Furthermore, $\Lambda_{p,n}$ is shellable: shellings for distributive lattices are easily
obtained by a lexicographic technique; also (and this is the argument that
generalizes for other classical varieties) $\Lambda_{p,n}$ is the quotient of (the Bruhat order of)
the Coxeter group $\mathfrak{S}_n$ by its maximal parabolic subgroup $\mathfrak{S}_p \times \mathfrak{S}_{n-p}$ and hence
shellable as a special case of the Björner-Wachs theorem 5.8.

Now reconsider the tableau corresponding to a monomial in $A(G_{p,n})$. The rows
of such tableaux are variable names and thus strictly increasing, and a tableau is
standard (that is, corresponds to a standard monomial) exactly if the columns are
all non-decreasing (if read from top to bottom).

With the facts we have collected now, it is not hard to verify that $A(G_{p,n})$ is an
algebra with straightening law over $\Lambda_{p,n}$: in fact the straightening syzygies
(*), iterated, show that the standard monomials span $A(G_{p,n})$, and they also provide
the straightening law for $A(G_{p,n})$. For this consider any monomial of degree 2
that is not standard, that is, a tableau $\begin{bmatrix} i_1 \cdots i_p \\ j_1 \cdots j_p \end{bmatrix}$ for which $i_l > j_l$ for some $l$. In
this situation one has $j_1 < \cdots < j_l < i_l < \cdots < i_p$, and the formula (*) expresses
the tableau as a linear combination of tableaux whose top rows are all smaller
than $[i_1 \cdots i_p]$. Iterating this procedure, we get an expression of $\begin{bmatrix} i_1 \cdots i_p \\ j_1 \cdots j_p \end{bmatrix}$ as a linear
combination of standard tableaux whose top rows are smaller than $[i_1 \cdots i_p]$. (This
shows that the standard monomials span $A(G_{p,n})$.) But we could have applied the
same procedure to $\begin{bmatrix} j_1 \cdots j_p \\ i_1 \cdots i_p \end{bmatrix}$ as well, to get the same expression again (because the
standard monomials form a basis!). Thus the top rows of all the standard tableaux
in the expansion of $\begin{bmatrix} i_1 \cdots i_p \\ j_1 \cdots j_p \end{bmatrix}$ are also smaller than $[j_1 \cdots j_p]$, which shows property
(2) of an algebra with straightening law.


Thus $G_{p,n}$ is a Cohen-Macaulay variety, much of whose structure is (in a subtle
way) controlled by the poset $\Lambda_{p,n}$.


For an explicit example, consider the Grassmann variety $G_{3,5}$. Its coordinate
ring is

$$A(G_{3,5}) = k\bigl[[123], [124], [125], [134], [135], [145], [234], [235], [245], [345]\bigr] / I_{3,5},$$

where the ideal $I_{3,5}$ is generated by Grassmann-Plücker relations like

$$\begin{bmatrix} 145 \\ 234 \end{bmatrix} - \begin{bmatrix} 134 \\ 245 \end{bmatrix} + \begin{bmatrix} 124 \\ 345 \end{bmatrix} = 0.$$

Now $A(G_{3,5})$ is an algebra with straightening law over the poset $\Lambda_{3,5}$:

              [345]
                |
              [245]
             /     \
         [145]     [235]
             \     /    \
              [135]     [234]
             /     \    /
         [125]     [134]
             \     /
              [124]
                |
              [123]


It is easy to get a (lexicographic) shelling: take


C_1 = ([123] < [124] < [125] < [135] < [145] < [245] < [345])
C_2 = ([123] < [124] < [125] < [135] < [235] < [245] < [345])
C_3 = ([123] < [124] < [134] < [135] < [145] < [245] < [345])
C_4 = ([123] < [124] < [134] < [135] < [235] < [245] < [345])
C_5 = ([123] < [124] < [134] < [234] < [235] < [245] < [345])


This yields separators $\eta_1 = 1$, $\eta_2 = [235]$, $\eta_3 = [134]$, $\eta_4 = [134][235]$, $\eta_5 = [234]$,
and parameters $\theta_1 = [123]$, $\theta_2 = [124]$, $\theta_3 = [125] + [134]$, $\theta_4 = [135] + [234]$,
$\theta_5 = [145] + [235]$, $\theta_6 = [245]$ and $\theta_7 = [345]$, which describes a Hironaka decomposition
of the Cohen-Macaulay algebra $A(G_{3,5})$, with $d = 7$ and $t = 5$:

$$\begin{aligned}
A(G_{3,5}) = {} & k\bigl[[123], [124], [125] + [134], [135] + [234], [145] + [235], [245], [345]\bigr] \\
\oplus {} & [235]\, k\bigl[[123], [124], [125] + [134], [135] + [234], [145] + [235], [245], [345]\bigr] \\
\oplus {} & [134]\, k\bigl[[123], [124], [125] + [134], [135] + [234], [145] + [235], [245], [345]\bigr] \\
\oplus {} & [134][235]\, k\bigl[[123], [124], [125] + [134], [135] + [234], [145] + [235], [245], [345]\bigr] \\
\oplus {} & [234]\, k\bigl[[123], [124], [125] + [134], [135] + [234], [145] + [235], [245], [345]\bigr].
\end{aligned}$$
From this we can read off the Poincaré series as

$$\mathrm{Poin}(A(G_{3,5}), t) = \frac{1 + 3t + t^2}{(1 - t)^7},$$

and the Hilbert function as

$$H(A(G_{3,5}), k) = \binom{k+6}{6} + 3\binom{k+5}{6} + \binom{k+4}{6}.$$
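
Since the standard monomials of degree $k$ correspond to the multichains of size $k$ in the poset, this Hilbert function can be cross-checked by counting multichains directly. The Python sketch below does so for small $k$; it is an illustration only, and the names `count_multichains` and `hilbert_formula` are ours.

```python
from itertools import combinations
from math import comb

# The ten brackets [i1 i2 i3] with 1 <= i1 < i2 < i3 <= 5.
ELEMENTS = list(combinations(range(1, 6), 3))


def leq(x, y):
    """Componentwise order on brackets."""
    return all(a <= b for a, b in zip(x, y))


def count_multichains(k):
    """Number of multichains x_1 <= ... <= x_k, i.e. standard monomials of degree k."""
    if k == 0:
        return 1
    ways = {x: 1 for x in ELEMENTS}  # multichains of size 1 ending at x
    for _ in range(k - 1):
        ways = {y: sum(w for x, w in ways.items() if leq(x, y)) for y in ELEMENTS}
    return sum(ways.values())


def hilbert_formula(k):
    """The closed formula read off from the Hironaka decomposition."""
    return comb(k + 6, 6) + 3 * comb(k + 5, 6) + comb(k + 4, 6)


for k in range(5):
    print(k, count_multichains(k), hilbert_formula(k))
```

In degree 2, for instance, both counts give 50, which matches the 55 monomials in ten variables minus five independent quadratic relations.
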
6. Complex hyperplane arrangements 


Combinatorial properties of arrangements of hyperplanes in real linear space have
been studied in geometry for quite a while. Several properties of such arrangements 
are described in chapter 17. A more recent development is the application of 
combinatorial techniques to the theory of complex hyperplane arrangements. 

In the course of investigation it has turned out that deep structural properties of 
arrangements are controlled by their matroids, and combinatorial methods allow 
us not only to compute important invariants (such as the cohomology algebra of 
the complement, and the homotopy type of the link) but also identify correctly 
extreme cases with special structure (supersolvable and 3-generic arrangements). 

In the following sketch we focus on the links and connections between combina- 
torial data (hyperplane arrangements as linearly represented matroids), algebraic 
structure (hyperplane arrangements as singular varieties) and topological aspects 
(complements of arrangements as complex manifolds). 

We refer to the lecture notes by Orlik (1989) and the book by Orlik and Terao 
(1992) on arrangements for further details, expositions and extensive bibliogra- 
phies. 

For this section we restrict our attention to the following scenario (although 
greater generality is possible and natural for many aspects). 


Definition 6.1. A hyperplane arrangement is a finite set $X = \{H_1, \ldots, H_m\}$ of
hyperplanes (vector subspaces of codimension 1) in a complex vector space $V = \mathbb{C}^n$,
with $\bigcap_{i=1}^{m} H_i = \{0\}$.

This definition describes complex arrangements (it is also interesting to study
arrangements over other fields, the case of arrangements in real vector spaces
being the best studied one). The arrangements considered here are linear, because
all hyperplanes are required to contain the origin, and they are essential: the last
condition requires that the dimension $n = \dim(V)$ coincides with the rank
$r = \operatorname{codim}_V\bigl(\bigcap_{i=1}^{m} H_i\bigr) = n - \dim(H_1 \cap \cdots \cap H_m)$ of the arrangement $X$. These last two
conditions (linear and essential) are not very restrictive: usually questions are
easily reduced to this case. There are, in fact, standard constructions that transform
from affine or projective arrangements to linear ones, and every arrangement has
a canonical essential arrangement associated with it.


Choosing coordinates $x_1, \ldots, x_n \in V^*$ for $V$, we write $S := \mathbb{C}[x_1, \ldots, x_n]$ for the
ring of polynomial functions on $V$. For every hyperplane $H \in X$ we can then
choose a linear form $\ell_H \in V^*$ whose kernel is $H$, which may also be seen as a
linear function $\ell_H \in S$ that defines $H = \{x \in V : \ell_H(x) = 0\}$.

The product $Q := \prod_{H \in X} \ell_H \in S$ is thus interpreted as a defining equation for the
arrangement $X$: it defines the hypersurface obtained by the union of the
hyperplanes in $X$. Note that both $Q$ and the $\ell_H$ are unique up to a non-zero complex
factor.

We will in the following sketch structures associated with hyperplane arrange- 
ments by various branches of mathematics. The common questions will be: 

- How much of the structure of a hyperplane arrangement is encoded in and
  determined by its combinatorics?

- Do the combinatorial data allow simple construction or computation of the
  relevant information?

- Do the combinatorial invariants identify interesting special cases with additional
  structure?

In the light of these questions, the following three sections will first consider the
combinatorics associated with an arrangement, then the topological properties, and
finally algebraic structures.


The matroid of an arrangement 


The combinatorial structure associated with a hyperplane arrangement is that of
a matroid (cf. chapter 9). [This contrasts with the case of real arrangements, where
much more information is encoded in the associated oriented matroid (cf. chapter
9, section 15, chapter 34, section 7, and Björner et al. 1993).]


Definition 6.2. The matroid $M = M_X$ of the arrangement $X$ is given by linear
independence on the set $\{\ell_H : H \in X\}$ of vectors in $V^*$. A property of arrangements
is combinatorial if it is determined by the matroids of the arrangements alone. $M$
is a matroid of rank $r(M) = n$ on a ground set of $m = |X|$ elements, which we
identify with $X$. Important invariants for the following are its lattice of flats, which
is isomorphic to the intersection lattice

$$L = \Bigl\{ \bigcap Y : Y \subseteq X \Bigr\}$$

(ordered by reverse inclusion), its characteristic polynomial

$$\chi(t) = \sum_{x \in L} \mu(\hat{0}, x)\, t^{\dim(x)} = \sum_{i=0}^{n} w_i (-1)^i t^{n-i},$$

and the broken circuit complex $BC(M) \subseteq 2^{\{1,2,\ldots,m\}}$, a simplicial complex with
exactly $w_i$ faces of cardinality $i$, which implies $w_i > 0$ for $0 \le i \le n$.
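As a hedged illustration of how these invariants can be computed from the linear forms alone, the sketch below uses Whitney's subset expansion $\chi(t) = \sum_{Y \subseteq X} (-1)^{|Y|} t^{\,n - r(Y)}$, which agrees with the Möbius function definition above but is a swapped-in technique, not the construction used in the text. The function names and the encoding of each form as a tuple of rational coefficients are assumptions made for this example, and the enumeration of all subsets is only practical for small arrangements.

```python
from fractions import Fraction
from itertools import combinations


def rank(vectors):
    """Rank of a list of rational vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    if not rows:
        return 0
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def characteristic_polynomial(forms, n):
    """Coefficients of chi(t); entry k is the coefficient of t^k.

    `forms` lists one defining linear form per hyperplane (hypothetical
    encoding: a tuple of n rational coefficients).
    """
    coeffs = [0] * (n + 1)
    for size in range(len(forms) + 1):
        for subset in combinations(forms, size):
            coeffs[n - rank(subset)] += (-1) ** size
    return coeffs


def whitney_numbers(forms, n):
    """The numbers w_0, ..., w_n read off from chi(t)."""
    coeffs = characteristic_polynomial(forms, n)
    return [abs(coeffs[n - i]) for i in range(n + 1)]
```

For the three lines $x = 0$, $y = 0$, $x = y$ in $\mathbb{C}^2$, encoded as `[(1, 0), (0, 1), (1, -1)]`, this yields $\chi(t) = t^2 - 3t + 2$ and $(w_0, w_1, w_2) = (1, 3, 2)$.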


We will now introduce three classes of hyperplane arrangements that are distin- 
guished not only by combinatorial conditions, but also by algebraic and topological 
ones, as we will soon see. 

- A hyperplane arrangement is supersolvable if its matroid is supersolvable [in
  the sense of Stanley (1972), that is, if its intersection lattice contains a maximal
  chain of modular elements].
  A key fact is the factorization $\chi(t) = \prod_{i=1}^{n} (t - e_i)$ for positive integers $e_i$ in this
  case. This can be explained by a factorization of the broken circuit complex
  which characterizes supersolvability, according to Björner and Ziegler (1991).
  Supersolvable arrangements form a very interesting class of highly structured
  arrangements. For example, if $G$ is a graph on $\{1, \ldots, n\}$, then the arrangement
  of hyperplanes $x_i = x_j$ (for $ij \in E(G)$) is supersolvable if and only if the graph
  is chordal.

- An arrangement is generic if arbitrarily small perturbations do not change the
  combinatorial structure. This is equivalent to the property that the matroid $M$
  is uniform, such that $M \cong U_{n,m}$.
  A weaker condition is that no hyperplane of the arrangement contains the
  intersection of two other hyperplanes. Equivalently, we may require that $M$
  contains no 3-circuits, which defines 3-generic arrangements.
  The 3-generic arrangements are a large class of arrangements with very little
  (combinatorial) structure.

- A third class of “special” (although not combinatorially defined) complex
  arrangements arises by complexifying simplicial real arrangements (that is, by field
  extension for arrangements that subdivide $\mathbb{R}^n$ into simplicial cones). Such
  arrangements arise, for example, from the actions of finite Coxeter groups (groups
  generated by reflections), when one considers the arrangement of all
  hyperplanes of reflections in the group.
  Complexified real arrangements can be treated in terms of the combinatorial
  data given by the real hyperplane arrangements. This is the reason why they
  are much better understood than general complex ones.
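The chordal graph example in the first class can be made concrete. A graph is chordal exactly when its vertices can be removed one at a time so that each removed vertex has a clique as its remaining neighbourhood, and the chromatic polynomial then factors as $\prod_i (t - e_i)$, where $e_i$ counts those remaining neighbours. The chromatic polynomial agrees with the characteristic polynomial of the graph's arrangement up to a power of $t$; the arrangement of a graph is not essential, which is why the exponent 0 appears. The sketch below is an illustration under these assumptions; the function name is hypothetical.

```python
def elimination_exponents(n, edges):
    """Exponents e_i of a chordal graph on {1, ..., n}, or None if not chordal.

    Repeatedly removes a simplicial vertex (one whose current neighbours are
    pairwise adjacent) and records its number of neighbours at removal time.
    """
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    exponents = []
    while adj:
        simplicial = next(
            (v for v, nb in adj.items()
             if all(b in adj[a] for a in nb for b in nb if a < b)),
            None,
        )
        if simplicial is None:
            return None  # no simplicial vertex left: the graph is not chordal
        exponents.append(len(adj[simplicial]))
        for u in adj.pop(simplicial):
            adj[u].discard(simplicial)
    return sorted(exponents)
```

For the triangle this returns `[0, 1, 2]`, matching the chromatic polynomial $t(t-1)(t-2)$, while the 4-cycle, which is not chordal, gives `None`.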


Topology of the complement 


Consider the hyperplane arrangement $X$ as a complex hypersurface (of real
codimension 2!) in $V$. Then $T := V \setminus X$, called the complement of the arrangement,
is a connected open complex manifold with interesting cohomology and homotopy
properties. Closely related to this, one studies the link $D := X \cap S^{2n-1}$, the
intersection of the arrangement with the unit sphere in $\mathbb{C}^n$.

Note that for the real analogue (a linear arrangement in $\mathbb{R}^n$), this complement
is a union of disjoint open convex cones, and so its topological structure is entirely
determined by the number of its components, which is described by Zaslavsky’s
theorem (see chapter 17). The link has the homotopy type of a wedge of $(n - 2)$-spheres.
In contrast to this, no complete description of (say, the homotopy type
of) the space $T$ is known for the complex case. However, one can describe the
homotopy type of the link in this case, see below.

In the following, we will present a complete combinatorial construction of the
cohomology algebra of $T$, and some interesting partial results for the homotopy
structure.

The construction for the cohomology algebra can be sketched as follows. Let
$E$ be a free $\mathbb{Z}$-module with basis $\{e_1, \ldots, e_m\}$ in bijection with $X$. Let $\Lambda E$ be the
exterior algebra over $E$, with basis $\{e_K : K \subseteq [m]\}$. Denote by $I$ the ideal of $\Lambda E$
which is generated by all the elements of the form $\partial(e_C)$, where $C$ is a circuit of
$M$. Here the boundary map $\partial$ is defined by linear extension of

$$\partial(e_K) = \sum_{j=1}^{p} (-1)^{j-1} e_{K \setminus \{i_j\}}, \qquad \text{for } K = \{i_1, i_2, \ldots, i_p\} \text{ with } i_1 < i_2 < \cdots < i_p.$$

Now define the Orlik-Solomon algebra of $X$ as $A(M) := \Lambda E / I$. This algebra is
combinatorial, since it is constructed from matroid data only. It provides a model
for the cohomology algebra of $T$ which is combinatorial in the sense of
Definition 6.2.
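The boundary map is easy to experiment with. The sketch below is an illustration only: it stores an element of the exterior algebra as a dictionary from sorted index tuples to integer coefficients, and it assumes the sign convention $(-1)^{j-1}$ for the $j$-th omitted index (the opposite convention generates the same ideal). The function names are hypothetical.

```python
def boundary(K):
    """Boundary of the basis element e_K, as {sorted index tuple: coefficient}."""
    K = tuple(sorted(K))
    # Omitting the j-th element (0-based) carries the sign (-1)^j.
    return {K[:j] + K[j + 1:]: (-1) ** j for j in range(len(K))}


def boundary_of_chain(chain):
    """Linear extension of `boundary` to integer combinations of basis elements."""
    out = {}
    for K, coeff in chain.items():
        for face, sign in boundary(K).items():
            out[face] = out.get(face, 0) + coeff * sign
    return {K: c for K, c in out.items() if c != 0}
```

For three lines through the origin in $\mathbb{C}^2$ the only circuit is $\{1, 2, 3\}$, so the ideal is generated by `boundary((1, 2, 3))`, that is, by $e_2 e_3 - e_1 e_3 + e_1 e_2$. Applying the map twice always gives zero.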


Theorem 6.3 (Orlik and Solomon 1980). Let $X$ be a complex arrangement, $M$ its
matroid and $T$ its complement. Then the cohomology algebra of $T$ (with integer
coefficients) is isomorphic as a graded $\mathbb{Z}$-algebra to the Orlik-Solomon algebra
$A(M)$:

$$H^*(T, \mathbb{Z}) \cong A(M).$$


This fundamental theorem allows a complete analysis of the cohomology of
$T$ with well-developed combinatorial tools. So one finds that the broken circuit
complex of $M$ induces a basis of $A(M)$ and hence of the cohomology algebra
$H^*(T, \mathbb{Z})$. This follows from Björner (1982) and Orlik and Solomon (1980) and
was rediscovered by Jambu and Terao (1989). In particular this proves that the
Betti numbers of $T$ are given by

$$\beta^i(T) = \operatorname{rank} H^i(T, \mathbb{Z}) = w_i,$$

so that the Poincaré polynomial of $H^*(T, \mathbb{Z})$ is $(-t)^n \chi(-1/t)$ (Orlik and Solomon
1980). It can also be used for an elementary proof of Theorem 6.3, see Björner
and Ziegler (1992).
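As a small worked check (an illustration added here, not an example from the source), take the three lines $x = 0$, $y = 0$, $x = y$ in $\mathbb{C}^2$, so that $n = 2$ and $m = 3$. The Möbius function takes the value 1 on $V$, $-1$ on each line and 2 on the origin, which gives:

```latex
\chi(t) = t^2 - 3t + 2 = (t-1)(t-2), \qquad (w_0, w_1, w_2) = (1, 3, 2),
\\[4pt]
\beta^0(T) = 1, \qquad \beta^1(T) = 3, \qquad \beta^2(T) = 2,
\\[4pt]
(-t)^2\,\chi(-1/t) = t^2\left(\frac{1}{t^2} + \frac{3}{t} + 2\right) = 1 + 3t + 2t^2 .
```

The coefficients of the last polynomial are exactly the Betti numbers, as the general statement predicts.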

With observations of this type there is a complete analysis of the cohomology
algebra of $T$ with matroid theory tools. Similarly, the following result shows that
the homotopy type of the link $D$ is combinatorial.

Theorem 6.4 (Björner and Ziegler 1992, Ziegler and Živaljević 1993). The link
$D = X \cap S^{2n-1}$ of every complex arrangement has the homotopy type of a wedge of
spheres, put together from $w_i$ spheres of dimension $2n - 2 - i$, for $i > 0$:

$$D \simeq \bigvee_{A \in BC(M) \setminus \{\emptyset\}} S^{2n-2-|A|}.$$

Here for $n = 2$ the wedge is formed to produce a disjoint union of circles.
Otherwise, the link is connected and the homotopy type of the wedge does not depend
on the choice of wedge points.

The approach of Björner and Ziegler (1992) produces explicit spheres $S_A$ in the
link $D$ that induce the homotopy equivalence. The “diagram method” of Ziegler
and Živaljević (1993) works in a much more general situation. It produces
homotopy formulas for links of arrangements of arbitrary real subspaces, and the above
is just a special application of their method.

Using Alexander duality (see Spanier 1966), one gets from Theorem 6.4 the
linear structure of the cohomology algebra of $T$. However, the multiplicative
structure described by Theorem 6.3 cannot be derived from the homotopy type of the
link: it encodes subtle details about the complex structure of the arrangement, see
Ziegler (1993).

Also, via Spanier-Whitehead duality one gets from Theorem 6.4 that the
complement $T$ has the stable homotopy type of a wedge of spheres (i.e., after a sufficient
number of suspensions it is homotopy equivalent to a wedge of spheres). However,
a complete description of the homotopy type of $T$ seems to be out of reach. In
fact, the following basic problems are open, see Falk and Randell (1985), Salvetti
(1987).

- Is the homotopy type of a complex arrangement combinatorial? That is, do
  arrangements with isomorphic matroids always have homotopy-equivalent
  complements?

- In particular, is the fundamental group $\pi_1(T)$ combinatorial? That is, can one
  give a presentation of $\pi_1(T)$ from the knowledge of $M$ alone?

Without a positive solution to this problem, more data than just the matroid are
necessary to construct the homotopy type of an arrangement. For example, in the
case of complexified real arrangements, one can construct $T$ up to homeomorphism
using the additional data given by the oriented matroid of the real arrangement.
This was shown by Björner and Ziegler (1992), extending an earlier similar result
of Salvetti (1987) for the homotopy type.

In the general complex case, several non-trivial results have been given, and
there is very active research going on. See, for example, Arvola (1992) and Falk
(1993) for recent progress. Much of the attention centers on the first homotopy
group $\pi_1(T)$, together with the question under which conditions $T$ is a $K(\pi, 1)$
space, that is, the higher homotopy groups $\pi_i(T)$ ($i > 1$) vanish.

Crucial parameters are given by the lower central series of $\pi_1(T)$, defined by
$\varphi_j = \operatorname{rank}(G_j / G_{j+1})$, where $G_1 = \pi_1(T)$, and $G_{j+1} = [G_j, G_1]$ is the subgroup of
$G_1$ generated by the commutators $g h g^{-1} h^{-1}$ for elements $g \in G_j$ and $h \in G_1$.

Theorem 6.5. Let $T$ be the complement of a complex arrangement $X$.

(1) (Hattori 1975) If $X$ is generic and $|X| > n > 2$, then $T$ is not a $K(\pi, 1)$ space:
in this case $\pi_i(T) = 0$ for $1 < i < n$, and $\pi_n(T)$ is free Abelian on an infinite number
of generators.

(2) (Deligne 1972) If $X$ is a complexified simplicial arrangement, then $T$ is a
$K(\pi, 1)$ space.

(3) (Terao 1986, Falk and Randell 1985) If $X$ is supersolvable, then $T$ is a $K(\pi, 1)$
space, and its lower central series is given by
$\prod_{j \ge 1} (1 - t^j)^{\varphi_j} = \prod_{i=1}^{n} (1 - e_i t) = t^n \chi(1/t)$.


The combinatorics underlying Deligne’s Theorem 6.5(2) was clarified recently 
by Paris (1993a,b), see also Cordovil (1994) and Salvetti (1993). 


It is safe to say that the homotopy groups of $T$ are in general extremely
complicated objects. We just mention that the formula in part (3) of Theorem 6.5 fails
even for very nice complexified simplicial arrangements.
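For a supersolvable arrangement, the lower central series formula $\prod_{j \ge 1} (1 - t^j)^{\varphi_j} = \prod_{i=1}^{n} (1 - e_i t)$ determines every rank $\varphi_j$ from the exponents $e_i$. Taking logarithms and comparing coefficients of $t^k$ gives $\sum_{d \mid k} d\,\varphi_d = \sum_i e_i^k$, and Möbius inversion over the divisors of $k$ then isolates $\varphi_k$. The sketch below implements this inversion; it is an illustration with hypothetical function names, not a procedure taken from the source.

```python
def mobius(n):
    """Number-theoretic Moebius function of a positive integer n."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n is divisible by a square
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def lcs_ranks(exponents, depth):
    """Ranks phi_1, ..., phi_depth determined by the exponents e_i."""
    ranks = []
    for k in range(1, depth + 1):
        total = sum(
            mobius(k // d) * sum(e ** d for e in exponents)
            for d in range(1, k + 1)
            if k % d == 0
        )
        ranks.append(total // k)  # the sum is k * phi_k
    return ranks
```

With exponents 1 and 2 (three lines through the origin in $\mathbb{C}^2$), `lcs_ranks([1, 2], 4)` gives 3, 1, 2, 3. The first value is the first Betti number, as it must be.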

The homotopy structure of $T$ is, nevertheless, a very promising field of research,
and combinatorial approaches and methods should be helpful to attack some basic
open problems, the most striking ones being the following “$K(\pi, 1)$-problems” (of
which the last is a special case of the problem mentioned before Theorem 6.5):

- Does Hattori’s result (Theorem 6.5(1)) generalize to 3-generic arrangements?
  Or can $T$ be a $K(\pi, 1)$ for some 3-generic arrangement?

- Describe necessary and sufficient conditions for $T$ to be a $K(\pi, 1)$.

- Does the matroid alone determine whether the complement of a complex
  arrangement is a $K(\pi, 1)$?


The module of logarithmic vector fields 


An algebraic structure of interest here is the S-module of algebraic vector fields 
that are tangent to the hyperplanes. Saito’s (1980) investigations in singularity 
theory first suggested the study of these modules of logarithmic vector fields and, 
dually, of logarithmic differential forms at a hypersurface singularity. Specialization 
to the case of hyperplane arrangements led Terao (1980) to his fascinating theory 
of free hyperplane arrangements. 

The following module captures a lot of structure of the arrangement. Its control 
by combinatorial data is quite strong, but not straightforward. 


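Definition 6.6. Let $X$ be a complex arrangement defined by $Q = \prod_{H \in X} \ell_H \in S$.
The $S$-module of logarithmic vector fields at $X$ is the set of derivations

$$\operatorname{Der}(X) = \Bigl\{ \theta = \sum_{i=1}^{n} p_i \frac{\partial}{\partial x_i} : Q \mid \theta(Q) \Bigr\}$$

with the obvious $S$-module structure.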


There is also a very simple, geometric description: Der($X$) is isomorphic to the
module of all polynomial maps $p : \mathbb{C}^n \to \mathbb{C}^n$ that map every hyperplane of $X$ into
itself.

For this, $p$ is considered as an $n$-tuple $p = (p_1, \ldots, p_n) \in S^n$ of $n$-variable
polynomials; the corresponding element in Der($X$) is $\theta_p = \sum_{i=1}^{n} p_i (\partial/\partial x_i)$. To see that
the set of such maps $p$ is an $S$-module, one has to show that $S$-linear combinations
have the same form.

The following “basis criterion” of Saito (1980) identifies the case when Der(X) 
is a free module (that is, has a basis). 


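Lemma 6.7. The logarithmic vector fields $\theta_1, \ldots, \theta_n \in \operatorname{Der}(X)$ with
$\theta_i = \sum_{j=1}^{n} p_{ij} (\partial/\partial x_j)$ form a basis of Der($X$) if and only if $\det(p_{ij}) = cQ$ for a non-zero
constant $c \in \mathbb{C}$.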
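As a worked check of the basis criterion (an illustration added here, not an example from the source), consider the three lines $x = 0$, $y = 0$, $x = y$ in $\mathbb{C}^2$, with $Q = xy(x - y)$. The two vector fields below are one possible choice. Each is tangent to every line, since it maps each defining form into the ideal that form generates, and the determinant of the coefficient matrix is a non-zero multiple of $Q$:

```latex
\theta_1 = x\,\frac{\partial}{\partial x} + y\,\frac{\partial}{\partial y},
\qquad
\theta_2 = x^2\,\frac{\partial}{\partial x} + y^2\,\frac{\partial}{\partial y},
\\[4pt]
\theta_2(x) = x^2 \in (x), \qquad \theta_2(y) = y^2 \in (y), \qquad
\theta_2(x - y) = (x - y)(x + y) \in (x - y),
\\[4pt]
\det\begin{pmatrix} x & y \\ x^2 & y^2 \end{pmatrix}
  = x y^2 - x^2 y = -\,x y (x - y) = -Q .
```

So $c = -1$, and $\theta_1, \theta_2$ form a basis consisting of homogeneous vector fields of degrees 1 and 2.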


This situation arises in many important examples and provides an algebraic 
criterion for “strong combinatorial structure” in an arrangement. 


Definition 6.8 (Terao 1980). The arrangement X is free if Der(X) is a free S- 
module. 


Der($X$) is an $S$-module of rank $n$ (every maximal $S$-linearly independent subset
has cardinality $n$), so that the basis criterion above characterizes (bases of) free
arrangements.

At this point we observe that it pays off to also study the dual module
$\Omega^1(X) = \operatorname{Hom}_S(\operatorname{Der}(X), S)$ of logarithmic 1-forms at $X$, because it has additional structure.
$\Omega^1(X)$ is the $S$-module of differential 1-forms $\omega = (1/Q) \sum_{i=1}^{n} q^i\, dx_i$ with $q^i \in S$,
such that $d\omega$, like $\omega$, has at most a first order pole at every hyperplane $H \in X$.
This is equivalent to requiring that the restrictions $(Q\omega)|_H$ vanish. The differential
forms $d\ell_H / \ell_H$ for $H \in X$ are some obvious elements of $\Omega^1(X)$.

There is a non-degenerate pairing between Der($X$) and $\Omega^1(X)$. In particular
$\Omega^1(X)$ is free if and only if Der($X$) is free, which offers a second approach to free
arrangements.

Both Der($X$) and $\Omega^1(X)$ have the natural structure of a graded module, where
$\theta \in \operatorname{Der}(X)$ is homogeneous of degree $e$ if $\theta = \sum_{i=1}^{n} p_i (\partial/\partial x_i)$, where all the $p_i$
are either 0 or homogeneous of degree $e$. Similarly, a logarithmic differential form
$\omega = (1/Q) \sum_{i=1}^{n} q^i\, dx_i$ has degree $e - m$ if all the polynomials $q^i$ are either zero or
homogeneous of degree $e$: here $m = |X|$ is the degree of $Q$.

So for every $H \in X$, $d\ell_H / \ell_H$ is a form of degree $-1$, whereas
$(1/\ell_1)(d\ell_2/\ell_2 - d\ell_3/\ell_3)$ is a form of degree $-2$ in $\Omega^1(X)$ if and only if the linear forms $\ell_1$, $\ell_2$ and
$\ell_3$ are dependent, so the corresponding hyperplanes satisfy $H_1 \supseteq (H_2 \cap H_3)$, that
is, they form a 3-circuit in $M_X$.

Now if Der($X$) is free, then it has a basis $\{\theta_1, \ldots, \theta_n\}$ consisting of
homogeneous vector fields $\theta_i = \sum_{j=1}^{n} p_{ij} (\partial/\partial x_j)$ of respective degrees $e_i$. Similarly, $\Omega^1(X)$
then has a homogeneous basis $\{\omega_1, \ldots, \omega_n\}$ with degrees $\deg(\omega_i) = -e_i$, which is
explicitly given as $\omega_i = \sum_{j=1}^{n} q^{ij}\, dx_j$, where the matrices $(q^{ij})$ and $(p_{ij})$ are inverses
of each other.

Terao’s (1981) remarkable “Factorization theorem” now implies that the degrees
of these homogeneous basis elements (if $X$ is free) can be computed
combinatorially.

Theorem 6.9 (Terao 1981). If $X$ is a free arrangement, then the characteristic
polynomial of $M_X$ factors as

$$\chi(t) = \prod_{i=1}^{n} (t - e_i),$$

where the $e_i$ are the degrees of the vector fields in a homogeneous basis of Der($X$).


This result in particular allows us to disprove freeness for large classes of
examples. It also suggests classes of arrangements that “ought to be free”, because
they meet the strong combinatorial criterion that $\chi(t)$ factors over $\mathbb{Z}$.

The original proof of this result was very difficult. New ideas by Solomon and
Terao (1987) leading to Rose and Terao’s (1990) concept of a “Möbius sum with
Hilbert coefficients” put this into a simpler and broader framework.

Theorem 6.9 is part of a bulk of evidence for the following conjecture that has 
motivated a lot of research on free arrangements. 


Conjecture 6.10 (Terao’s conjecture). Freeness is a combinatorial property. 


Extensive work has gone into this conjecture, so far without final success. Special
cases can be settled: for example, arrangements with a graphic matroid are free if
and only if they are supersolvable (Stanley, see Edelman and Reiner 1994), and
Terao’s conjecture is true if the matroid is binary (Ziegler 1990). Edelman and
Reiner (1994) characterized freeness for a larger class of arrangements (described
in terms of graphs), which includes both free and nonfree arrangements. Their
analysis also provides classes of counterexamples to “Orlik’s conjecture”: it is not
true that the restriction of a free arrangement to one of its hyperplanes is always a
free arrangement; see Edelman and Reiner (1993).

Furthermore, ten years of research have clarified the structure of the Saito— 
Terao modules quite a bit, and in this course the importance and the scope of the 
underlying combinatorial structure have become more apparent — see also Ziegler 
(1990) and Yuzvinsky (1993). It has also led to algebraic characterizations for the 
two combinatorial classes of arrangements discussed above. 


Theorem 6.11 (Stanley 1984; Jambu and Terao 1984; Ziegler 1989). Let $X$ be a
complex arrangement.

(1) $X$ is supersolvable if and only if it is free with a basis $\{\theta_1, \ldots, \theta_n\}$ of
Der($X$) that (in suitable coordinates) has an upper-triangular coefficient matrix:
$\theta_i = \sum_{j \ge i} p_{ij} (\partial/\partial x_j)$ for $1 \le i \le n$.

(2) $X$ is 3-generic if and only if the module $\Omega^1(X)$ does not contain forms of
degree $-2$, that is, if it is generated by $\{d\ell_H / \ell_H : H \in X\}$.

Thus a 3-generic arrangement (and, in particular, a generic arrangement in
dimension $n \ge 3$) can never be free unless $n = m$.


The role of the complexified simplicial arrangements is less clear in this context. 
Terao (1980) has shown, analyzing the $n = 3$ case, that many, but not all such
arrangements are free. This was motivated by the [...]
a $K(\pi, 1)$ space for every free arrangement $X$. This w[...]
Reiner in 1994. The converse was disproved by Terao [...]
and Deligne’s Theorem 6.5(2).

Thus it is still an open problem to directly relate the a[...]
[...]rangements with their homotopy types. A lot of work will [...]
elaborating on the interpretation and relevance of combinato[...]
[...]plex structure of hypersurface singularities. Combinatorial [...]
many of the algebraic and topological properties. To understand [...]
also in much more general settings than the model of this section, w[...]
a deeper understanding of the connections between singularity theory, topology,
reflection groups, and a zoo of other topics that we have not even touched
upon here.


7. Knots and the Tutte polynomial 


In this section we present a recent development in knot theory, which has turned 
out to be very closely related to classical combinatorial constructions like the Tutte 
polynomial of a graph. We will not have the opportunity to discuss another exciting 
development: the recent approach of Vassiliev (1990), which gives a systematic 
method to produce knot invariants, and reduces in a completely different way to 
combinatorial problems, see also Birman and Lin (1993). We refer to Burde and 
Zieschang (1985) for a broad development of the “classical” knot theory. More 
extensive surveys of the recent progress are Kauffman (1988), Lickorish (1988) 
and Birman (1993). 

A link $L$ with $c(L)$ components consists of $c(L)$ disjoint simple smooth closed
curves in $\mathbb{R}^3$. A knot is a link with one component. The natural way to represent
a link L is by means of a link diagram, which is obtained from L by projecting it 
onto a plane in such a way that the projection of each component is smooth and at 
most two curves intersect at any point. At each crossing point of the link diagram 
the curve which goes over the other is specified as shown in fig. 7.1. 

Two links are equivalent if one can be deformed continuously to the other in
three-dimensional space. A knot is trivial if it is equivalent to the knot without a crossing,
the unknot “O”. A link is trivial if it is equivalent to a disjoint union of trivial,
unlinked knots, that is, to a link without a crossing. The union “∪” operation used
for this places two link diagrams into the plane so that they do not touch.

Clearly a link diagram represents a unique link up to equivalence, but many
diagrams can represent the same link. Modifying a link diagram locally as represented
in fig. 7.2 does not change the link represented; these local changes are known as
the Reidemeister moves of types I, II and III. The classic theorem of combinatorial
knot theory is:


Theorem 7.1. Two link diagrams represent equivalent links if and only if one can
be obtained from the other by a finite sequence of Reidemeister moves.


Figure 7.1. The trefoil knot; the Borromean rings: a 3-component link.


Figure 7.2.


Reidemeister’s classic theorem is an existence theorem: it does not provide an
algorithm of any “time-bounded” complexity for testing whether or not two links
are equivalent, see for example the remark in Burde and Zieschang (1985).

As a result, any invariant $f$ of a link which is both easily calculated and has the
property that $L_1$ and $L_2$ are equivalent only when $f(L_1) = f(L_2)$ is clearly of great
significance in the theory of knots.

Here we shall present an easily derived “partial invariant”, namely the Jones 
polynomial, discovered by Jones (1985). In order to do this we introduce first the 
closely related bracket polynomial introduced by Kauffman (1987). 


The bracket polynomial 


For any link diagram $L$ define a Laurent polynomial $\langle L \rangle$ in one variable $A$, which
obeys the following three rules:

(i) $\langle O \rangle = 1$,
(ii) $\langle L \cup O \rangle = -(A^2 + A^{-2}) \langle L \rangle$,
(iii) $\langle \times \rangle = A \langle \asymp \rangle + A^{-1} \langle\, )( \,\rangle$, where the three pictures stand for diagrams
that agree except near one crossing, which is replaced by its two smoothings.

Notes. (I) A Laurent polynomial in $z$ is a polynomial in $z$ and $z^{-1}$, so it may
feature both positive and negative powers of $z$.
(II) The rule (iii) is applied locally, that is, at each crossing of the link diagram.
(III) Rules (i) to (iii) recursively define the bracket $\langle L \rangle$ for every link $L$: (i) starts
the recursion with the unknot, (iii) expresses $\langle L \rangle$ in terms of brackets of links with
fewer crossings, and (ii) deletes components without crossings, for example trivial
links.
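Unrolling rules (i) to (iii) over all crossings at once gives a sum over the $2^c$ ways of smoothing the $c$ crossings: a state with $a$ smoothings of the first kind, $b$ of the second kind and $k$ resulting closed curves contributes $A^{a-b} \bigl(-(A^2 + A^{-2})\bigr)^{k-1}$. The sketch below evaluates this sum. Everything about the encoding is an assumption made for illustration: each crossing is a 4-tuple of arc labels listed in cyclic order around the crossing, and the smoothing joining the first two and the last two labels receives the factor $A$; swapping that choice replaces $A$ by $A^{-1}$.

```python
from itertools import product

# Laurent polynomials in A are stored as {exponent: coefficient}.
D = {2: -1, -2: -1}  # the factor -(A^2 + A^-2) from rule (ii)

# Hypothetical encoding of a three-crossing trefoil diagram with arcs 1..6.
TREFOIL = [(1, 4, 2, 5), (3, 6, 4, 1), (5, 2, 6, 3)]


def poly_mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {e: c for e, c in out.items() if c != 0}


def poly_add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def count_loops(pairs):
    """Number of closed curves obtained by gluing arc labels along the pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)
    return len({find(x) for x in list(parent)})


def bracket(crossings):
    """Bracket polynomial of a diagram with at least one crossing."""
    total = {}
    for state in product((0, 1), repeat=len(crossings)):
        pairs, exponent = [], 0
        for (a, b, c, d), choice in zip(crossings, state):
            if choice == 0:   # smoothing weighted by A
                pairs += [(a, b), (c, d)]
                exponent += 1
            else:             # smoothing weighted by A^-1
                pairs += [(a, d), (b, c)]
                exponent -= 1
        term = {exponent: 1}
        for _ in range(count_loops(pairs) - 1):
            term = poly_mul(term, D)
        total = poly_add(total, term)
    return total
```

With this encoding, `bracket(TREFOIL)` is $A^7 - A^3 - A^{-5}$, and the single crossing `(1, 1, 2, 2)` (a curl on an unknot) gives $-A^3$ instead of 1.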


The fundamental properties of $\langle \cdot \rangle$ are summed up in:


Theorem 7.2. For any link $L$ the bracket polynomial $\langle L \rangle$ is independent of the order
in which rules (i) to (iii) are applied to the crossings and furthermore is invariant
under the Reidemeister moves (II) and (III).


It is important to note: 


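Proposition 7.3. The bracket polynomial is not invariant under Reidemeister move
(I).

Proof. Apply move (I) to a diagram that contains a single curl, and resolve the
crossing of the curl by rule (iii). One of the two smoothings splits off a circle
without crossings, which rule (ii) removes:

$$\langle \text{curl} \rangle = A \langle \text{arc} \cup O \rangle + A^{-1} \langle \text{arc} \rangle = \bigl( A(-A^2 - A^{-2}) + A^{-1} \bigr) \langle \text{arc} \rangle = -A^3 \langle \text{arc} \rangle. \qquad \Box$$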


Oriented links and the Jones polynomial 


Suppose now that the link L is oriented, by which we mean that each of its com- 
ponents is assigned an arrow representing a direction of motion along the given 
component. Once the orientation has been given we may assign a sign to each 
crossing of the link diagram by the rule displayed in fig. 7.3. 


Figure 7.3. A positive (+ve) crossing and a negative (-ve) crossing.


A crossing is positive if the over arc at the crossing is on the left as one ap- 
proaches the crossing in the direction of the two indirected arcs. 

The writhe of an oriented link $L$ is the sum of the signs at the crossings of $L$
and is denoted by $w(L)$ (see fig. 7.4). The writhe is not an invariant of a link since
it changes by $\pm 1$ under the type I Reidemeister move. However, it turns out that
if we combine the writhe with the bracket polynomial of Kauffman we get a link
invariant.


Figure 7.4. An oriented link with writhe $w(L) = 1$.


Theorem 7.4. For an oriented link $L$, the function

$$V_L(t) = f_L(t^{-1/4}),$$

where

$$f_L(A) = (-A^3)^{-w(L)} \langle L \rangle,$$

is an invariant of the link.
$V_L(t)$ is called the Jones polynomial of the oriented link $L$.


From knots to signed graphs 


Now given any unoriented link L there is a natural way in which to associate with 
it a signed graph G(L), constructed as follows. 

Two-colour the faces of the link diagram of L with colours black and white 
in such a way that adjacent faces have different colours. (This is possible since 
L, regarded as a graph, is planar and regular of degree 4, hence Eulerian. Thus 
its dual graph is bipartite.) By convention let the unbounded face be white. Let 
G(L) have vertices corresponding to the black faces and join two vertices of G(L) 
when the corresponding faces are the opposite faces of a crossing. The sign of the
crossing is defined by the convention illustrated in fig. 7.5: a crossing is
positive if, when viewed along the edge joining the two black faces, the edge of the
diagram passing over at the crossing is on the left; otherwise it is negative.

In general the graph G(L) will have both positive and negative edges but there 
is an important class of knots (links), called alternating links, for which G(L) has 
edges of only one sign. 


Figure 7.5.


A link diagram $L$ is alternating if the crossings alternate under-over-under-over...
as the link is traversed. A link is alternating if it has some link diagram that is
alternating. The following is very easy to verify.


Lemma 7.5. A link diagram L is alternating iff the edges of the associated signed 
graph G(L) are all positive or all negative. 


Alternating links; the Jones, bracket and Tutte polynomial 


In general the bracket polynomial and Jones polynomial of a link will not be 
completely specified by the graph G(L), but will also depend on the signs associated 
with the edges of G(L). However, in the important case where L is an alternating 
link diagram, we saw above that all edges of G(L) have the same sign and hence 
essentially the undirected graph G(L) determines L. 

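Now consider the fundamental bracket equation or skein diagram (as it is called
in knot theory),

$$\langle \times \rangle = A \langle \asymp \rangle + A^{-1} \langle\, )( \,\rangle.$$

In terms of the associated graph $G(L)$, this can be translated to

$$\langle G \rangle = A \langle G' \rangle + A^{-1} \langle G'' \rangle,$$

where one of $G'$, $G''$ is obtained from $G = G(L)$ by deleting and the other by
contracting the edge that corresponds to the crossing. This
is exactly the contract/delete formulation used in the definition of the Tutte
polynomial of a graph and matroid (see chapter 9).
It is not surprising therefore that we have the following fundamental relationship.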


Theorem 7.6. If L is an oriented alternating link diagram and G denotes its associ- 
ated unsigned “black-face” graph then the Jones polynomial V(t) is given by 


Vile) = (CA O27(G; 1-071) 
where T(G;x,y) is the Tutte polynomial of G. 


In other words, the Jones polynomial of an alternating link diagram is given by 
the evaluation of the Tutte polynomial of its associated black-face graph along the 
hyperbola xy = 1, up to an easily derived factor. 
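The contract/delete recursion for the Tutte polynomial can be sketched directly. The code below uses the standard rules $T(G) = T(G - e) + T(G / e)$ for an edge $e$ that is neither a loop nor an isthmus, $T(G) = x\,T(G / e)$ for an isthmus, $T(G) = y\,T(G - e)$ for a loop, and $T = 1$ for a graph without edges. It is an illustration with hypothetical function names, takes exponential time, and does not compute the prefactor that relates the evaluation to the Jones polynomial.

```python
from collections import defaultdict


def _connected(edges, a, b):
    """True if vertices a and b are joined by a path using the given edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {a}, [a]
    while stack:
        u = stack.pop()
        if u == b:
            return True
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False


def _times(poly, dx, dy):
    """Multiply a polynomial {(i, j): coeff} by x^dx * y^dy."""
    return {(i + dx, j + dy): c for (i, j), c in poly.items()}


def tutte(edges):
    """Tutte polynomial of a multigraph given as a list of vertex pairs."""
    edges = list(edges)
    if not edges:
        return {(0, 0): 1}
    (u, v), rest = edges[0], edges[1:]
    if u == v:                                  # loop: factor y, then delete
        return _times(tutte(rest), 0, 1)
    contracted = [(u if a == v else a, u if b == v else b) for a, b in rest]
    if not _connected(rest, u, v):              # isthmus: factor x, then contract
        return _times(tutte(contracted), 1, 0)
    result = dict(tutte(rest))                  # ordinary edge: delete + contract
    for mono, c in tutte(contracted).items():
        result[mono] = result.get(mono, 0) + c
    return result


def evaluate(poly, x, y):
    """Value of the polynomial at the point (x, y)."""
    return sum(c * x ** i * y ** j for (i, j), c in poly.items())
```

For the triangle this gives $x^2 + x + y$. Evaluating at a point with $xy = 1$, for instance `evaluate(tutte(edges), -2, -0.5)`, samples the polynomial on the hyperbola used in Theorem 7.6.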


A proof of a conjecture by Tait 


As an example of the use of these knot polynomials we show a very simple proof 
of a longstanding conjecture by Tait (1898). 

Consider first the knot $K$ of fig. 7.6. The crossing joining AB is called an isthmus
because in the associated graph the edge AB will be an isthmus in the usual
graph-theoretic sense. A link diagram is reduced if it has no crossing which is an
isthmus.


Figure 7.6.


It is easy to see that the knot represented by fig. 7.6 is equivalent to the
alternating knot $K'$ obtained by “twisting one component and removing the isthmus
crossing AB”.

The key fact we shall need is the following, obtained by considering the repre- 
sentation of (K) in terms of the Tutte polynomial of G(K): 


Proposition 7.7. Let $L$ be a reduced alternating link diagram. Then in the bracket
polynomial $\langle L \rangle$

$$\text{max degree} \langle L \rangle = V + 2(W - 1),$$
$$\text{min degree} \langle L \rangle = -V - 2(B - 1),$$

where $V$ is the number of crossings and $W$ and $B$ are the numbers of white and black
regions, respectively, in the shaded diagram of $L$.


We now state and prove Tait’s conjecture. 


Theorem 7.8 (Murasugi 1987, Thistlethwaite 1987). The number of crossings in a 
reduced alternating projection of a link L is a topological invariant of L. 


Expressed more informally, this means that if we have a link diagram which
is alternating and contains no isthmuses and has $n$ crossings then we know that
there is no other reduced alternating link diagram representing the same link and
having a different number of crossings.


Proof. Let span($L$) denote the difference between the maximum and minimum
degrees of $A$ in the bracket polynomial $\langle L \rangle$. By Proposition 7.7 (with the notation
introduced there) we have

$$\text{max degree} \langle L \rangle = V + 2(W - 1),$$
$$\text{min degree} \langle L \rangle = -V - 2(B - 1).$$

Thus span($L$) $= 2V + 2(W + B - 2)$ and since $W + B = V + 2$ (Euler’s formula for
planar graphs) we have span($L$) $= 4V$. But $f_L(A) = (-A^3)^{-w(L)} \langle L \rangle$ is a topological
invariant of links, and it differs from $\langle L \rangle$ only by a factor $\pm A^k$, thus span($L$) is also
an invariant, and hence $V$, the number of crossings in a reduced
alternating presentation, is also a topological invariant. $\Box$


1. Introduction


If pressed hard to give a rigorous classification of the subject matter, I would say
that infinite combinatorics is part of set theory and hence a part of mathematical
logic.

On the other hand, the problems and ideas of this topic are clearly of
combinatorial character. Quite a few of the basic questions arose by simply
replacing integer parameters by infinite cardinals in finite problems. It is however
a fact that most of the good problems lead to situations which can only be handled
adequately by the methods of logic developed in the last three decades: forcing,
large cardinals, constructibility and its generalizations.

As to the selection of the material presented in this chapter there is clearly no
way to achieve completeness. Infinite combinatorics could fill another book or
two. I will try to illustrate the phenomena described above on a few carefully
selected problems, which I hope will all sound familiar and natural to experts of
finite combinatorics. I will prove some results of combinatorial character on these
problems and then state the independence results required to fill the gaps in our
knowledge.

I would like to mention that a chapter describing infinite combinatorics written 
more in the spirit of set theory appeared in the Handbook of Mathematical Logic 
(see Kunen 1977). 

I am not going to present a specific axiom system of set theory, but it should be
emphasized that anything we will do can be done in the well-known Zermelo-Fraenkel
axiom system of set theory with the axiom of choice. If we use any other
set-theoretical assumptions, these will be stated explicitly. I will list some
notation, elementary facts and theorems without proofs in the Appendix. These, I
hope, will be mostly self-explanatory and (or) well known to most of the readers.
Still I will give references to these results in case I use them in sections 2-6. For a
detailed development of set theory from the axiom systems, see, e.g., Jech
(1978).


2. Infinite graphs and hypergraphs. Analysis of a very simple problem 


Quite a few definitions and results given for finite graphs, digraphs and hy- 
pergraphs, extend to infinite ones, without essential modification. This is precisely 
what we are going to do. If we do not define a concept in the infinite case, or if we 
use a result, valid in the finite case, without any further argument this means that 
the original definition or argument applies without any change. We will only 
consider simple graphs, which we will simply call graphs. 

The first problems arise if we consider some invariant number associated with 
an object. Clearly, in general, we intend to replace the number of elements of a 
set by the cardinality of the set. Infinite cardinals are well-ordered by order of 
magnitude, hence no problem arises if we must take the minimum of certain 
numbers. So, for example, x(G), x(G), y(G), and 6(G) are defined without any 
change. 

An infinite set of cardinals does not necessarily have a largest member, so for 
example the definitions of $\alpha(G)$, $\omega(G)$, and $\Delta(G)$ do not extend automatically, 
and there are indeed at least two different ways to define them. For example, we 
could define, for $G = (V, E)$: 


$\alpha(G) = \sup\{|X| : X \subseteq V,\ X \text{ is a stable set in } G\}.$ 


This has two advantages. First, for finite graphs it coincides with the old 
definition, and second, it is what everybody would do, to start with. However, 
there is a definite disadvantage, as well. If, say, $\alpha(G) = \omega$, we do not know if 
there is an infinite stable set, or all stable sets are finite, but unbounded in size. 
We could define $\hat\alpha(G)$ as the smallest cardinal $\kappa$ such that there is no stable set of 
size $\kappa$. This can also be expressed as a supremum, namely 


$\hat\alpha(G) = \sup\{|X|^+ : X \subseteq V,\ X \text{ is a stable set in } G\}.$ 


Now $\hat\alpha(G) = \alpha(G) + 1$ in case $G$ [or even $\alpha(G)$] is finite; $\hat\alpha(G) = \omega$ if there are 
only finite stable sets, but their size is unbounded; and $\hat\alpha(G) = \omega_1$ in case 
the largest independent set is of size $\aleph_0$. So the second possibility is technically 
much better. Not to hurt the feelings of those (too much) who are willing to 
consider countable stable sets, but refuse to think of $\omega_1$, we will use both 
notations for $\alpha$, and for any other invariant defined similarly. For example, 
$\hat\Delta(G) = \omega$ means that the degree of every vertex is finite, but there is no finite 
upper bound. To see this technique work, let us try to find the infinite analogue of 
a very simple theorem: if $G$ is finite, then $\chi(G) \le \Delta(G) + 1$ (cf. chapter 4). 


Theorem 2.1. $\chi(G) \le \hat\Delta(G)$ holds for every graph $G$. 


Proof. Set $\hat\Delta(G) = \kappa$. Let $V = \{x_\beta : \beta < \alpha\}$ be a one-to-one enumeration of $V$; we 
will call it a well-ordering of $V$. We want to define stable sets $\langle V_\xi : \xi < \kappa \rangle$ such that 
$\bigcup_{\xi<\kappa} V_\xi = V$. We apply transfinite recursion. Assume $\beta < \alpha$, and that for $\gamma < \beta$ 
we already decided which $V_\xi$ contains $x_\gamma$, i.e., there is a function $\xi$ defined for all 
$\gamma < \beta$ with the correspondence $\xi(\gamma) = \xi \Leftrightarrow x_\gamma \in V_\xi$. Now the degree of $x_\beta$ is less 
than $\kappa$, hence there is a $\xi < \kappa$ such that $G$ has no edge $\{x_\gamma, x_\beta\}$ with $\xi(\gamma) = \xi$. Set 
$\xi(\beta) = \xi$ for the minimal $\xi$ of this kind. By Theorem A.11 (ii), this defines $\xi(\beta)$ 
for every $\beta < \alpha$, and the sets $V_\xi$ for $\xi < \kappa$. Clearly the sets $V_\xi$ are stable and cover 
$V$. □ 
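
For a finite graph the recursion used in this proof is the familiar greedy coloring: the vertices are taken in a fixed order and each receives the least color not already used on an earlier neighbor. The following Python sketch covers only this finite case; the names `greedy_coloring`, `order` and `adj` are chosen here for illustration and do not come from the text.

```python
def greedy_coloring(order, adj):
    """Assign to each vertex the least color unused by its earlier neighbors.

    order: list of vertices (plays the role of the well-ordering of V)
    adj:   dict mapping each vertex to the set of its neighbors
    """
    color = {}
    for v in order:
        # colors already taken by neighbors that come earlier in the order
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# A 5-cycle with one chord; every degree is at most 3.
adj = {0: {1, 4, 2}, 1: {0, 2}, 2: {1, 3, 0}, 3: {2, 4}, 4: {3, 0}}
coloring = greedy_coloring(sorted(adj), adj)
max_degree = max(len(nbrs) for nbrs in adj.values())
```

Each vertex has at most `max_degree` earlier neighbors, so the colors used lie in `0..max_degree`, which is the finite inequality quoted before Theorem 2.1.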


The question arises if Theorem 2.1 is best possible. We know this for finite $\kappa$, as 
is shown by $K_\kappa$. Assume $\kappa$ is an infinite limit cardinal. Then, for a $G$ which is the 
disjoint union of $K_\lambda$'s for $\lambda < \kappa$, $\hat\Delta(G) = \chi(G) = \kappa$, hence the result is best possible 
in this case too. 

Assume now $\hat\Delta(G)$ is an infinite successor cardinal $\kappa = \lambda^+$. $\hat\Delta(K_{\lambda^+}) = \lambda^{++}$ and 
we have no immediate example. In fact, in this case the result can be improved. 
First we prove: 


Lemma 2.2. Assume $G$ is a connected graph and $\hat\Delta(G) \le \kappa^+$ for some $\kappa \ge \omega$. Then 
$|V| \le \kappa$. 


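Proof. Let $x \in V$ be arbitrary, and, for $i < \omega$, let 
$V_i = \{y \in V : \text{the distance between } x \text{ and } y \text{ in } G \text{ is } i\}$. 


$V = \bigcup_{i<\omega} V_i$, since $G$ is connected. Now we prove by induction on $i < \omega$ that 
$|V_i| \le \kappa$. This is true for $i = 0$, since $V_0 = \{x\}$, $|V_0| = 1 \le \kappa$. Assume $|V_i| \le \kappa$ for 
some $i$. Since each element of $V_{i+1}$ is adjacent to some element of $V_i$, and 
$\hat\Delta(G) \le \kappa^+$, it follows that $|V_{i+1}| \le |V_i| \cdot \kappa \le \kappa^2 = \kappa$. Hence $|V_i| \le \kappa$ holds for $i < \omega$. 
But then 


$|V| \le \kappa \cdot \omega = \kappa.$ □ 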


Corollary 2.3. Assume $\hat\Delta(G) \le \kappa^+$ for some $\kappa \ge \omega$. Then $\chi(G) \le \kappa$. As a 
consequence of this, if $\hat\Delta(G) = \kappa^+ > \omega$ for some $G$, then 


$\chi(G) \le \kappa < \hat\Delta(G).$ 


Proof. By Lemma 2.2, each component of $G$ has cardinality at most $\kappa$. Clearly, $V$ 
is the union of $\kappa$ sets, each having at most one element in common with each 
component. But all sets of this kind are obviously stable. □ 


Now one could argue that the proof of Theorem 2.1 we gave is superfluous. 
Indeed, if $\hat\Delta(G) = \kappa > \omega$, then Corollary 2.3 obviously implies Theorem 2.1. The 
case $\hat\Delta(G) = \omega$ easily reduces to the cases $\hat\Delta(G) < \omega$. In case $\hat\Delta(G) = k < \omega$, we 
know the result if $G$ is finite. The question arises if this fact alone implies the 
result for every $G$. The answer is affirmative, as stated first in this form by de 
Bruijn and Erdős (1951), but the proof of this is certainly more involved than the 
proof of Theorem 2.1. The main aim of the next section is to isolate and prove the 
general principles, called compactness arguments, yielding this type of result. 


3. König’s lemma and compactness 


The theorem of König (1927) is probably the earliest result, and certainly the 
earliest well-known result, about infinite graphs. To state it, we have to define the 
concept of an infinite path. 

A graph $G = (V, E)$ is a one-way infinite path if there is a well-ordering 
$\{x_i : i < \omega\} = V$ of type $\omega$ of the vertex set, such that $E = \{\{x_i, x_{i+1}\} : i < \omega\}$. The 
isomorphism type of a one-way infinite path is denoted by $P_\omega$. 

A graph $G = (V, E)$ is a two-way infinite path if $V = \{x_p : p \in \mathbb{Z}\}$ is a one-to-one 


2090 A. Hajnal 


enumeration of the vertex set, and $E = \{\{x_p, x_{p+1}\} : p \in \mathbb{Z}\}$. The isomorphism 
type of a two-way infinite path is denoted by 2, ,. We now prove König’s lemma. 


Theorem 3.1. Assume $G$ is an infinite connected graph with $\hat\Delta(G) \le \omega$. Then 
$P_\omega \subseteq G$, i.e., $G$ contains a one-way infinite path. 


Proof. We are going to define a sequence $\langle x_i : i < \omega \rangle$ of vertices, and a sequence 
$\langle V_i : i < \omega \rangle$ of vertex sets, by recursion on $i < \omega$. Let $V_0 = V$, and let $x_0$ be an 
arbitrary vertex of $G$. Assume $V_0 \supseteq \cdots \supseteq V_i$ and $x_0 \in V_0, \ldots, x_i \in V_i$ are already 
defined in such a way that $G[V_i]$ is connected and infinite, and $x_j \notin V_i$ for $j < i$. By 
the assumption $\hat\Delta(G) \le \omega$, $N(x_i)$ is finite. Since $G[V_i]$ is connected, each 
component of $G[V_i] - x_i$ contains an element of $N(x_i)$. It follows that $G[V_i] - x_i$ has 
only finitely many components, hence one of them is infinite. Call one infinite 
component $V_{i+1}$, and let $x_{i+1}$ be an element of $N(x_i) \cap V_{i+1}$. Then $V_{i+1} \subseteq V_i$, 
$G[V_{i+1}]$ is connected and infinite, and $x_j \notin V_{i+1}$ for $j < i+1$. Hence $x_i$ and $V_i$ 
are defined for all $i < \omega$. By the construction, $x_i \ne x_j$ for $j < i < \omega$ and $x_i$ is 
adjacent to $x_{i+1}$ in $G$. □ 


Traditionally, König’s lemma is stated for trees. This does not make any 
difference, since, by Zorn’s lemma, a connected graph has a spanning tree. 

A pair $(T, <)$ is said to be a set theoretic tree, briefly an S-tree, if $<$ is a partial 
order on $T$, i.e., irreflexive and transitive on $T$, and for all $x \in T$, the set 
$\hat{x} = \{y \in T : y < x\}$ is well-ordered by $<$. For $x \in T$, let $\alpha_T(x) = \alpha(x) = \operatorname{typ} \hat{x}(<)$; 
$T_\alpha = \{x \in T : \alpha(x) = \alpha\}$ is the $\alpha$th level of $T$, $h(T) = \min\{\alpha : T_\alpha = \emptyset\}$. $h(T)$ is 
the height of $T$. Clearly $T = \bigcup_{\alpha < h(T)} T_\alpha$. $B \subseteq T$ is a branch of $T$, if $B$ is 
ordered by $<$ and $B \cap T_\alpha \ne \emptyset$ for all $\alpha < h(T)$. 


Definitions 3.2. Let $\kappa \ge \omega$. An S-tree $(T, <)$ is said to be a $\kappa$ S-tree if $h(T) = \kappa$, 
and $|T_\alpha| < \kappa$ for $\alpha < h(T)$. 
A cardinal $\kappa$ has the tree property if every $\kappa$ S-tree has a branch. 


It is a much investigated problem to determine which cardinals have the tree 
property. For example, it can be proven that $\omega_1$ does not have the tree property. 
The examples showing this are called Aronszajn trees (see, e.g., Kunen 1977, p. 
384). We gave this definition here only to reformulate König’s lemma. However, 
our study of the generalizations of Ramsey's theorem will lead us back to this 
property. 


Theorem 3.3. $\omega$ has the tree property. 


Proof. Let $(T, <)$ be an $\omega$ S-tree. We may assume it has a root, i.e., $|T_0| = 1$. We 
define a graph $G = (T, E)$. $E$ consists of pairs $\{x, y\}$ of the form $x \in T_i$, $y \in T_{i+1}$, 
$x < y$, for $i < \omega$. Since $|T_i| < \omega$ for $i < \omega$, we clearly have $\hat\Delta(G) \le \omega$. Since, 
by the S-tree property, for all $y \in T_{i+1}$ there is exactly one $x \in T_i$ with $x < y$, the 
existence of an infinite path in $G$ clearly implies the existence of a branch of $T$. □ 

To see that Theorem 3.3 is really a restatement of König’s lemma, we deduce 
Theorem 3.1 from Theorem 3.3. By the remark made after the proof of Theorem 
3.1 we may assume that we are given a graph $G = (V, E)$ with $\hat\Delta(G) \le \omega$ which 
is a tree. Let $x$ be an arbitrary element of $V$. For each $y, z \in V$, set $z < y$ if and 
only if $z \ne y$ and $z$ lies on the unique path connecting $x$ and $y$. It is easy to verify 
that $(V, <)$ is an $\omega$ S-tree, and every branch of $(V, <)$ is an infinite path of $G$. 

We are after a generalization of König’s lemma, which would yield the desired 
compactness results. We already know that the $\kappa$-tree property is too strong to be 
true for every $\kappa$. We give another reformulation of Theorem 3.1. 


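Theorem 3.4. For each $n < \omega$, let $A_n$ be a non-empty finite set. Let $A = \times_{n<\omega} A_n$ 
and $B_n = \times_{i<n} A_i$ for $n < \omega$. Assume we are given a sequence $f_n \in B_n$ for $n < \omega$. 
Then there exists an $f \in A$, such that for all $i < \omega$ there is an $n$, $i \le n < \omega$, with 


$f \restriction i = f_n \restriction i.$ 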


Proof. Let $T_i = \{g \in B_i : \text{there is an } n \ge i \text{ with } f_n \restriction i = g\}$. Clearly, $T_i \subseteq B_i$ for 
$i < \omega$. Let $T = \bigcup_{i<\omega} T_i$. Write $g < h \Leftrightarrow g \subset h$ for $g, h \in T$. Since the elements of 
$T$ are functions, $(T, <)$ is an S-tree. Since every section of a $g \in T$ belongs to 
$T$, $T_i$ is the $i$th level of $T$, and $T_i \subseteq B_i$ implies that $T_i$ is finite; hence $T$ is an $\omega$ 
S-tree. By Theorem 3.3 $T$ has a branch $B$. Thus, $\bigcup B = f$ is a function, $f \in A$ and 
$f$ satisfies the requirements of Theorem 3.4. □ 


I leave it to the reader to see that Theorem 3.4 just as easily implies Theorem 
3.3. Hence, this is just another restatement of König’s lemma. However, this one 
can easily be generalized to the right theorem. 

Let $[X]^{<\lambda}$ (respectively $[X]^{\lambda}$) denote the subsets of $X$ of size less than $\lambda$ (size 
$\lambda$). The following is called Rado’s selection lemma. 


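Theorem 3.5. Let $\alpha$ be an arbitrary index set, let $A_\beta$ be a finite non-empty set for 
every $\beta < \alpha$, and let $A = \times_{\beta<\alpha} A_\beta$ be the Cartesian product of the $A_\beta$'s. For each 
$V \in [\alpha]^{<\omega}$, set $B_V = \times_{\beta \in V} A_\beta$, and let $f_V \in B_V$ be given. Then there exists a choice 
function $f \in A$ such that for all $W \in [\alpha]^{<\omega}$ there is a $V$, $W \subseteq V \in [\alpha]^{<\omega}$, such that 


$f \restriction W = f_V \restriction W.$ 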


Proof. For $W \in [\alpha]^{<\omega}$, let $T_W = \{g \in B_W : \text{there is a } V \in [\alpha]^{<\omega},\ W \subseteq V, \text{ such that } 
g = f_V \restriction W\}$. Note that $T_W \subseteq B_W$; hence $T_W$ is finite. Let $A_W = \{f \in A : f \restriction W \in T_W\}$. 

We have to prove that there is an $f$ in the intersection of all $A_W$'s. Choose $A$ as 
the underlying set, and notice that the system 


$\mathcal{F} = \{A_W : W \in [\alpha]^{<\omega}\}$ 


has the finite intersection property. Indeed, let $W_0, \ldots, W_{n-1}$ be finite subsets of 
$\alpha$. Then $V = \bigcup_{i<n} W_i$ is finite. Since the sets $A_\beta$ are non-empty, $f_V$ has an 
extension $f$ in $A$; $f_V \restriction W_i \in T_{W_i}$ for $i < n$, hence $f \in A_{W_i}$ for $i < n$. 

By Theorem A.14, $\mathcal{F}$ can be extended to an ultrafilter $U$ in $A$. For each $\beta < \alpha$ 
and $x \in A_\beta$, let $A_{\beta,x} = \{f \in A : f(\beta) = x\}$. Clearly, $A = \bigcup_{x \in A_\beta} A_{\beta,x}$ for each 
$\beta < \alpha$, and the summands are pairwise disjoint. $U$ being an ultrafilter, and $A_\beta$ 
being finite for each $\beta < \alpha$, there is a unique $f(\beta) \in A_\beta$ such that 


$A_{\beta,f(\beta)} \in U.$ 


We claim that this $f \in A$ satisfies the requirements of the theorem. Let 
$W \in [\alpha]^{<\omega}$. Then $\bigcap_{\beta \in W} A_{\beta,f(\beta)} \in U$, by the finite intersection property of $U$, and 
$\bigcap_{\beta \in W} A_{\beta,f(\beta)} \cap A_W \in U$ because of $A_W \in \mathcal{F}$. This implies that $f \restriction W \in T_W$, 
hence $f \in A_W$. □ 


Statements implying Rado’s selection lemma were formulated earlier in other 
branches of mathematics, such as Tychonov’s product theorem stating that the 
topological product of compact topological spaces is compact, and Gödel’s 
compactness theorem stating that a set of formulas of a first order language has a 
model, provided every finite subset of these formulas has a model (see, e.g., 
Kunen 1977, p. 10). It is especially easy to deduce the selection lemma from 
Tychonov’s theorem, since the finite sets $A_\beta$ are compact in the discrete topology, 
and the sets $A_W$ are closed in their product space. It is just as easy for logicians to 
see the derivation from Gödel’s theorem, and also this is the case with the 
applications of the selection lemma. That is why consequences of the selection 
lemma are stated in the literature with the laconic remark “by compactness”, and 
we will follow this practice in the future, but now we really deduce the de 
Bruijn-Erdős theorem, already quoted in section 2, from the selection lemma. 


Theorem 3.6. If $\chi(G') \le k < \omega$ for every finite subgraph $G'$ of $G$, then $\chi(G) \le k$. 


Proof. Let $G = (\alpha, E)$, and let $A_\beta = k$ ($= \{0, \ldots, k-1\}$) for every $\beta < \alpha$. Using 
the notation of the selection lemma, let $f_V \in B_V$ be a good coloring of $G[V]$ for 
$V \in [\alpha]^{<\omega}$. Let $f$ be the function given by the selection lemma. Clearly, $f$ is a 
$k$-coloring of $G$. $f$ is a good coloring of $G$, since for every edge $e = \{\beta, \gamma\}$ there is 
a $V \supseteq e$, such that $f \restriction e = f_V \restriction e$, and hence $f(\gamma) = f_V(\gamma) \ne f_V(\beta) = f(\beta)$. □ 


The question arises whether or not the selection lemma can be true for 
[...] with the word finite replaced by $<\kappa$ everywhere. This 
[...] large cardinals. Such a cardinal $\kappa$ is called strongly 
[...] is immensely large. We will briefly speak about large 
[...] cardinals are not strongly compact, but the problem 
[...] (or of proving some weak forms) remains. This is 
[...] illustrate it by stating a few results concerning the 
[...] proved in Erdős and Hajnal (1968) that there exist 
[...] with $\chi(G) = \kappa^+$, such that all subgraphs of cardinality 
[...] number at most $\kappa$, for $\kappa \ge \omega$. We will prove this in 
[...] in the constructible universe, for every regular $\kappa$, 
there is a graph of size $\kappa$ which is $\aleph_1$-chromatic and all its subgraphs of size $<\kappa$ 
are $\le \aleph_0$-chromatic if and only if $\kappa$ does not have the tree property. 

Finally, we mention the problem, due to Erdős and Hajnal, of the chromatic 
number jumping two or more cardinals. We state a special case. 


Question 3.7. Assume the continuum hypothesis (see the Appendix). Does there 
exist a graph $G$ with $\aleph_2$ vertices and $\chi(G) = \aleph_2$ such that all subgraphs of size $\le \aleph_1$ 
have chromatic number at most $\aleph_0$? 


It was proved in Baumgartner (1984) that an affirmative answer to this problem 
is consistent, and Foreman and Laver (1988) proved that it is consistent, relative 
to the consistency of a very large cardinal, that the answer is negative. If in 
Question 3.7 $\aleph_2$ is replaced by $\aleph_3$, or by anything greater than $\aleph_2$ and smaller 
than the first $\kappa > \omega$ having the tree property, the problem seems to be hopelessly 
difficult at present. 


Note added in proof. Shelah (1990) proved that if $V = L$ and $\kappa = \mathrm{cf}(\kappa) > \omega$ is not 
weakly compact then there is a graph $G$ on $\kappa$ with $\chi(G) = \kappa$ such that the 
chromatic number of every subgraph of size less than $\kappa$ is at most $\aleph_0$. 


4. Ramsey’s theorem and its generalizations to larger cardinals 


In this section we are going to investigate $r$-partitions of length $\gamma$ of an arbitrary 
set $X$. That means we are going to consider mappings $f : [X]^r \to \gamma$. Clearly, such a 
mapping corresponds canonically to an edge disjoint partition $\bigcup_{\alpha<\gamma} H_\alpha$ of $[X]^r$ 
into $r$-uniform hypergraphs, indexed by ordinals $\alpha < \gamma$; namely 


$f(e) = \alpha \Leftrightarrow e \in H_\alpha$ for $\alpha < \gamma$, $e \in [X]^r$. 


Since most of the questions we ask depend only on the cardinality of the 
underlying set, we usually choose $X$ to be a cardinal. The cliques of $H_\alpha$ are called 
homogeneous for $f$ in the color $\alpha$. It was proved in Ramsey (1930) that: 


Theorem 4.1. For all $r$-partitions $f$ of length $k < \omega$ of an infinite set there is an 
infinite set homogeneous for $f$. 


The discovery of this result had an enormous impact on both combinatorics and 
set theory, as can be seen, e.g., from the two monographs Graham et al. (1980), 
and Erdős et al. (1984). Referring the reader to these sources, we intend to keep 
our account of infinite Ramsey theory within limits, concentrating on results we 
are going to use in other sections of this chapter. 

To have a concise notation for the possible generalizations, we define the 
so-called ordinary partition symbol, introduced in Erdős and Rado (1956). 

Definition 4.2. $\kappa \to (\kappa_\alpha)^r_{\alpha<\gamma}$ means that for all $r$-partitions $f$ of length $\gamma$ of $\kappa$ there 
are $\alpha < \gamma$ and a set $X_\alpha$ of size $\kappa_\alpha$, homogeneous for $f$ in color $\alpha$. 


$\kappa \not\to (\kappa_\alpha)^r_{\alpha<\gamma}$ denotes the negation of this statement. If all $\kappa_\alpha$ are equal to $\lambda$, we 
use the notation $\kappa \to (\lambda)^r_\gamma$. 

Other, mostly self-explanatory variants of this notation will be introduced later. 
The relation in Definition 4.2 remains true if $\kappa$ is increased or if the variables on 
the right-hand side are decreased. This was the motivation behind separating 
them with the “arrow” symbol. In this notation, Ramsey’s theorem states that 


$\omega \to (\omega)^r_k$ for $r, k < \omega$, 


and the Ramsey function $R_r(l_0, \ldots, l_{k-1})$, used in finite combinatorics (see 
chapter 25), is the smallest integer $R$ for which 

$R \to (l_i)^r_{i<k}$ 
holds. 


The reader is advised to check that the existence of the finite Ramsey function 
follows from Ramsey’s theorem $\omega \to (\omega)^r_k$ by compactness (i.e., Theorem 3.5). 
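
For very small parameters the finite Ramsey function can be confirmed by exhaustive search. The sketch below (the function names are illustrative and not taken from the text) checks the well-known case of two colors and triples of pairwise equally colored pairs: some 2-coloring of the pairs of a 5-element set has no homogeneous 3-element set, while every 2-coloring of the pairs of a 6-element set has one, that is, $5 \not\to (3)^2_2$ and $6 \to (3)^2_2$.

```python
from itertools import combinations, product


def has_homogeneous_triple(n, coloring):
    """True if some 3-element subset of range(n) has all its pairs in one color."""
    return any(
        coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def arrows_triple(n):
    """Check n -> (3)^2_2 by trying every 2-coloring of the pairs of range(n)."""
    pairs = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        coloring = dict(zip(pairs, bits))
        if not has_homogeneous_triple(n, coloring):
            return False  # found a coloring with no homogeneous triple
    return True
```

The search is exponential in the number of pairs, so it is only practical for a handful of points; it is meant as a check of the definition of the arrow relation, not as a method for computing Ramsey numbers.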

We now intend to describe a proof of Theorem 4.1, which will be useful for us 
for other purposes as well. We need some preliminaries. We now assume $r \ge 2$. 

Let $f : [\kappa]^r \to \gamma$ be an $r$-partition of length $\gamma$. Note that the underlying set $\kappa$ has 
a natural well-ordering. If $A, B$ are subsets of $\kappa$, we write $A < B$ if $a < b$ for 
$a \in A$ and $b \in B$. A subset $X \subseteq \kappa$ is prehomogeneous for $f$ if the color of an 
$r$-tuple in $X$ does not depend on its last element, i.e., $e \in [X]^{r-1}$, $\alpha, \beta \in X$ 
with $e < \{\alpha, \beta\}$ imply that 


$f(e \cup \{\alpha\}) = f(e \cup \{\beta\}).$ 


If $A \subseteq \kappa$, $A < \{\alpha, \beta\} \subseteq \kappa$, we say that $\alpha$ and $\beta$ are twins over $A$ for $f$ if for every 
$e \in [A]^{r-1}$ 


$f(e \cup \{\alpha\}) = f(e \cup \{\beta\}).$ 


We write $\alpha \equiv \beta\ (A, f)$ in this case. Note that if $|A| < r - 1$, $\alpha \equiv \beta\ (A, f)$ always 
holds for $A < \{\alpha, \beta\}$, and if $A$ and $\gamma$ are finite, then the number of equivalence 
classes is finite. 

Now we can associate a sequence $\langle \beta^\alpha_\nu : \nu < \varphi_\alpha \rangle$ to each $0 < \alpha < \kappa$ as follows. We 
define the sequence $\beta^\alpha_\nu$ by recursion on $\nu$. First, $\beta^\alpha_0 = 0$. Assume $\langle \beta^\alpha_\mu : \mu < \nu \rangle$ is 
defined. Set $\{\beta^\alpha_\mu : \mu < \nu\} = B^\alpha_\nu$. Now if there is a $\beta$ with $B^\alpha_\nu < \{\beta\} < 
\{\alpha\}$ which is a twin of $\alpha$ over $B^\alpha_\nu$ for $f$, we write $\beta^\alpha_\nu$ for the minimal $\beta$ of this kind; 
if not we stop, $\beta^\alpha_\nu$ is not defined, and we set $\varphi_\alpha = \nu$. Finally, write $\beta <_f \alpha$, or 
$\beta \prec \alpha$ for short, if $\beta$ occurs in the sequence associated to $\alpha$, i.e., if $\beta = \beta^\alpha_\nu$ for 
some $\nu < \varphi_\alpha$. After these definitions, it is a matter of easy computation to prove: 


Lemma 4.3. $\beta <_f \alpha$ implies $\beta < \alpha$; $(\kappa, <_f)$ is an S-tree of height $\le \kappa$; $\{\alpha : \varphi_\alpha = \nu\}$ 
is the $\nu$th level of this S-tree; every chain of this S-tree is prehomogeneous for $f$. 
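
The construction can be tried out on a finite ordinal. The following sketch treats only the case $r = 2$ with the underlying set `range(n)`; the helper names and the sample coloring are assumptions made here for illustration. For pairs, $\beta$ is a twin of $\alpha$ over a set $A$ exactly when the pairs $\{e, \beta\}$ and $\{e, \alpha\}$ have the same color for every $e \in A$.

```python
from itertools import combinations


def predecessors(alpha, f):
    """Return the sequence beta_0 < beta_1 < ... attached to alpha (case r = 2).

    f(a, b) is the color of the pair {a, b}; it is always called with a < b.
    """
    seq = []
    while True:
        start = seq[-1] + 1 if seq else 0
        # least beta above the sequence so far and below alpha that is a
        # twin of alpha over the sequence built so far
        nxt = next(
            (b for b in range(start, alpha)
             if all(f(e, b) == f(e, alpha) for e in seq)),
            None,
        )
        if nxt is None:
            return seq
        seq.append(nxt)


def is_prehomogeneous(chain, f):
    """For r = 2: the color of a pair depends only on its smaller element."""
    return all(f(a, b) == f(a, c) for a, b, c in combinations(chain, 3))


def example_coloring(a, b):
    # an arbitrary 2-coloring of pairs, chosen only for the demonstration
    return (a * a + 3 * b + a * b) % 5 % 2


n = 40
chains = {alpha: predecessors(alpha, example_coloring) + [alpha]
          for alpha in range(1, n)}
```

Each entry of `chains` lists the predecessors of a point together with the point itself, so it is a chain of the finite tree; checking `is_prehomogeneous` on these chains gives a small sanity check of the last clause of the lemma in this special case.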


The tree $(\kappa, <_f)$ is called the canonical partition tree associated with $f$. 
The use of prehomogeneous sets should be obvious from the following lemma. 


Lemma 4.4. Let $f : [\kappa]^r \to \gamma$. Assume $X \subseteq \kappa$ is prehomogeneous for $f$, $|X| = \lambda$, $X$ 
does not have a largest element and $\lambda \to (\kappa_\alpha)^{r-1}_{\alpha<\gamma}$. Then there is a homogeneous set 
of size $\kappa_\alpha$ for $f$ in color $\alpha$, for some $\alpha < \gamma$. 


Proof. The $r$-partition $f$ induces an $(r-1)$-partition $f'$ of $X$, by the 
prehomogeneity. There is a homogeneous set $Y$ for $f'$ of size $\kappa_\alpha$ in some color $\alpha$, by 
the assumption. $Y$ is homogeneous for $f$ as well. □ 


Now we can easily prove Ramsey’s theorem. We use induction on $r$. For 
$r = 1$, $\omega \to (\omega)^1_k$ is Dirichlet’s box principle. Assume $r > 1$, $\omega \to (\omega)^{r-1}_k$, and let 
$f : [\omega]^r \to k$ be an $r$-partition of length $k$. By Lemma 4.3, $(\omega, <_f)$ is an S-tree. We 
want to see that it is an $\omega$ S-tree. Since $\varphi_n \le n$ for each $n < \omega$, the height of this 
tree is $\omega$. We have to show that each level is finite. This is done by induction. Let 
$T_i$ be the $i$th level. We have $T_0 = \{0\}$. Assume that $T_0, \ldots, T_i$ are finite. Now, 
for an $\alpha \in T_{i+1}$, there are only finitely many possibilities for $\langle \beta^\alpha_j : j \le i \rangle$, since 
$\beta^\alpha_j \in T_j$, and among infinitely many $\alpha$'s with the same set of predecessors 
$B_{i+1} = B^\alpha_{i+1}$ there are two which are twins over $B_{i+1}$ for $f$. Hence, $T_{i+1}$ must be 
finite as well. It now follows, from the form 3.3 of König’s lemma, that $(\omega, <_f)$ 
has a branch $B$. $B$ is clearly infinite and, by Lemma 4.3, it is prehomogeneous for 
$f$. Then, by the inductive assumption and by Lemma 4.4, there is an infinite 
homogeneous set for $f$. □ 


We deduced Ramsey’s theorem from König’s lemma and indeed, the two 
statements are equally strong. Instead of pondering about the exact logical 
meaning of this claim, we remark that, indeed, we proved a much stronger result 
of Erdős and Tarski (1943). 


Theorem 4.5. Assume $\kappa$ is a strongly inaccessible cardinal which has the tree 
property. Then 


$\kappa \to (\kappa)^r_\gamma$ 


holds for $\gamma < \kappa$. 


Indeed, in the proof of Ramsey’s theorem, when establishing that $(\omega, <_f)$ is an 
$\omega$ S-tree, we only used that $\omega$ is strongly inaccessible. 

The paper by Erdős and Tarski (1943) initiated the modern theory of large 
cardinals. There it was recognized that the analogue of Ramsey’s theorem holds 
for strongly inaccessible cardinals having the tree property, but fails for cardinals 
not strongly inaccessible. This results from the following facts: 


$\kappa \not\to (\kappa, \mathrm{cf}(\kappa)^+)^2$ and $2^\kappa \not\to (\kappa^+)^2_2$ for $\kappa \ge \omega$,    (4.6) 




since every cardinal $\lambda$ not strongly inaccessible is either singular, or for some 
$\kappa < \lambda$, $2^\kappa \ge \lambda$ holds. Now the first fact stated in (4.6) is quite obvious, while the 
proof of the second is based on the following idea. Consider ${}^\kappa 2$, i.e., the set of 
0, 1 sequences of length $\kappa$, order its elements lexicographically, and let $<_{\mathrm{lex}}$ be 
this ordering. Prove an easy technical lemma, saying that every increasingly or 
decreasingly well-ordered set in the $<_{\mathrm{lex}}$ ordering has cardinality at most $\kappa$. Also 
fix a well-ordering $\prec$ of ${}^\kappa 2$ and define the two graphs $G_0$ and $G_1$, where $G_0$ 
consists of the pairs parallel in the orderings $<_{\mathrm{lex}}$ and $\prec$, and $G_1$ of the rest of the 
pairs. This is a generalization of a proof of Sierpiński, where the ordering and a 
well-ordering of $\mathbb{R}$ are compared similarly, and which gave the special case 


$2^{\aleph_0} \not\to (\aleph_1)^2_2.$ 
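
The idea of coloring a pair according to whether two orderings agree on it can be tried on a finite set, where a permutation plays the role of the second ordering. This finite analogue is not part of the text; it is offered only as an illustration, and the function names are chosen here. A homogeneous set is then a monotone subsequence of the permutation, and by the Erdős-Szekeres theorem every permutation of 5 points has one of size 3.

```python
from itertools import combinations, permutations


def pair_color(perm, i, j):
    """Color of {i, j} with i < j: 0 if perm agrees with the natural order, else 1."""
    return 0 if perm[i] < perm[j] else 1


def largest_homogeneous(perm):
    """Size of a largest set all of whose pairs get one color (needs len(perm) >= 1)."""
    n = len(perm)
    best = 1
    for size in range(2, n + 1):
        for s in combinations(range(n), size):
            colors = {pair_color(perm, i, j) for i, j in combinations(s, 2)}
            if len(colors) == 1:
                best = size
                break
    return best


# smallest guaranteed homogeneous size over all orderings of 5 points
worst_case = min(largest_homogeneous(p) for p in permutations(range(5)))
```

In the infinite setting the technical lemma quoted above replaces this count: sets on which the two orderings agree, or are opposite, are well-ordered by $<_{\mathrm{lex}}$ or its reverse and therefore small.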


We turn back to the line of thought of the paper by Erdős and Tarski in the 
next section. We now prove the Erdős-Rado theorem, which gives the existence 
of the Ramsey function for infinite cardinals, and even a quantitative bound. First 
we consider the case $r = 2$. 


Theorem 4.7. For $\kappa \ge \omega$, $(2^\kappa)^+ \to ((2^\kappa)^+, (\kappa^+)_\kappa)^2$, and as an obvious corollary of 
this, $(2^\kappa)^+ \to (\kappa^+)^2_\kappa$ holds. 


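Proof. This is a special case of a more general result, and here we can avoid the 
use of the canonical partition tree. Set $\lambda = (2^\kappa)^+$. Let $f : [\lambda]^2 \to \kappa$. We assume that 
there is no homogeneous set of size $\kappa^+$ for $f$ in the colors $\nu$, $1 \le \nu < \kappa$. We will 
prove that there is a homogeneous set of size $\lambda$ for $f$ in color 0. 

Let $A = \{\alpha < \lambda : \mathrm{cf}(\alpha) = \kappa^+\}$. Considering that $\kappa^+ \le 2^\kappa < \lambda$, by Fact A.19, this 
is a stationary subset of $\lambda$. For each $\alpha \in A$ and $1 \le \nu < \kappa$, let $A_{\alpha,\nu}$ be a subset of 
$\alpha + 1$ ($= \alpha \cup \{\alpha\}$), maximal with respect to the property that it contains $\alpha$, and it 
is homogeneous for $f$ in the color $\nu$. Write each $A_{\alpha,\nu}$ in the form $B_{\alpha,\nu} \cup \{\alpha\}$, 
where $B_{\alpha,\nu} \subseteq \alpha$. Let $B_\alpha = \bigcup_{1 \le \nu < \kappa} B_{\alpha,\nu}$. By assumption, $|B_{\alpha,\nu}| \le \kappa$, and hence 
$|B_\alpha| \le \kappa$. Let $g(\alpha) = \sup B_\alpha$ for $\alpha \in A$. Then, by $\mathrm{cf}(\alpha) = \kappa^+$, $g(\alpha) < \alpha$ for $\alpha \in A$, 
hence $g$ is regressive on $A$. By Theorem A.18, there is a stationary subset $A_0 \subseteq A$ 
and an ordinal $\xi < \lambda$, such that $g(\alpha) = \xi$ for $\alpha \in A_0$. Considering that $\xi < \lambda$, we 
have $|\xi| \le 2^\kappa$, and so $|\xi|^\kappa \le (2^\kappa)^\kappa = 2^\kappa < \lambda$. Therefore, there is a stationary subset 
$A_1 \subseteq A_0$ such that for $\alpha, \beta \in A_1$ and for all $1 \le \nu < \kappa$, $B_{\alpha,\nu} = B_{\beta,\nu}$ holds. We claim 
that $A_1$ is homogeneous for $f$ in color 0. Indeed, if $f(\{\alpha, \beta\}) = \nu > 0$ for a pair 
$\alpha, \beta \in A_1$, and say $\beta < \alpha$, then $\alpha + 1 \supseteq B_{\alpha,\nu} \cup \{\beta, \alpha\} \supsetneq B_{\alpha,\nu} \cup \{\alpha\}$, and $B_{\alpha,\nu} \cup 
\{\beta, \alpha\}$ is homogeneous for $f$ in color $\nu$, a contradiction. □ 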


Now we give the Erdős-Rado theorem for $r$-uniform hypergraphs. Let 
$\exp_0(\kappa) = \kappa$, and $\exp_{r+1}(\kappa) = 2^{\exp_r(\kappa)}$, for $r < \omega$. Then 


Theorem 4.8. $(\exp_{r-1}(\kappa))^+ \to (\kappa^+)^r_\kappa$ for $r \ge 1$, $\kappa \ge \omega$. 


Proof. For $r = 1$, this is $\kappa^+ \to (\kappa^+)^1_\kappa$, which is true, since $\kappa^+$ is a regular cardinal. 
We use induction. Assume the statement is true for $r$. To prove the statement for 
$r + 1$, by Lemma 4.4, it is sufficient to prove the following stepping-up lemma. 


Lemma 4.9. Let $\lambda \ge \omega$, and let $f : [(2^\lambda)^+]^{r+1} \to 2^\lambda$. Then there is a subset $X \subseteq (2^\lambda)^+$ 
of size $\lambda^+$ such that $f$ is prehomogeneous on $X$. 


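Proof of Lemma 4.9. Again, this is not the strongest possible statement, so we 
can avoid the use of the canonical partition tree. Set $\rho = (2^\lambda)^+$, and let 
$f : [\rho]^{r+1} \to 2^\lambda$. Again let $A = \{\alpha < \rho : \mathrm{cf}(\alpha) = \lambda^+\}$. Just as in the proof of 
Theorem 4.7, $A$ is a stationary subset of $\rho$. For each $\alpha \in A$, let $A_\alpha$ be a maximal 
subset of $\alpha + 1$ containing $\alpha$ and prehomogeneous for $f$. $A_\alpha$ is of the form 
$B_\alpha \cup \{\alpha\}$, where $B_\alpha \subseteq \alpha$. If $|B_\alpha| = \lambda^+$ for some $\alpha \in A$, we are done. So we may 
assume instead that $|B_\alpha| \le \lambda$ for each $\alpha \in A$. Put $g(\alpha) = \sup B_\alpha$. Then $g(\alpha) < \alpha$ for 
$\alpha \in A$, since $\mathrm{cf}(\alpha) = \lambda^+$. It follows from Theorem A.18 that there are $\xi < \rho$ and 
an $A_0 \subseteq A$, $|A_0| = \rho$, such that $g(\alpha) = \xi$ for $\alpha \in A_0$. Since $\xi < \rho$ implies $|\xi|^\lambda \le 
2^\lambda < \rho$, there are $B \subseteq \xi$ and an $A_1 \subseteq A_0$, $|A_1| = \rho$, such that $B_\alpha = B$ for $\alpha \in A_1$. 
The number of equivalence classes for $\alpha \equiv \beta\ (B, f)$ is at most $(2^\lambda)^{|[B]^r|} \le (2^\lambda)^\lambda = 2^\lambda$. 
Since $|A_1| = \rho > 2^\lambda$, there are $\alpha \ne \beta \in A_1$ such that $\alpha$ and $\beta$ are twins for $f$ over 
$B$. If, say, $\beta < \alpha$, then $B \cup \{\beta, \alpha\}$ is still a subset of $\alpha + 1$ prehomogeneous for $f$, a 
contradiction. □ 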


Note that Theorem 4.7 is stronger than the instance $r = 2$ of Lemma 4.9. The 
last aim of this section is to convince the reader that the seemingly enormous 
bound given by Theorem 4.8 is best possible. The following is a result of Erdős et 
al. (1965). 


Theorem 4.10. $\exp_{r-1}(\kappa) \not\to (\kappa^+)^r_2$ for $\kappa \ge \omega$ and $r \ge 2$. 


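For $r = 2$, this is $2^\kappa \not\to (\kappa^+)^2_2$. We stated this in (4.6). I will only outline the 
proof for $r = 3$, i.e., the proof of $2^{2^\kappa} \not\to (\kappa^+)^3_2$. Let $\lambda = 2^\kappa$, and $S = {}^\lambda 2$, the set of 
0, 1 sequences of length $\lambda$. For $f \ne g \in S$, let $\delta(f, g) = \min\{\alpha < \lambda : f(\alpha) \ne g(\alpha)\}$, 
the so-called first discrepancy of $f$ and $g$. The lexicographical ordering $<_{\mathrm{lex}}$ of $S$ is 
defined by $f <_{\mathrm{lex}} g \Leftrightarrow f(\delta(f, g)) = 0$. Let $\prec$ be a well-ordering of $S$. Let 
$\varphi : [\lambda]^2 \to 2$ be a 2-partition of $\lambda = 2^\kappa$ establishing $\lambda \not\to (\kappa^+)^2_2$. Let $F = \{f_0, f_1, f_2\}$, 
$f_0 \prec f_1 \prec f_2$, be an arbitrary triple of $S$. First, we split the set of the triples of $S$ into 
four parts: 


$K_0 = \{F : f_0 <_{\mathrm{lex}} f_1 <_{\mathrm{lex}} f_2\}$,   $K_1 = \{F : f_0 >_{\mathrm{lex}} f_1 <_{\mathrm{lex}} f_2\}$, 

$K_2 = \{F : f_0 <_{\mathrm{lex}} f_1 >_{\mathrm{lex}} f_2\}$,   $K_3 = \{F : f_0 >_{\mathrm{lex}} f_1 >_{\mathrm{lex}} f_2\}$. 

Let $K = K_0 \cup K_3$ be the class of “smooth” triples. For $F \in K$, let $\Delta(F) = 
\{\delta(f_0, f_1), \delta(f_1, f_2)\}$; it is easy to see that $\Delta(F)$ has two elements for $F \in K$, and 
so $\Delta(F) \in [\lambda]^2$. Now we can define a partition $\Psi : [S]^3 \to 2$. Let $\Psi(F) = 0$ for 
$F \in K_1$, and $\Psi(F) = 1$ for $F \in K_2$. For $F \in K$ put $\Psi(F) = \varphi(\Delta(F))$. 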
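
The first discrepancy and the lexicographical ordering are easy to experiment with when a finite length stands in for $\lambda$. The sketch below uses 0-1 tuples of length 4; the function names are chosen here for illustration, and the triple passed to `discrepancy_pair` is assumed to be listed in its well-order $f_0 \prec f_1 \prec f_2$.

```python
from itertools import combinations, product


def first_discrepancy(f, g):
    """delta(f, g): least index where the distinct sequences f and g differ."""
    return next(i for i, (a, b) in enumerate(zip(f, g)) if a != b)


def lex_less(f, g):
    """f <_lex g  iff  f takes the value 0 at the first discrepancy."""
    return f[first_discrepancy(f, g)] == 0


def discrepancy_pair(f0, f1, f2):
    """Delta(F) for a triple F = {f0, f1, f2} listed in its well-order."""
    return {first_discrepancy(f0, f1), first_discrepancy(f1, f2)}


# all 0-1 sequences of length 4, produced in lexicographically increasing order
S = list(product((0, 1), repeat=4))
```

For a triple that is monotone in $<_{\mathrm{lex}}$ the two discrepancies must differ, since the middle sequence cannot take both values 0 and 1 at the same place; this is why $\Delta(F)$ has two elements on smooth triples.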
To see that this works, we need two sublemmas. 
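
Lemma. (i) Assume that for some $X \subseteq S$ and $\tau \ge \omega$, $|X| = \tau$ and $[X]^3 \cap K_i = \emptyset$ for 
some $i \in \{1, 2\}$. Then there is a $Y \subseteq X$, $|Y| = \tau$, such that $[Y]^3 \subseteq K_0$ or $[Y]^3 \subseteq K_3$. 

(ii) Assume $X \subseteq S$, $|X| = \tau$ is regular, and $\prec$ coincides with $<_{\mathrm{lex}}$ or $>_{\mathrm{lex}}$ on $X$. 
Then there is a sequence $\{f_\nu : \nu < \tau\} \subseteq X$, and a sequence $\{\delta_\nu : \nu < \tau\} \subseteq \lambda$, such that 
$\delta(f_\mu, f_\nu) = \delta_\mu$ for all $\mu < \nu < \tau$. 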


I leave the proofs to the reader. Assume now that $X \subseteq S$ is homogeneous for $\Psi$, 
say in class $i < 2$, and $|X| = \kappa^+$. By (i) we may assume that $<_{\mathrm{lex}}$ or $>_{\mathrm{lex}}$ coincides 
with $\prec$ on $X$; hence all triples of $X$ are from $K$. We may assume that $X = \{f_\nu : \nu < 
\kappa^+\}$ is increasing in the well-ordering, and satisfies (ii) with a sequence $\{\delta_\nu : \nu < 
\kappa^+\}$. Let $\mu < \nu < \kappa^+$. Then $\{\delta_\mu, \delta_\nu\} = \Delta(\{f_\mu, f_\nu, f_{\nu+1}\})$, and hence $\varphi(\{\delta_\mu, \delta_\nu\}) = i$, and 
the set $\{\delta_\nu : \nu < \kappa^+\}$ is homogeneous for $\varphi$ in the color $i < 2$, a contradiction to 
the choice of $\varphi$. □ 


5. Combinatorics and the discovery of large cardinals 


We give the definition of inaccessible cardinals in the Appendix. Let us define 
now the so-called cumulative hierarchy of sets. First, $V_0 = \emptyset$. Then, in general, 
$V_{\alpha+1}$ = the set of all subsets of $V_\alpha$, $V_\alpha = \bigcup_{\beta<\alpha} V_\beta$ for a limit ordinal $\alpha$. It was J. 
von Neumann who gave the first exact proof that we may assume (it is relatively 
consistent) that the class of all sets $V$ consists of $\bigcup_{\alpha \text{ an ordinal}} V_\alpha$, but the fact that 
the sets $V_\kappa$, for $\kappa > \omega$ strongly inaccessible, are natural models for the axioms of 
set theory, was already known to Zermelo. The assumption that such cardinals 
exist seemed to be a natural extension of the axiom-system, working in the way to 
capture “Cantor's absolute”, whatever that might mean. But even with a more 
modest and realistic approach if, as it was clear, set theory is not sufficient to 
decide within it all problems of mathematics, any natural strengthening of the 
axiom-system should be welcome. Quite early in the game, Mahlo (1911) 
suggested further large cardinal assumptions. To understand these let us introduce 
some strengthenings step by step. First we may assume that for every $\lambda$ there is a 
$\kappa > \lambda$ that is strongly inaccessible. Then we can arrange these inaccessible 
cardinals in an increasing sequence $\kappa^1_\alpha$, and wonder if there is a “fixed point”, 
i.e., if there is one with $\kappa^1_\alpha = \alpha$. Now assume first there is one, and then assume 
there is one greater than $\lambda$ for every $\lambda$. Denote the increasing sequence of those 
by $\kappa^2_\alpha$, assume there is one with $\kappa^2_\alpha = \alpha$, and iterate this transfinitely. The 
observation of Mahlo, in present day terminology, was that none of these yield 
the existence of a $\kappa$ such that 


$\kappa$ is strongly inaccessible, and the set of cardinals $\lambda < \kappa$ 
which are strongly inaccessible is stationary in $\kappa$.    (5.1) 


He suggested assuming that cardinals satisfying (5.1) exist. These are nowadays 
called Mahlo cardinals, or 1-Mahlo cardinals. Of course, there is an immediate 
possibility to iterate even this transfinitely, e.g., $\kappa$ is a 2-Mahlo cardinal if 
$\{\lambda < \kappa : \lambda \text{ is Mahlo}\}$ is stationary in $\kappa$, and so on. 

Though this approach was not particularly fruitful in 1911, Mahlo cardinals play 
an important role in present day set theory. Let us remark that, similarly to the 
case of inaccessible cardinals, it is very easy to show the consistency that they do 
not exist. Assuming that there is a $\kappa$ for which, say, (5.1) holds, there is a 
smallest one, say $\kappa_0$, and it is easy to see that $V_{\kappa_0}$ is a model in which there is no $\kappa$ 
satisfying (5.1). The situation is similar with all the large cardinal assumptions 
considered in this section. 

Another problem, investigated by the Polish set theory school in the 1930s, was 
the measure problem, raised by Banach in Banach and Kuratowski (1929). For 
what cardinals $\kappa$ does there exist a probability measure on all subsets of $\kappa$, 
vanishing on one-point sets? Ulam (1930) proved that any such cardinal must be 
weakly inaccessible. Soon it was discovered that the core of this problem is to 
decide the question in the case when the measure can take only the values 0 and 1, 
and Tarski proved that a cardinal $\kappa$ carrying such a measure must be strongly 
inaccessible. In the paper by Erdős and Tarski (1943) where they investigated a 
problem in the theory of partially ordered sets, they obtained an answer involving 
inaccessible cardinals, and in the remarks they formulated the basic connections 
between different properties of large cardinals. 


Definition 5.2. An infinite cardinal $\kappa$ is measurable, if there is a free ultrafilter $U$ 
on the subsets of $\kappa$ which is $\kappa$-complete, i.e., the intersection of fewer than $\kappa$ sets 
in $U$ is in $U$. 


Note that, by Theorem A.14, $\omega$ is measurable. By defining 


$\mu(X) = 1 \Leftrightarrow X \in U,$ 


we see how an ultrafilter corresponds to a 0-1 measure defined on all subsets of $\kappa$. 

Erdős and Tarski proved that if a strongly inaccessible $\kappa$ has the tree property, 
then $\kappa \to (\kappa)^r_\gamma$ holds for $r < \omega$ and $\gamma < \kappa$. We proved this in Theorem 4.5, and 
they proved: 


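Theorem 5.3. If $\kappa$ is measurable, then $\kappa$ has the tree property. 


Proof. Let $(T, <)$ be a $\kappa$ S-tree. We know that $|T| = \kappa$. Let $U$ be an ultrafilter on 
$T$ satisfying the condition in Definition 5.2. For each $x \in T$, let $\check{x} = \{y \in T : x \le 
y\}$. Clearly, for each $\alpha < \kappa$, $T = \bigcup_{\beta<\alpha} T_\beta \cup \bigcup \{\check{x} : x \in T_\alpha\}$. Since $|T_\alpha| < \kappa$ for 
each $\alpha < \kappa$, there is exactly one $x_\alpha \in T_\alpha$ with $\check{x}_\alpha \in U$. Since $\check{x}_\alpha \cap \check{x}_\beta \in U$ for 
$\alpha, \beta < \kappa$, $x_\alpha$ and $x_\beta$ have a common upper bound, and therefore $x_\alpha$ and $x_\beta$ are 
comparable in $T$; hence $\{x_\alpha : \alpha < \kappa\}$ is a branch of $T$. □ 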


Unfortunately, Erdős and Tarski guessed wrong. They speculated that it might
be consistent with the axioms of set theory that all strongly inaccessible cardinals
are measurable. It took eighteen more years to discover that this is not so. In
1960, Hanf, then a student of Tarski, proved that a compactness property, similar
to that formulated in Theorem 3.4 for $\omega$, does not hold for $\kappa$ if
$\kappa$ is the first strongly inaccessible cardinal. He used languages with
infinitely long formulas, and proved that a weak form of Gödel's compactness
theorem does not hold for these languages. He also proved (see Hanf 1964) that
the Ramsey property is really equivalent to the König property, i.e.,


Theorem 5.4. If for some $\kappa \ge \omega$, $\kappa \to (\kappa)^2_2$ holds, then $\kappa$ has the tree property.


We postpone the proof, and first we finish the historical remarks. After
Theorem 5.4 was proved, strongly inaccessible cardinals having the tree property
were called weakly compact cardinals, and their class is called $C_0$;
$C_1 \subset C_0$ is the class of measurable cardinals, and $C_2 \subset C_1$ is the
class of strongly compact cardinals (see our definition after Theorem 3.6). In
Keisler and Tarski (1964) an attempt is made to transform the logic proofs into a
purely set-theoretic proof of the fact that cardinals in $C_0$ are very large,
e.g., $\kappa \in C_0$ is $\alpha$-Mahlo for all $\alpha<\kappa$. However, all
known proofs bear some trace of logic in them.

It is easier to devise direct combinatorial proofs that the first strongly
inaccessible cardinal greater than $\omega$, and the cardinals in the small Mahlo
classes, are not weakly compact. After all, if there is a graph $G$ on $\kappa$,
the first strongly inaccessible cardinal $>\omega$, with $\alpha(G)$ and
$\omega(G) < \kappa$, we would like to see one. We will try to do our best in
Theorem 5.6. First we prove Theorem 5.4.


Proof of Theorem 5.4. Assume $(T, \prec)$ is a $\kappa$ S-tree; let $T_\alpha$,
$\alpha<\kappa$, denote its levels. We now describe a procedure, called the
squashing of the tree, which extends the partial order $\prec$ of $T$ to a total
order $<^*$. For $x \in T$ and $\beta \le \alpha(x)$, let $x_\beta$ be the unique
element $y$ of $T$ with $y \in T_\beta$ and $y \preceq x$. For $x, y \in T$, let
$\delta(x, y) = \min\{\beta : x_\beta \ne y_\beta\}$ if $x$ and $y$ are
incomparable. $<^*$ is very similar to a lexicographic order. For each
$\alpha<\kappa$ choose an arbitrary ordering $<_\alpha$ of the level $T_\alpha$,
and put


$$x <^* y \iff \text{either } x \prec y, \text{ or } x \text{ and } y \text{ are incomparable and } x_\alpha <_\alpha y_\alpha \text{ for } \alpha = \delta(x, y).$$


It is easy to check that $<^*$ is a total order of $T$, and that $x <^* y$
implies that


$$x_\alpha \le^* y_\alpha \quad \text{for } \alpha(x), \alpha(y) \ge \alpha. \tag{5.5}$$


Now let $<_0$ be a well-ordering of $T$. Using the assumption
$\kappa \to (\kappa)^2_2$, it now follows from the idea of Sierpiński, described
in the proof of (4.6), that there exists a set $S \subset T$, $|S| = \kappa$,
which is either increasingly or decreasingly well-ordered in the ordering $<^*$.
We may assume that $(S, <^*)$ is well-ordered and of order-type $\kappa$. We know
from (4.6) that $\kappa$ is regular. It follows now from (5.5) that there is an
element $x^\alpha$ of $T_\alpha$ such that $y_\alpha = x^\alpha$ holds for all but
fewer than $\kappa$ elements $y$ of $S$. Then $\{x^\alpha : \alpha<\kappa\}$ is a
branch of $(T, \prec)$. □

We now prove:

Theorem 5.6. Let $\kappa > \omega$ be the first strongly inaccessible cardinal. Then $\kappa \not\to (\kappa)^2_2$.


Proof. By Theorem 5.4 we only have to prove that $\kappa$ does not have the tree
property. Let $B$ denote the set of infinite strong limit cardinals $<\kappa$, i.e.,
$B = \{\lambda<\kappa : \lambda$ is a limit cardinal, and for all $\tau<\lambda$,
$2^\tau<\lambda$ holds$\}$. It is very easy to see, using that $\kappa$ is strongly
inaccessible, that $B$ is a club set in $\kappa$. We will need the following
sublemma.


Lemma 5.6'. For $\lambda \in B$ there is a one-to-one regressive function on $B \cap \lambda$.


Using this lemma we can easily finish the proof. Let $T$ be the set of functions
$f$ such that $\mathrm{Dom}(f) = \lambda \cap B$ for some $\lambda \in B$, and $f$
is one-to-one and regressive on $\lambda \cap B$. Define $\prec$ on $T$ by
$f \prec g \iff f \subset g$. $(T, \prec)$ is clearly an S-tree. We can see what
the $\alpha$th level of this tree is. If $\{\lambda_\alpha : \alpha<\kappa\}$ is an
increasing enumeration of the elements of $B$, then
$T_\alpha = \{f \in T : \mathrm{Dom}(f) = \lambda_\alpha \cap B\}$. It is also clear
that $|T_\alpha| \le \lambda_\alpha^{\lambda_\alpha} = 2^{\lambda_\alpha} < \kappa$
for $\alpha<\kappa$. The sublemma 5.6' tells us that $T_\alpha \ne \emptyset$ for
$\alpha<\kappa$, and hence the height of $T$ is $\kappa$ and $(T, \prec)$ is a
$\kappa$ S-tree. On the other hand, $(T, \prec)$ has no branch, since if $b$ is a
branch, then $F = \bigcup\{f : f \in b\}$ would be a one-to-one regressive
function on $B$. By Theorem A.18 this contradicts the fact that $B$ is stationary.
Hence $\kappa$ does not have the tree property. □


Before proving Lemma 5.6’, we prove the following: 


Lemma 5.6''. Assume $\tau \in B$, $f_\tau$ is a one-to-one regressive function on
$B \cap \tau$, and $\rho<\tau$ is a cardinal. Then there is a function $f'_\tau$
having the same property, and such that, for $\rho^+ \le \sigma \in B$,
$f'_\tau(\sigma) < \rho^+$ implies that $f'_\tau(\sigma) > \rho$, and
$\{f'_\tau(\sigma) : \sigma \in B \cap [\rho^+, \tau)\}$ omits a subset of size
$\rho^+$ from $(\rho, \rho^+)$.


Proof. Indeed, $|(\rho, \rho^+)| = \rho^+$ by Theorem A.7, and $f_\tau$ being
one-to-one implies that there are at most $\rho^+$ among the $\sigma \ge \rho^+$ in
$B$ with $f_\tau(\sigma) < \rho^+$. We only have to change the values of $f_\tau$
for these $\sigma$, and this can obviously be done. □


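Proof of Lemma 5.6'. By transfinite induction on $\lambda$. Lemma 5.6' is obvious
for the minimal element of $B$, which is $\omega$. Assume $\lambda > \omega$, and
that there is an $f_\tau$ regressive and one-to-one on $B \cap \tau$ for every
$\tau<\lambda$, $\tau \in B$. Now, if $\lambda$ is not a limit point of $B$, then
$B$ being closed implies that $B \cap \lambda$ has a largest element $\tau$. By
Lemma 5.6'', we change $f_\tau$ so that its range omits one ordinal $\xi<\tau$, and
we define $f_\lambda(\tau) = \xi$ and $f_\lambda(\sigma) = f_\tau(\sigma)$ for
$\sigma \in \tau \cap B$. Hence we may assume $\lambda = \sup B \cap \lambda$.
Since $\kappa$ is the first inaccessible $>\omega$, $\lambda$ is singular,
$\mathrm{cf}(\lambda) < \lambda$. There is an increasing continuous sequence
$\{\lambda_\xi : \xi<\mathrm{cf}(\lambda)\}$ of cardinals $<\lambda$ with
$\sup\{\lambda_\xi : \xi<\mathrm{cf}(\lambda)\} = \lambda$ and
$\lambda_0 = \mathrm{cf}(\lambda)$. Continuity means that
$\lambda_\xi = \sup\{\lambda_\eta : \eta<\xi\}$ for $\xi<\mathrm{cf}(\lambda)$ if $\xi$ is a limit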
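ordinal. By continuity,
$\lambda = \lambda_0 \cup \bigcup_{\xi<\mathrm{cf}(\lambda)} [\lambda_\xi, \lambda_{\xi+1})$.
Define, by Lemma 5.6'', $h_{\lambda_{\xi+1}}$ for $\xi<\mathrm{cf}(\lambda)$ so that
$\lambda_\xi < h_{\lambda_{\xi+1}}(\tau)$ for $\lambda_\xi<\tau<\lambda_{\xi+1}$,
and $(\lambda_\xi^+ \setminus \lambda_\xi) \setminus \mathrm{Ran}(h_{\lambda_{\xi+1}})$
has cardinality $\lambda_\xi^+$, and $\lambda_0 \setminus \mathrm{Ran}(h_{\lambda_0})$
is not empty. Define $f_\lambda(\tau) = h_{\lambda_0}(\tau)$ for $\tau<\lambda_0$,
$f_\lambda(\lambda_0) \in \lambda_0 \setminus \mathrm{Ran}(h_{\lambda_0})$,
$f_\lambda(\tau) = h_{\lambda_{\xi+1}}(\tau)$ for $\lambda_\xi<\tau<\lambda_{\xi+1}$,
and extend $f_\lambda$ to $\{\lambda_\xi : 1 \le \xi<\mathrm{cf}(\lambda)\}$ using a
one-to-one function mapping this set into
$(\lambda_0, \lambda_0^+) \setminus \mathrm{Ran}(h_{\lambda_1})$. □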


The proof of Theorem 5.6 extends to all $\kappa$ for which there are sets of
cardinals $<\kappa$, $A \subset B$, $A$ stationary in $\kappa$, $B$ closed in
$\kappa$, such that for all $\lambda \in B$, $\lambda$ regular, $A \cap \lambda$ is
nonstationary in $\lambda$. This is easily seen for cardinals $\kappa$ in the first
$\omega$ levels of the Mahlo hierarchy. The proof, of course, comes from general
reflection principles, valid for weakly compact cardinals.

To take up the thread of the tale again, large cardinals were also approached
through another combinatorial problem. In Erdős and Rado (1956) the following
problem was considered. Let $f_r : [\kappa]^r \to \gamma$ be a sequence of
$r$-partitions of length $\gamma$ of $\kappa$, for $r \in N$. For what cardinals
$\lambda$ does there exist a set $X \subset \kappa$, $|X| = \lambda$, which is
simultaneously homogeneous for all these partitions? They briefly denoted this
statement by $\kappa \to (\lambda)^{<\omega}_\gamma$, and the negation of it by
$\kappa \not\to (\lambda)^{<\omega}_\gamma$. They proved
$2^{\omega} \not\to (\omega)^{<\omega}_2$ for the first attempt. Later, in Erdős
and Hajnal (1958) it was proved that $\lambda \not\to (\omega)^{<\omega}_2$ holds
for all $\lambda$ less than the first uncountable strongly inaccessible cardinal,
and more importantly, it was proved that:


Theorem 5.7. For $\kappa > \omega$ measurable, $\kappa \to (\kappa)^{<\omega}_\gamma$ holds for $\gamma<\kappa$.


We intend to prove this theorem soon, but first we make some remarks.
$\kappa \to (\kappa)^{<\omega}_2$ is the same as
$\kappa \to (\kappa)^{<\omega}_\gamma$ for every $\gamma<\kappa$. Cardinals with
this property are called Ramsey cardinals; cardinals with the weaker property
$\kappa \to (\omega_1)^{<\omega}_2$ are called Erdős cardinals. One important
property of the measurable and generally the large cardinals is that they change
the small sets of the universe as well. The set of reals will have different
concrete mathematical properties if we assume the existence of different large
cardinals. The earliest results of this type were proved through these
combinatorial properties. It was proved in Scott (1961) that the existence of a
measurable cardinal contradicts Gödel's axiom of constructibility. Rowbottom
(1971) proved that the existence of an Erdős cardinal implies that there are only
countably many constructible reals, and Solovay (1969) proved that the existence
of an Erdős cardinal implies that every $\Pi^1_1$ set of reals (analytic
complement) is either countable or has a perfect subset. The deepest results for
$\kappa \to (\lambda)^{<\omega}_2$ were proved in Silver (1966). Here it is proved
that the first cardinal $\kappa$ with $\kappa \to (\omega)^{<\omega}_2$ is very
large, in fact, larger than the first weakly compact cardinal, but it still can
exist in the constructible universe. Also much stronger conclusions are drawn from
the existence of an Erdős cardinal than the ones mentioned before. However,
lacking space we cannot go into all that.

For the proof of Theorem 5.7 we will need a concept introduced by Scott. 

Let $\kappa > \omega$ be a regular cardinal, and let $U$ be an ultrafilter on the
subsets of $\kappa$. $U$ is said to be normal if it is free and for every
$A \in U$ and for every regressive function $f$ on $A$, $f$ is constant almost
everywhere, i.e., there are a $B \subset A$, $B \in U$, and $\xi<\kappa$ such that
$f(\alpha) = \xi$ for $\alpha \in B$.


Lemma 5.8. If $\kappa > \omega$ is measurable, then $\kappa$ carries a $\kappa$-complete normal ultrafilter.


Proof (Outline). Let $V$ be a $\kappa$-complete free ultrafilter on $\kappa$. Let
$S = {}^{\kappa}\kappa$. For $f, g \in S$, define
$f \prec g \iff \{\xi<\kappa : f(\xi) < g(\xi)\} \in V$. There is no decreasing
sequence $f_0 \succ \cdots \succ f_n \succ \cdots$. Indeed, otherwise, for
$n<\omega$,


$$A_n = \{\xi<\kappa : f_{n+1}(\xi) < f_n(\xi)\} \in V.$$


Hence $\bigcap_{n<\omega} A_n \in V$, but then, for some
$\xi \in \bigcap_{n<\omega} A_n$, $f_0(\xi) > \cdots > f_n(\xi) > \cdots$ would be a
decreasing sequence of ordinals. We write $f \equiv g\ (V)$ in case
$\{\xi<\kappa : f(\xi) = g(\xi)\} \in V$, and $\bar{\xi}$ for the constant $\xi$
function.

Let $S^* = \{f \in S : f \not\equiv \bar{\xi}\ (V)$ for any $\xi<\kappa\}$. Then
$S^*$ is non-empty, as is shown by the existence of the identity function. By the
well-foundedness of $\prec$, $S^*$ has a $\prec$-minimal element $f_0$. This means
that if $g(\xi) < f_0(\xi)$ on a set in $V$, then $g \equiv \bar{\gamma}\ (V)$ for
some $\gamma<\kappa$. Now let $U = \{A \subset \kappa : f_0^{-1}(A) \in V\}$. It is
a matter of easy computation to see that $U$ is a $\kappa$-complete normal
ultrafilter on $\kappa$.


We next need a lemma of Rowbottom (1971). 


Lemma 5.9. Assume $U$ is a $\kappa$-complete normal ultrafilter on $\kappa$,
$r \in N$, and let $f : [\kappa]^r \to \gamma$ be an $r$-partition of length
$\gamma$ of $\kappa$ for some $\gamma<\kappa$. Then there is a set $X \in U$
homogeneous for $f$.


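Proof. By induction on $r$. For $r = 1$ the statement follows simply from the
$\kappa$-completeness of $U$. Assume the statement is true for some $r \ge 1$, and
let $f : [\kappa]^{r+1} \to \gamma$ be an $(r+1)$-partition of length $\gamma$ of
$\kappa$. Let $f_\alpha : [\kappa - (\alpha+1)]^r \to \gamma$ be defined by:


$$f_\alpha(e) = f(\{\alpha\} \cup e) \quad \text{for } e \in [\kappa]^r,\ \{\alpha\} < e.$$


By the induction hypothesis, there is an ordinal $\nu_\alpha$ and a set
$Y_\alpha \in U$ homogeneous for $f_\alpha$ in the color $\nu_\alpha$, for
$\alpha<\kappa$.

Let $Y = \{\beta<\kappa : \beta \in \bigcap_{\alpha<\beta} Y_\alpha\}$. We claim
that $Y \in U$. Otherwise $\kappa \setminus Y = Z \in U$. For $\beta \in Z$ we can
define a $g(\beta) < \beta$ so that $\beta \notin Y_{g(\beta)}$. But then
$g(\beta) = \alpha$ for some $\alpha<\kappa$ on a subset $Z' \subset Z$,
$Z' \in U$, and hence $Z' \cap Y_\alpha = \emptyset$, a contradiction. Now there
is a $W \subset Y$, $W \in U$, and $\nu<\gamma$ such that $\nu_\alpha = \nu$ for
$\alpha \in W$. Clearly, $W$ is homogeneous for $f$ in color $\nu$. □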


Lemmas 5.8 and 5.9 easily yield Rowbottom's proof of the Erdős–Hajnal
theorem 5.7.


Proof of Theorem 5.7. Let $\kappa > \omega$ be measurable, $\gamma<\kappa$, and let
$f_r : [\kappa]^r \to \gamma$ be given for $r \in N$. By Lemma 5.8, we can choose a
$\kappa$-complete normal ultrafilter $U$ on $\kappa$. By Lemma 5.9, for each
$r \in N$, there is a $Y_r \in U$ homogeneous for $f_r$. Then
$Y = \bigcap_{r \in N} Y_r \in U$ is homogeneous for each $f_r$, $r \in N$. □

Before finishing this section there is one final point I want to make. First I want
to illustrate it with an example. I formulate a very innocent looking problem of
Erdős. Two subsets of $\omega_1$ are called almost disjoint if their intersection
is countable. It is one of the most elementary theorems of set theory that there is
a family of size larger than $\omega_1$ of pairwise almost disjoint subsets of
$\omega_1$.


Problem 5.10. Does there exist a family of size larger than $\omega_1$ of pairwise
almost disjoint stationary subsets of $\omega_1$?


The following results are known. It is not hard to see that an affirmative answer
is consistent. A yes answer holds in the constructible universe. The consistency of
a no answer implies the consistency of a measurable cardinal, and is implied by
the consistency of a supercompact cardinal (this is something even "larger" than
strongly compact). The exact consistency strength of Problem 5.10 is not known
(see Foreman et al. 1988).

As is shown by this example, an infinite combinatorial problem, not involving 
large cardinals, may be untreatable without them. The moral of this is that there 
is no infinite combinatorics without large cardinals. 


6. The coloring number and the chromatic number of infinite graphs 


As we have already mentioned, we only consider simple graphs. The concept of
the coloring number of a graph was introduced in Erdős and Hajnal (1966). Let $<$
be a well-ordering of $V$. $<$ is a $\kappa$-well-ordering for $G$ if for every
vertex $x \in V$ the set $N(<, x)$, i.e., the set of neighbors of $x$ smaller than
$x$, has cardinality less than $\kappa$. The coloring number of $G$, denoted by
$\mathrm{Col}(G)$, is the smallest cardinal $\kappa$ for which there exists a
$\kappa$-well-ordering of $G$. Clearly, $\mathrm{Col}(G) \le \Delta(G)$, and it is
also obvious that here strict inequality can hold. Hence the following is a
strengthening of Theorem 2.1.


Theorem 6.1. $\chi(G) \le \mathrm{Col}(G)$ for every graph $G$.

However, the proof of Theorem 2.1 we described gives Theorem 6.1 as well.
The example of $K_{\kappa,\kappa}$ shows that the coloring number can be
arbitrarily large, while the chromatic number is two.

Note added in proof. This concept was later introduced in finite combinatorics
under a different name. In Bollobás (1978) $G$ is called $k$-degenerate iff
$\mathrm{Col}(G) \le k+1$.
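
For finite graphs the coloring number can be computed directly, and the greedy argument behind Theorem 6.1 can be run explicitly. The following is a minimal sketch under stated assumptions: the graph is finite and given as a dictionary of adjacency sets, and the function names `coloring_order`, `greedy_coloring` and `complete_bipartite` are illustrative choices, not notation from the text. It repeatedly removes a vertex of minimum degree; reversing the removal order gives an ordering in which every vertex has fewer than Col(G) smaller neighbors.

```python
from itertools import count


def coloring_order(graph):
    """Return (order, col) for a finite simple graph {vertex: set_of_neighbors}.

    Every vertex has fewer than `col` neighbors that precede it in `order`,
    and `col` is the smallest number for which such an ordering exists.
    """
    remaining = {v: set(nbrs) for v, nbrs in graph.items()}
    removed = []
    worst = 0
    while remaining:
        # Remove a vertex of minimum degree in what is left of the graph.
        v = min(remaining, key=lambda u: len(remaining[u]))
        worst = max(worst, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        removed.append(v)
    removed.reverse()  # the vertex removed first comes last in the ordering
    return removed, worst + 1


def greedy_coloring(graph, order):
    """Color the vertices along `order`, always using the least free color."""
    color = {}
    for v in order:
        used = {color[u] for u in graph[v] if u in color}
        color[v] = next(c for c in count() if c not in used)
    return color


def complete_bipartite(m, n):
    """The complete bipartite graph K_{m,n} as adjacency sets."""
    left = [("x", i) for i in range(m)]
    right = [("y", j) for j in range(n)]
    graph = {v: set(right) for v in left}
    graph.update({v: set(left) for v in right})
    return graph
```

Coloring greedily along the returned ordering never needs more than `col` colors, which is the finite content of Theorem 6.1. On the complete bipartite graph with four vertices on each side the coloring number is 5 while two colors suffice for a good coloring, a finite shadow of the remark about $K_{\kappa,\kappa}$.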


The following is another nice fact to know about the coloring number.


Theorem 6.2. Assume $\mathrm{Col}(G) = \kappa$. Then $G$ has a
$\kappa$-well-ordering $<$ with $\mathrm{typ}\, V(<) = |G|$, the smallest possible
type.


I leave the proof to the reader.

A well-known result of Erdős (1959) (see chapter 4) tells us that for every
$r \in \omega$ there is a finite graph $G_r$ with chromatic number and girth larger
than $r$; and as a corollary of this, for every $r \in \omega$ there is a countable
graph of infinite chromatic number with girth larger than $r$. The story for
uncountable chromatic number is quite different. We will show that a graph of
uncountable chromatic number must contain every even circuit. On the other hand,
let us define the odd girth of a graph as the length of the smallest odd circuit
contained in it. We will show that, for every $\kappa \ge \omega$ and $r<\omega$,
there are graphs with $\chi(G) = \kappa$ and odd girth greater than $r$, even such
that $|V| = \kappa$. We will also prove that for a graph of chromatic number
$>\omega$, there is an $r<\omega$ such that it contains all circuits of length
$>r$. For the history of these results see the paper by Erdős and Hajnal already
mentioned.

More generally, it was proved in the above paper that a graph of uncountable
chromatic number must contain a $K_{r,\omega_1}$ for every $r<\omega$ and hence
every finite bipartite graph.

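We are going to prove an even stronger result of Hajnal and Komjáth (1984).
For this we have to define (the isomorphism type of) certain special graphs.
$H_{\lambda,\lambda}$ is the $\lambda$ by $\lambda$ half-bipartite graph. The
vertex set of $H_{\lambda,\lambda}$ is the disjoint union of two well-ordered sets
$\{x_\alpha : \alpha<\lambda\}$ and $\{y_\alpha : \alpha<\lambda\}$, and the edges
of $H_{\lambda,\lambda}$ are the pairs $\{x_\alpha, y_\beta\}$ for $\alpha<\beta$.
Note that the vertices $x_\alpha$ have degree $\lambda$ while the vertices
$y_\alpha$ have degree $<\lambda$. The graph $H_{\lambda,\lambda} + 1$ consists of
$H_{\lambda,\lambda}$ with one added point $y$ adjacent to all $x_\alpha$;
$H_{\lambda,\lambda} + 2$ is $H_{\lambda,\lambda}$ with two added points $y$ and
$z$ both adjacent to all $x_\alpha$. $H_{\lambda,\lambda} + \otimes$ is
$H_{\lambda,\lambda}$ with a set $C$ of $\lambda^+$ points added, such that there
are $y \in C$ and a sequence
$C = C_0 \supset \cdots \supset C_\alpha \supset \cdots$ for $\alpha<\lambda$ with
$\bigcap_{\alpha<\lambda} C_\alpha \supset \{y\}$, $|C_\alpha| > \lambda$ and
$C_\alpha \subset N(x_\alpha)$ for $\alpha<\lambda$. Hence
$H_{\lambda,\lambda} + \otimes$ contains $H_{\lambda,\lambda} + 1$ and
$K_{r,\lambda^+}$ for $r<\omega$.


Theorem 6.3. Assume $\mathrm{Col}(G) > \omega$. Then $G$ contains $H_{\omega,\omega} + \otimes$.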


Proof. We prove the statement by transfinite induction on the cardinality of $G$.
The theorem is obviously true for all $G$ with $|V| \le \omega$. Assume
$|G| = \kappa > \omega$, and the statement is true for all graphs of size less than
$\kappa$. We also assume that $H_{\omega,\omega} + \otimes \not\subset G$, and we
will prove $\mathrm{Col}(G) \le \omega$. For an arbitrary finite subset
$X \subset V$, we will denote by $CN(X)$ the set of common neighbors of $X$, i.e.,
the set


$$\bigcap \{N(v) : v \in X\}.$$


Now we define $F : [V]^{<\omega} \to [V]^{\le\omega}$ as follows. For
$X \in [V]^{<\omega}$, let $F(X) = CN(X)$ in case $CN(X)$ is countable, and
$F(X) = \emptyset$ otherwise. We will say that a set $A \subset V$ is closed if it
is closed with respect to $F$, i.e., $X \in [A]^{<\omega}$ implies
$F(X) \subset A$. It is easy to see that for each $B \subset V$ there is a smallest
closed set containing $B$. We will denote this by $\bar{B}$. Note that
$|\bar{B}| = |B|$ holds for any infinite $B$. Now again a standard argument gives
that for $|V| = \kappa$ there is an increasing continuous sequence
$\{A_\alpha : \alpha<\kappa\}$ of closed subsets of $V$ such that
$|A_\alpha| < \kappa$ for $\alpha<\kappa$. Indeed, this sequence can be defined by
transfinite recursion so that $A_{\alpha+1}$ is always a closed extension of
$A_\alpha$, and $A_\beta = \bigcup_{\alpha<\beta} A_\alpha$ for every limit ordinal
$\beta$. Here we use the fact that the union of an increasing sequence of closed
sets is closed. To exhaust $V$, one has to fix a well-ordering $<$ of type
$\kappa$ of $V$, choose the $<$-minimal element $x_\alpha$ of
$V \setminus A_\alpha$, and set $A_{\alpha+1} = \overline{A_\alpha \cup \{x_\alpha\}}$.
Choosing $A_0 = \emptyset$, it is easy to prove by induction that
$|A_\alpha| \le |\alpha| + \omega < \kappa$. Let
$B_\alpha = A_{\alpha+1} \setminus A_\alpha$. By the continuity of the $A_\alpha$
sequence, the "rings" $B_\alpha$ are disjoint and their union is $V$. We now claim
that $|N(y) \cap A_\alpha| < \omega$ for $y \in B_\alpha$, $\alpha<\kappa$. Assume
indirectly that for some $\alpha<\kappa$ and $y \in B_\alpha$,
$N(y) \cap A_\alpha$ is infinite, and let $\{x_n : n<\omega\}$ be a one-to-one
enumeration of a subset of it. Let $X_n = \{x_0, \ldots, x_n\}$ for
$n \in \omega$, $C_n = CN(X_n)$. Now $y \in CN(X_n)$, and hence, since $A_\alpha$
is closed and $y \notin A_\alpha$, we have $CN(X_n) \ne F(X_n)$ and then, by the
definition of $F$, $|CN(X_n)| > \omega$. This contradicts the assumption that
$H_{\omega,\omega} + \otimes \not\subset G$. Now, by the induction hypothesis,
$H_{\omega,\omega} + \otimes \not\subset G$ implies that for each $G[B_\alpha]$
there exists an $\omega$-well-ordering $<_\alpha$. Define the well-ordering $<$ by
$B_\alpha < B_\beta$ for $\alpha<\beta$, and $< \, = \, <_\alpha$ on $B_\alpha$,
for $\alpha, \beta<\kappa$. Then
$N(<, x) = (N(x) \cap A_\alpha) \cup N(<_\alpha, x)$ for $x \in B_\alpha$. Hence
$N(<, x)$, a union of two finite sets, is finite for $x \in V$. □


It is proved in Hajnal and Komjáth (1984) that there is a graph of size
$2^{\omega}$ with $\chi(G) > \omega$ not containing $H_{\omega,\omega} + 2$. We do
not give a proof here.


Next we give the proof of the theorem of Erdős et al. (1972) already
mentioned.


Theorem 6.4. Assume $\chi(G) > \omega$. There is an $r<\omega$ such that $C_s \subset G$ for all $s>r$.


Proof. First we mention two elementary facts: (i) assume
$V = \bigcup_{\beta<\alpha} V_\beta$, $G = (V, E)$; then


$$\chi(G) \le \sum_{\beta<\alpha} \chi(G[V_\beta]) \tag{6.4a}$$

and (ii) assume $G_\beta$, $\beta<\alpha$, are graphs on the same vertex set,
$G = \bigcup_{\beta<\alpha} G_\beta$; then

$$\chi(G) \le \prod_{\beta<\alpha} \chi(G_\beta). \tag{6.4b}$$


Let $G = (V, E)$, $\chi(G) > \omega$ be given. We may assume that $G$ is connected.
Let $x \in V$, and let $A_i$ be the set of $y$ with distance $i$ from $x$. By
(6.4a), there is an $i \in N$ with $\chi(G[A_i]) > \omega$. Let $G[A_i] = G'$,
$A_i = V'$. Note that for any pair $y \ne z \in V'$ there is a walk of length $2i$
from $y$ to $z$ with inner points outside $V'$; hence there is a path of length
$2j$, $j = j(y, z)$, $1 \le j \le i$, with the same property. Now split $G'$ into
the union of $G'_j$ according to what $j(y, z)$ is. By (6.4b), for some $j$,
$\chi(G'_j) > \omega$, and hence, by Theorem 6.3, for every $k \ge 2$ there is a
$C_{2k}$ in $G'_j$. Replacing an arbitrary edge $\{y, z\}$ of this $C_{2k}$ with
the even path of length $2j$, we get an odd circuit of length $2j + 2k - 1$ for
$k \ge 2$. □

Note that using the stronger form of Theorem 6.3, we get an edge of $G$ which is
contained in circuits of every length $\ge r$ for some $r \in \omega$.

We now turn to the examples mentioned before. 

Let $(A, <)$ be an ordered set, $r \ge 2$. The $r$-shift-graph on $(A, <)$ is a
graph $\mathrm{Sh}_r(A, <)$ with vertex set $[A]^r$. Two vertices
$\{x_0, \ldots, x_{r-1}\}, \{y_0, \ldots, y_{r-1}\} \in [A]^r$ are joined in
$\mathrm{Sh}_r(A, <)$ if and only if $x_0 < \cdots < x_{r-1}$,
$y_0 < \cdots < y_{r-1}$, and either $y_0 = x_1, \ldots, y_{r-2} = x_{r-1}$ or
$x_0 = y_1, \ldots, x_{r-2} = y_{r-1}$.


We also define the 2-shift-graph of a graph $G = (V, E)$ with respect to the
ordering $<$ of $V$. This will be denoted by $\mathrm{Sh}(G, <)$. The vertex set of
this graph is $E$, while two pairs $\{x, y\}, \{z, w\} \in E$, $x<y$, $z<w$, are
joined in $\mathrm{Sh}(G, <)$ if and only if either $y = z$ or $x = w$. Note that
$\mathrm{Sh}(G, <)$ is a subgraph of the line graph $L(G)$ of $G$. The following
lemma establishes a connection between the two concepts defined above.


Lemma 6.5. Let $r \ge 2$, and let $(A, <)$ be an ordered set. There is an ordering
$<_r$ of $[A]^r$ such that


$$\mathrm{Sh}_{r+1}(A, <) \cong \mathrm{Sh}(\mathrm{Sh}_r(A, <), <_r).$$


Proof. Identify $[A]^r$ with the set of $<$-increasing sequences of length $r$ from
$A$ and choose $<_r$ as the lexicographic ordering. □


Lemma 6.6. Assume $G = (V, E)$ does not contain a $C_{2j+1}$ for $j<k$, for some
$k \ge 0$. Then $\mathrm{Sh}(G, <)$ does not contain a $C_{2i+1}$ for $i<k+1$, for
any ordering $<$ of $V$.


Proof. Assume $e_0, \ldots, e_{2i}$ is a circuit of $\mathrm{Sh}(G, <)$ without a
chord, for some $i<k+1$. It is easy to see that by deleting all $e_l$ for which
$e_{l-1}$ and $e_{l+1}$ have the same point in common with $e_l$, we obtain a
circuit $C$ of $G$. An $e_l$ was deleted if and only if either its upper endpoint
is a local minimum of $C$ or its lower endpoint is a local maximum of $C$ in the
ordering $<$, and both cannot hold for any $e_l$. Hence the number of deleted
$e_l$ is even. □


Corollary 6.7. Assume $r \ge 2$, and let $(A, <)$ be any ordered set. The odd girth
of $\mathrm{Sh}_r(A, <)$ is at least $2r+1$.


Proof. By induction on $r$. It is clear that $\mathrm{Sh}_2(A, <)$ does not contain
a $K_3$. Assume $r \ge 2$ and that the statement is true for
$\mathrm{Sh}_r(A, <)$. Then, by Lemmas 6.5 and 6.6, it is true for
$\mathrm{Sh}_{r+1}(A, <)$ as well. □
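
For a finite ordered set the shift graph is small enough to build explicitly, which makes it easy to experiment with its odd circuits. The following is a sketch under stated assumptions: the ordered set is $\{0, \ldots, n-1\}$ with its natural order, and the names `shift_graph` and `odd_girth` are hypothetical helpers introduced only for this illustration. The odd girth is found by a breadth-first search from every vertex: an edge joining two vertices at the same distance $d$ from the root closes an odd walk of length $2d+1$.

```python
from collections import deque
from itertools import combinations


def shift_graph(n, r):
    """Adjacency sets of Sh_r on {0, ..., n-1}; vertices are increasing r-tuples."""
    adj = {v: set() for v in combinations(range(n), r)}
    for x in adj:
        # x = (x_0, ..., x_{r-1}) is joined to (x_1, ..., x_{r-1}, t) for t > x_{r-1}.
        for t in range(x[-1] + 1, n):
            y = x[1:] + (t,)
            adj[x].add(y)
            adj[y].add(x)
    return adj


def odd_girth(adj):
    """Length of a shortest odd circuit, or None if the graph is bipartite."""
    best = None
    for root in adj:
        dist = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                elif dist[v] == dist[u]:
                    # Two vertices on the same level are adjacent: odd closed walk.
                    length = 2 * dist[u] + 1
                    if best is None or length < best:
                        best = length
    return best
```

On five points the 2-shift-graph already contains the circuit $\{0,1\}, \{1,3\}, \{3,4\}, \{2,3\}, \{1,2\}$ of length 5, and on seven points the 3-shift-graph contains a circuit of length 7, while no shorter odd circuit occurs in either graph.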


Theorem 6.8. Assume $|A| > \exp_{r-1}(\kappa)$ for $r \ge 2$ and
$\kappa \ge \omega$. Then $\chi(\mathrm{Sh}_r(A, <)) > \kappa$ for any ordering $<$
of $A$.


Proof. Assume $f : [A]^r \to \kappa$ is a good coloring of $[A]^r$, which is the
vertex set of $\mathrm{Sh}_r(A, <)$. By the Erdős–Rado theorem 4.8, there is a
subset $\{x_0, \ldots, x_r\} \subset A$, $x_0 < \cdots < x_r$, homogeneous for $f$.
But then $f(\{x_0, \ldots, x_{r-1}\}) = f(\{x_1, \ldots, x_r\})$, and
$\{x_0, \ldots, x_{r-1}\}$ is adjacent to $\{x_1, \ldots, x_r\}$ in
$\mathrm{Sh}_r(A, <)$, a contradiction. □


As a corollary of Corollary 6.7 and Theorem 6.8, for every $r \ge 2$ and
$\kappa \ge \omega$, we obtain a graph with odd girth at least $2r+1$ and having
chromatic number greater than $\kappa$. Since the cardinality of this graph is much
larger than $\kappa^+$, we must give another example as well. I described the
shift-graphs in detail because I think they are quite basic in both finite and
infinite combinatorics. They provide an easy answer to quite a few problems. To
illustrate this point I prove a theorem of Erdős and Hajnal mentioned earlier.


Theorem 6.9. Assume $\kappa \ge \omega$. There is a graph $G$ with
$\chi(G) > \kappa$ on $(2^\kappa)^+$ vertices, such that every subgraph $G'$ with
$2^\kappa$ vertices has chromatic number at most $\kappa$.


Proof. Choose $G$ as $\mathrm{Sh}_2((2^\kappa)^+, <)$. By Theorem 6.8, we have
$\chi(G) > \kappa$. To prove the second property of $G$ we need the following
lemma.


Lemma 6.10. Assume $(A, <)$ is an ordered set, and $|A| \le 2^\kappa$ for some
$\kappa \ge \omega$. Then $\mathrm{Sh}_2(A, <)$ has chromatic number at most
$\kappa$.


For the proof of this lemma we need another lemma. Let $\mathcal{F}$ be a family of
sets. $\mathcal{F}$ is said to be a Sperner family if $A \not\subset B$ for any
pair $A \ne B \in \mathcal{F}$.


Lemma 6.11. Assume $\kappa \ge \omega$. Then there is a Sperner family
$\mathcal{F}$, $|\mathcal{F}| = 2^\kappa$, consisting of subsets of $\kappa$.


Proof. It is sufficient to prove this lemma choosing any underlying set of size
$\kappa$. Hence, let $X = \kappa \times 2$. For any $A \subset \kappa$, let
$F_A = \{(\alpha, 1) : \alpha \in A\} \cup \{(\alpha, 0) : \alpha \in \kappa \setminus A\}$.
It is clear that $\mathcal{F} = \{F_A : A \subset \kappa\}$ is a Sperner family of
size $2^\kappa$ of subsets of $X$. □


Proof of Lemma 6.10. Let $\{F_a : a \in A\}$ be a one-to-one enumeration of some
Sperner family of size $\le 2^\kappa$ of subsets of $\kappa$. For
$\{a, b\} \in [A]^2$, $a<b$, define $f(\{a, b\}) = \min(F_a \setminus F_b)$. Then
$f$ is a coloring of the vertex set of $\mathrm{Sh}_2(A, <)$ with $\kappa$ colors.
We prove that $f$ is a good coloring. Let $\{a, b\}, \{b, c\}$ with $a<b<c$,
$a, b, c \in A$, be an edge of the shift graph $\mathrm{Sh}_2(A, <)$. Then
$f(\{a, b\}) \notin F_b$ and $f(\{b, c\}) \in F_b$, hence
$f(\{a, b\}) \ne f(\{b, c\})$. □
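
The coloring used in this proof can be checked mechanically in a finite analogue. The sketch below makes the following assumptions: $\kappa$ is replaced by a finite number $n$, the Sperner family is the one built in the proof of Lemma 6.11 on the ground set $n \times 2$, and the ordered set is $\{0, \ldots, 2^n - 1\}$ with each point $a$ enumerating one member $F_a$ of the family. The helper names `sperner_family` and `shift_coloring` are illustrative only.

```python
from itertools import combinations


def sperner_family(n):
    """The 2**n sets F_S = {(i, 1): i in S} | {(i, 0): i not in S}, S a subset of range(n)."""
    family = []
    for mask in range(2 ** n):
        family.append(frozenset((i, (mask >> i) & 1) for i in range(n)))
    return family


def shift_coloring(family):
    """Color the vertex {a, b}, a < b, of Sh_2 by the least element of F_a minus F_b."""
    m = len(family)
    return {
        (a, b): min(family[a] - family[b])
        for a, b in combinations(range(m), 2)
    }
```

With $n = 3$ this colors the 2-shift-graph on $2^3 = 8$ points using at most $2n = 6$ colors taken from the ground set, and every edge $\{a, b\}, \{b, c\}$ with $a<b<c$ receives two different colors, exactly as in the argument above.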


We now define the generalized Specker graphs $\mathrm{Sp}_{r,s}(\kappa)$ for
$r \ge 3$, $s<r$, and $\kappa \ge \omega$. The vertex set of
$\mathrm{Sp}_{r,s}(\kappa)$ is the set of increasing sequences of length $r$ with
values from $\kappa$. This will be denoted by $V_r(\kappa)$. Let
$x = (x_0, \ldots, x_{r-1})$ denote the general element of $V_r(\kappa)$. Let
$x, y \in V_r(\kappa)$, $x_0 < y_0$; $x$ and $y$ are joined in
$\mathrm{Sp}_{r,s}(\kappa)$ if


$$x_s < y_0 < x_{s+1} < y_1 < \cdots < x_{r-1} < y_{r-1-s}.$$

Theorem 6.12. Assume $\kappa > \omega$ is regular, $i \in N$. Then
$\mathrm{Sp}_{2i^2+1,\,i}(\kappa)$ is a graph of cardinality and chromatic number
$\kappa$ with odd girth at least $2i+3$.


Proof (Sketch). As to the odd girth of the graph, for $i = 1$ it is very easy to
see that $\mathrm{Sp}_{3,1}(\kappa)$ does not contain a triangle. Indeed, if
$x, y, z$ are three vertices with $x_0 < y_0 < z_0$ and $x$ is adjacent to $y$ and
$z$, then both $y_0$ and $z_0$ are in the interval $(x_1, x_2)$; hence $y$ and $z$
are not joined. This was Specker's original idea.

The general statement is a cumbersome exercise in finite combinatorics and is
left to the reader.

The claim for the chromatic number follows from the following facts. We define
full size subsets of $V_r$ for $r \in N$. $A \subset V_1$ is full size if
$|A| = \kappa$. $A \subset V_{r+1}$ is full size if


$$\{x_0 \in \kappa : \{(x_1, \ldots, x_r) \in V_r : (x_0, \ldots, x_r) \in A\} \text{ is full size in } V_r\}$$


has cardinality $\kappa$.
Prove first that for every coloring of $V_r$ with fewer than $\kappa$ colors there
is a monochromatic full size subset $A$ of $V_r$. Finally prove that if $A$ is a
full size subset of $V_r$, then there are
$(x_0, \ldots, x_{r-1}), (y_0, \ldots, y_{r-1}) \in A$ for any prescribed pattern
$x_0 < \cdots < x_s < y_0 < x_{s+1} < y_1 < \cdots$.


Appendix 


$\{x\}$, $\{x, y\}$, $\{x, y, z\}$, etc., denote sets with elements $x$; $x$ and
$y$; $x$, $y$ and $z$; and so on, respectively. $(x, y) = \{\{x\}, \{x, y\}\}$,
$(x, y, z) = ((x, y), z)$, etc., are ordered pairs and ordered triples,
respectively. $\{x \in A : \Phi(x)\}$ denotes the set of elements of $A$ having
property $\Phi$. Relations and functions are sets consisting of ordered pairs.
$\mathrm{Dom}(A)$ and $\mathrm{Ran}(A)$ are the sets of first and second elements
of ordered pairs contained in $A$, called the domain and the range of a function,
respectively.

The sets A and B are equipollent if there is a one-to-one mapping f of A onto 
B. We write A ~ B in this case. 

The pair $(A, <)$ is an ordered set if $<$ is an irreflexive and transitive
relation on $A$ so that exactly one of $a<b$, $a=b$, $b<a$ always holds. An ordered
set $(A, <)$ is well-ordered if every non-empty subset of $A$ has a $<$-minimal
element. The ordered sets $(A, <)$ and $(A', <')$ are isomorphic or similar if
there is a one-to-one mapping of $A$ onto $A'$ which is strictly monotonic. We
write $(A, <) \cong (A', <')$ in this case.

We will need some specific sets called ordinals. A set $A$ is called an ordinal
number, or an ordinal for short, if $A$ consists of sets, every element of $A$ is a
subset of $A$, and $\in$ well-orders $A$. The last statement means more precisely
the following. If $\in_A = \{(x, y) : x, y \in A$ and $x \in y\}$, then
$(A, \in_A)$ is a well-ordered set. For example, it is easy to see that
$\emptyset$, $\{\emptyset\}$, $\{\emptyset, \{\emptyset\}\}$,
$\{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}, \ldots$ are ordinals.

In what follows $\alpha, \beta, \gamma, \ldots, \nu, \xi, \zeta$ run over ordinals.
Define $\alpha<\beta$ by $\alpha \in \beta$. The following are easy consequences of
the definitions.

Corollary A.1. $<$ is a well-ordering of the ordinals, every element of an ordinal
is an ordinal, each ordinal is the set of all smaller ordinals,
$\alpha \le \beta$ is equivalent to $\alpha \subseteq \beta$, and $\alpha<\beta$ is
equivalent to $\alpha \subsetneq \beta$.


For $\alpha$ an ordinal, define $\alpha + 1 = \alpha \cup \{\alpha\}$.
$\alpha + 1$ is the smallest ordinal greater than $\alpha$. $\beta$ is a successor
ordinal if $\beta = \alpha + 1$ for some $\alpha$. A non-successor ordinal
different from 0 is a limit ordinal. An ordinal $\beta$ is finite if it is not a
limit ordinal and the same holds for every element of $\beta$. The set of all
finite ordinals is denoted by $\omega$. We identify the non-zero elements of
$\omega$ with the natural numbers:


$$\omega = N \cup \{0\}.$$


The reader should be warned that we only invoke the above convention if it is 
suitable for the particular purpose at hand. 
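
The finite ordinals can be modelled literally as hereditarily finite sets, which gives a quick way to see the statements of Corollary A.1 at work. The following is a small sketch, assuming Python's `frozenset` as the model of a set; the function names `successor` and `finite_ordinal` are illustrative and not part of the text.

```python
def successor(alpha):
    """The ordinal alpha + 1 = alpha united with {alpha}."""
    return alpha | frozenset([alpha])


def finite_ordinal(n):
    """The n-th finite ordinal, built from the empty set by n successor steps."""
    alpha = frozenset()
    for _ in range(n):
        alpha = successor(alpha)
    return alpha
```

In this model the ordinal $n$ has exactly $n$ elements, namely the ordinals $0, \ldots, n-1$, and for finite ordinals $\alpha<\beta$ holds precisely when $\alpha \in \beta$, which in turn holds precisely when $\alpha$ is a proper subset of $\beta$.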

A set is finite if it is equipollent with a finite ordinal. It can be proved (using
the axiom of choice) that a set is finite if and only if it is not equipollent to
any proper subset of itself.

The next theorem describes the concept of cardinality. 


Theorem A.2. There is an operation associating with each set $A$ an ordinal $|A|$,
called the cardinality of $A$, such that:


$|A|$ is the smallest ordinal equipollent with $A$.


The possible values of $|A|$ are the cardinals.

The cardinalities of finite sets are the finite cardinals, the rest are the
infinite cardinals. By Corollary A.1, the finite cardinals are the non-negative
integers. $\omega$ is the smallest infinite cardinal. A set $A$ is called countable
if $|A| \le \omega$. $i, j, k, l, m, n, r, s, t$ run over non-negative integers.
$\kappa, \lambda, \rho, \sigma, \tau$ run over cardinals.

The next theorem describes the concept of order type. 


Theorem A.3. There is an operation associating to each ordered set $(A, <)$ an
object $\mathrm{typ}\, A(<)$, called the order type of $A$ in the ordering $<$,
such that for every pair $(A, <)$, $(A', <')$ of ordered sets,
$(A, <) \cong (A', <')$ if and only if
$\mathrm{typ}\, A(<) = \mathrm{typ}\, A'(<')$, and for every well-ordered set
$(A, <)$, $\mathrm{typ}\, A(<)$ is the unique ordinal $\alpha$ with


$$(A, <) \cong (\alpha, \in_\alpha).$$


Note that cardinals being ordinals, have an ordering inherited from the 
ordering of ordinals. The following result shows that this is the same as the 
traditional ordering. 


Theorem A.4. For any sets $A$, $B$, $|A| < |B|$ if and only if $A$ is equipollent
to a subset of $B$, but $A$ and $B$ are not equipollent.

Addition, multiplication, and exponentiation of non-negative integers can be
extended to infinite cardinals in such a way that:


Fact A.5. For any pair of disjoint sets $A$ and $B$, $|A \cup B| = |A| + |B|$. For
any pair of sets $A$ and $B$, $|A| \cdot |B| = |A \times B|$ and
$|{}^{B}A| = |A|^{|B|}$, where $A \times B$ is the Cartesian product of the sets
$A$ and $B$, and ${}^{B}A = \{f : f$ is a function mapping $B$ into $A\}$.


More generally: 


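Fact A.6. For any sequence $(A_\alpha : \alpha<\beta)$ of pairwise disjoint sets,
$|\bigcup_{\alpha<\beta} A_\alpha| = \sum_{\alpha<\beta} |A_\alpha|$, and
$|\mathsf{X}_{\alpha<\beta} A_\alpha| = \prod_{\alpha<\beta} |A_\alpha|$ for any
sequence of sets $(A_\alpha : \alpha<\beta)$.


Here $\mathsf{X}_{\alpha<\beta} A_\alpha$ is the Cartesian product of the sets
$A_\alpha$, consisting of all choice functions for the family
$(A_\alpha : \alpha<\beta)$, i.e.,


$$f \in \mathsf{X}_{\alpha<\beta} A_\alpha$$


if and only if $\mathrm{Dom}(f) = \beta$ and $f(\alpha) \in A_\alpha$ for all
$\alpha<\beta$.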
The next result is the fundamental theorem of cardinal arithmetic. 


Theorem A.7. $\kappa^2 = \kappa$ for all $\kappa \ge \omega$.


This result has the following corollaries:
$\kappa\lambda = \kappa + \lambda = \max\{\kappa, \lambda\}$ provided one of the
cardinals $\kappa$, $\lambda$ is infinite, and $\kappa^{<\omega} = \kappa$ for
$\kappa \ge \omega$, i.e., the set of all finite sequences formed from a set of
cardinality $\kappa$ is of cardinality $\kappa$.

$\kappa^+$ denotes the smallest cardinal greater than $\kappa$. Hence
$\kappa^+ = \kappa + 1$ for finite $\kappa$. Theorem A.7 easily implies that for
$\kappa \ge \omega$


$$\kappa^+ = |\{\alpha : |\alpha| = \kappa\}|,$$


i.e., $\kappa^+$ is the cardinality of the set of ordinals having cardinality
$\kappa$.

It is easy to see that for a set $A$ of ordinals,
$\bigcup A = \bigcup\{\alpha : \alpha \in A\} = \bigcup_{\alpha \in A} \alpha$ is
the smallest ordinal which is greater than or equal to each element of $A$. Hence
$\bigcup A$ is also denoted by $\sup A$. It is also easy to see that if $A$
consists of cardinals, then $\sup A$ is a cardinal, and as a consequence of
Theorem A.7:

$$\sup A = \sum_{\lambda \in A} \lambda \quad \text{provided } \sup A \text{ is infinite.}$$


As to the multiplication, the situation is quite different, as proved by J. König.

Theorem A.8. Let $\lambda_\beta < \kappa_\beta$ for $\beta<\alpha$. Then


$$\sum_{\beta<\alpha} \lambda_\beta < \prod_{\beta<\alpha} \kappa_\beta.$$


Choosing $\lambda_\beta = 1$, $\kappa_\beta = 2$ we see that this is a
generalization of Cantor's classical result $|\alpha| < 2^{|\alpha|}$, and it can
be proved by the same diagonal method.

The following concepts are fundamental for investigating cardinal arithmetic.
Let $(A, <)$ be an ordered set, $B \subset A$. $(A, <)$ is cofinal with $B$ if for
all $a \in A$ there is a $b \in B$ with $a \le b$. For example, $(R, <)$ is cofinal
with $N$. The following is a basic theorem due to Hausdorff.


Theorem A.9. Let $(A, <)$ be an ordered set. There is a subset $B \subset A$ such
that $B$ is well-ordered in the ordering inherited from $A$, $A$ is cofinal with
$B$, and


$$\mathrm{typ}\, B(<) \le |A|.$$


On the basis of this theorem one can define $\mathrm{cf}(A, <)$, the cofinality of
an ordered set $(A, <)$, as the smallest ordinal $\alpha$ such that $(A, <)$ has a
cofinal subset of type $\alpha$. It is a useful fact to know that if $B$ is a
cofinal subset of $(A, <)$, then $\mathrm{cf}(A, <) = \mathrm{cf}(B, <)$. Since the
cofinality of similar ordered sets is the same, $\mathrm{cf}(\Theta)$, the
cofinality of the order type $\Theta$, can be defined. When $\Theta$ is an infinite
cardinal, the cofinality has the following characterization.


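Theorem A.10. For $\kappa \ge \omega$, $\mathrm{cf}(\kappa)$ is the smallest cardinal $\lambda$ such that $\kappa = \sum_{\beta < \lambda} \kappa_\beta$
for a sequence of cardinals $(\kappa_\beta : \beta < \lambda)$ with $\kappa_\beta < \kappa$ for $\beta < \lambda$, i.e., $\kappa$ is the sum of $\lambda$
many cardinals all of which are smaller than $\kappa$.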


Since $\kappa = \sum_{\beta < \kappa} 1$, we clearly have $\mathrm{cf}(\kappa) \le \kappa$. Let $\kappa$ be an infinite cardinal. $\kappa$ is
called regular if $\mathrm{cf}(\kappa) = \kappa$; otherwise it is singular. For example, $\omega$ is a regular
cardinal, since the sum of finitely many finite numbers is finite. It is an easy
theorem that for an arbitrary order type $\theta \ne 0$ either $\mathrm{cf}(\theta) = 1$ or $\mathrm{cf}(\theta)$ is a regular
cardinal. A cardinal $\kappa$ is a successor cardinal if $\kappa = \lambda^+$ for some $\lambda$; otherwise it is
a limit cardinal. It follows from Theorem A.7 that every infinite successor cardinal
is regular. $\omega$ is a regular limit cardinal. A regular limit cardinal is called
inaccessible. This expression was introduced since, by definition, these cardinals
cannot be obtained from smaller cardinals using the successor operation and
addition. If, in addition, this holds for the exponentiation too, i.e., if $2^\lambda < \kappa$ for
all $\lambda < \kappa$ for an inaccessible cardinal $\kappa$, then $\kappa$ is called strongly inaccessible.
Clearly, $\omega$ satisfies this requirement as well.

It cannot be proved from the axioms that there are inaccessible cardinals
different from $\omega$. The assumption that such cardinals exist is an extension of the
axiom-system of set theory, and is called a large cardinal axiom. This is discussed
further in section 5.

Induction and recursion can be generalized for ordinals. The following is a 
formal statement of this fact. 


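Theorem A.11. (i) Let $\Phi(\alpha)$ be a property of ordinals. Assume that for each $\alpha$, if
$\Phi(\beta)$ holds for all $\beta < \alpha$, then $\Phi(\alpha)$ is true as well. Then $\Phi(\alpha)$ holds for every $\alpha$.

(ii) Assume $G(A)$ is an operation associating sets to sets. Then there exists an
operation $\mathcal{F}(\alpha)$ defined uniquely for all ordinals $\alpha$ such that

$$\mathcal{F}(\alpha) = G(\mathcal{F} \restriction \alpha) \quad \text{for all } \alpha .$$

Here $\mathcal{F} \restriction \alpha$ is the restriction of $\mathcal{F}$ to $\alpha$, i.e., $\mathrm{Dom}(\mathcal{F} \restriction \alpha) = \alpha$ and
$(\mathcal{F} \restriction \alpha)(\beta) = \mathcal{F}(\beta)$ for each $\beta < \alpha$.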

Using Theorem A.11, one can define an operation on ordinals listing all infinite
cardinals in increasing order. Cantor introduced this operation, and denoted it by
$\aleph$ (aleph), the first letter of the Hebrew alphabet. So $\aleph_0, \aleph_1, \ldots, \aleph_\alpha, \ldots$ list all
infinite cardinals in increasing order. It is clear from the definitions that $\aleph_\alpha$ is a
successor cardinal if and only if $\alpha$ is a successor ordinal, e.g., $\aleph_0, \aleph_1, \ldots, \aleph_n, \ldots$
are regular cardinals and $\aleph_\omega$ is a singular cardinal with $\mathrm{cf}(\aleph_\omega) = \aleph_0 = \omega$. Note that
many papers use $\omega_\alpha$ as an alternative notation for $\aleph_\alpha$, and we will follow this
practice.

One of the few old results on cardinal exponentiation is the following theorem, 
successively developed by Bernstein, Hausdorff and Tarski. 


a? 


Theorem A.12. Assume $\kappa \ge \omega$, $0 < \lambda < \mathrm{cf}(\kappa)$. Then


a(S he. 


T<K,7 a cardinal 


The continuum hypothesis is the assumption that $2^{\aleph_0} = \aleph_1$. This will be denoted
by CH. The generalized continuum hypothesis (GCH) is the assumption that
$2^\kappa = \kappa^+$ holds for all infinite $\kappa$, or equivalently, $2^{\aleph_\alpha} = \aleph_{\alpha+1}$ holds for all $\alpha$. It is
well known that both are consistent with the axioms of set theory and neither of
them can be proved from the system, provided the system itself is consistent.

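For the time being, we only mention that if the GCH is assumed, we can
compute the cardinal power $\kappa^\lambda$ for all infinite $\kappa$. Using Theorems A.8 and A.12,
it is easy to see that

$$\begin{aligned}
\kappa^\lambda &= \kappa   && \text{for } 0 < \lambda < \mathrm{cf}(\kappa), \\
\kappa^\lambda &= \kappa^+ && \text{for } \mathrm{cf}(\kappa) \le \lambda \le \kappa, \\
\kappa^\lambda &= \lambda^+ && \text{for } \lambda \ge \kappa.
\end{aligned} \tag{A.13}$$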
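As a quick check of how (A.13) is read, the three cases can be worked on small alephs. The particular cardinals below are chosen here for illustration and are not taken from the text; the GCH is assumed throughout.

```latex
% Worked instances of (A.13), assuming the GCH.
% The cardinals are illustrative choices, not examples from the source.
\begin{align*}
\aleph_1^{\aleph_0} &= \aleph_1
  && \text{since } 0 < \aleph_0 < \operatorname{cf}(\aleph_1) = \aleph_1, \\
\aleph_\omega^{\aleph_0} &= \aleph_\omega^{+} = \aleph_{\omega+1}
  && \text{since } \operatorname{cf}(\aleph_\omega) = \aleph_0 \le \aleph_0 \le \aleph_\omega, \\
\aleph_0^{\aleph_1} &= \aleph_1^{+} = \aleph_2
  && \text{since } \aleph_1 \ge \aleph_0 .
\end{align*}
```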


For a long time it was thought that Theorems A.8 and A.12 contain all 
information for cardinal exponentiation one can obtain, without additional 
assumptions. This is not so, and there are quite a few additional non-trivial 
inequalities. We refer the reader to Silver (1975), Galvin and Hajnal (1975), and 
Shelah (1982). 

We will assume that the reader is familiar with all forms of Zorn’s lemma, and
the well-ordering theorem. We omit giving a list of them. We only state an
important consequence. Let an underlying set $X$ be fixed. A family $\mathcal{F}$ of subsets
of $X$ is said to have the finite intersection property if $\bigcap \mathcal{F}' \ne \emptyset$ for all finite
$\mathcal{F}' \subseteq \mathcal{F}$. A family $U$ of subsets of $X$ is a filter if $U \ne \emptyset$, $\emptyset \notin U$, $A \in U$ and
$A \subseteq B$ imply $B \in U$, and $U$ is closed with respect to finite intersections. A filter $U$
is an ultrafilter if for each $A \subseteq X$ either $A$ or $X \setminus A$ belongs to $U$.

Theorem A.14. If $\mathcal{F}$ is a family of subsets of $X$ having the finite intersection
property, then $\mathcal{F}$ extends to an ultrafilter in $X$.


It is easy to see that any extension of $\mathcal{F}$, which is maximal for having the finite
intersection property, satisfies the requirements of Theorem A.14.

An ultrafilter is free or non-principal, if $\{x\} \notin U$ for every $x \in X$. Since the
cofinite sets of an infinite set $X$ form a filter, it follows from Theorem A.14 that
every infinite set carries a free ultrafilter.

Finally, we will need an important tool of elementary set theory which is
probably less well known than the ones listed above.


Definition A.15. Let $\kappa > \omega$ be a regular cardinal. $B \subseteq \kappa$ is a club-set (in $\kappa$) if $\kappa$ is
cofinal with $B$, and $B$ is closed in the order topology of $\kappa$, i.e., for all $B' \subseteq B$,
$B' \ne \emptyset$, if $B' \subseteq \alpha$ for some $\alpha < \kappa$, then $\sup B' \in B$. The term “club” comes from closed
unbounded.


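For each $\alpha < \kappa$, $\kappa \setminus \alpha$ is a club-set. However, there are club-sets with a
complement of size $\kappa$. For example, $\{\alpha < \kappa : \alpha \text{ is a limit ordinal}\}$ is such a set.
Still, club-sets are very large as is shown by:


Lemma A.16. If $\mathcal{F}$ is a family of club-sets in $\kappa$ and $|\mathcal{F}| < \kappa$, then $\bigcap \mathcal{F}$ is a club-set
in $\kappa$ for $\kappa = \mathrm{cf}(\kappa) > \omega$.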


By this lemma, the subsets of $\kappa$ which contain a club-set form a filter, and the
intersection of fewer than $\kappa$ sets in this filter belongs to the filter. We call such
filters $\kappa$-complete. We think of sets in the filter as large sets and their complements
as small sets.

$A \subseteq \kappa$ is stationary in $\kappa$ if $A \cap B \ne \emptyset$ for all club-sets $B$ in $\kappa$. By Lemma A.16,
all club-sets are stationary. Stationary subsets of $\kappa$ are those which are not small,
but also not necessarily large. We need a characterization of these sets.

Let $A \subseteq \kappa$ and $f$ be a function such that $A \subseteq \mathrm{Dom}(f)$. Then $f$ is regressive or
pressing down on $A$ if $f(\alpha) < \alpha$ for $\alpha \in A$ and $\alpha \ne 0$.


Lemma A.17. $A \subseteq \kappa$ is stationary in $\kappa$ if and only if for all regressive functions $f$ on
$A$, there is a $\beta < \kappa$ with $|f^{-1}(\{\beta\}) \cap A| = \kappa$, i.e., every regressive function on $A$
takes one of its values on $A$ $\kappa$ many times.


The following is the main tool for handling these concepts (see Fodor 1956).

Theorem A.18. Let $\kappa > \omega$ be regular. $A \subseteq \kappa$ is stationary in $\kappa$ if and only if for
every regressive function $f$ on $A$, the set $f^{-1}(\{\beta\}) \cap A$ is stationary in $\kappa$ for some
$\beta < \kappa$, i.e., every regressive function on $A$ takes one of its values “stationary many
times”.


We will use the following elementary fact. 


Fact A.19. Assume $\lambda < \kappa$ are regular cardinals. Then the set $A_{\kappa,\lambda} = \{\alpha < \kappa :
\mathrm{cf}(\alpha) = \lambda\}$ is a stationary subset of $\kappa$.


As we have already mentioned, stationary sets are not necessarily large. There
are stationary sets whose complement is stationary. For example, by Fact A.19,
for $\kappa = \omega_2$, $A_{\kappa,\omega}$ and $A_{\kappa,\omega_1}$ are both stationary.

It is somewhat harder to find two disjoint stationary subsets of $\omega_1$, since in this
case the above simple-minded argument does not work. In general, the following
result of Solovay (1971) holds.


Theorem A.20. Assume $\kappa > \omega$ is regular, and $A \subseteq \kappa$ is stationary in $\kappa$. Then $A$ is
the disjoint union of $\kappa$-many sets, each stationary in $\kappa$.


We are not going to use this result, which is not easy to prove. We only stated it
to show that Lemma A.16 cannot be generalized to stationary sets. I want to
emphasize the importance of the concepts defined in A.15 once more. Countably
complete filters having a natural definition played an important role in the
development of various branches of mathematics. Subsets of Lebesgue measure 1
of [0, 1], and comeager sets are the prime examples. Club-sets and stationary sets
play a role in set theory similar to the role of Lebesgue measure in the theory of
real functions.

Combinatorial games
1. Introduction 


We only skim the surface of this vast subject. For more breadth, depth and detail, 
consult both of the following books: Winning Ways for your Mathematical Plays 
by Berlekamp et al. (1982) and On Numbers and Games by Conway (1976) which 
we will frequently refer to as WW and ONAG, respectively. 

Two other surveys are by Fraenkel (1980), who considers the complexity of 
games, and Guy (1983), who explores the connexions between games and graphs. 

Fraenkel contrasts Nim with Go, the former having a very simple winning 
strategy and the latter very complicated. Nim has no cycles in its game graph, no 
interaction between tokens, and is impartial; Go has cycles and interaction, and is 
partizan. The spectrum between the two games spans the complexity gap between 
polynomial, P-space-complete and Exptime-complete games. In existential prob- 
lems such as the travelling salesman problem, high complexity is a liability, but in 
games and cryptanalysis, it can be an asset. 

Fraenkel also maintains a valuable bibliography of the subject, copies of which
may be obtained from him at the Weizmann Institute, Rehovot, Israel.

Guy surveys the connexions between combinatorial game theory and graph 
theory: graphs of games; games on graphs (Hackenbush, von Neumann’s game,
Rims, Rails, Lucasta, Sprouts); the ways graphs can be used to elucidate puzzles 
(Tantalizer, Rubik’s Cube, Fifteen Puzzle, magic squares); and the occurrence of 
Euler’s formula in Berlekamp’s analysis of Dots-and-Boxes (WW, pp. 507-550). 


2. What is a game? (WW, pp. 16-17) 


Our games, unlike those that have found application in economics, management, 
and military strategy, are completely determined. There is complete information: 
the players know exactly what is going on; there is no bluffing. There are no 
chance moves: no dealing of cards; no rolling of dice. We have only two players, 
Left and Right: there can be no question of coalitions. 

For each given position in a game, the rules define two sets of options, available
respectively to Left and to Right, leading to other positions. Left and Right 
alternately choose an option, i.e., play to a position, specified by the rules. 


3. Game graphs and trees 


A game may be visualized as a digraph: the nodes are the positions and the arcs
are the options. The arcs may be thought of as colored, say bLue, Red, or grEen,
according as the option is available to Left only, to Right only, or to Either player.

Alternatively, we may distinguish between different plays of the game, i.e.,
different dipaths in the digraph, by duplicating the nodes as necessary and
representing the game by a rooted tree. The root is the starting position and the
arcs are directed away from the root.

Figures 1 and 2 show the game graph and the game tree for the position {3, 2}
in a game of Nim: two heaps, one with three beans, the other with two. Nim is an
example of an impartial game, in every position of which the same set of options
is available to either player: think of the arcs in figs. 1 and 2 as being colored
green. Nim is played with a number of heaps of beans. The typical option, for
either player, is to choose a heap and remove from it as many beans as you wish:
the whole heap maybe, but at least one bean.

Figure 1. The game graph for the Nim position {3, 2}. From the left-most position the next player can
win by adopting the strategy indicated by the heavy arrows.

Figure 2. The game tree for the same Nim position. The root is {3, 2}, and the arcs are directed
upwards.

Notice the difference between the complete analysis of a game, and a winning 
strategy. Figure 2 is a complete analysis: for a winning strategy it suffices to 
describe the four black arrows.

You may mentally identify three seemingly different aspects of the same idea. 

(a) A game, i.e., the whole digraph or tree representing the game. For
example, the Game of Chess, as opposed to a game of chess, which we refer to as 
a play of the game: compare le jeu d’échecs, une partie d’échecs. 

(b) A position in a game; a particular node of the digraph, perhaps the root of 
the tree. For example, the standard opening position in Chess, ready for a play of 
the game. 

(c) The ordered pair of sets of options available to the two players from a given 
position, e.g.,  {a3,a4, ... , h3,h4,Sa3,Sc3,Sf3,Sh3 | a6,a5, ... , h6,h5,Sa6,Sc6, 
Sf6,Sh6}. 

A position, such as the rightmost in fig. 1, or any zero in fig. 2, from which 
neither player has any option, is a terminal position, at which the game ends. The 
outcome is then specified by the rules. It may be a win for Left, or a win for
Right, possibly accompanied by some score or payoff. The rules may not specify a 
winner, so that the game may end in a tie. For almost all of this chapter, however, 
we will adopt the normal play convention that the winner is the player who has 
just made the last move: equivalently, last player winning, if you cannot move, 
you lose. Near the end we will say something about the misére play convention, 
which accords the win to a player unable to move: last player losing. Analysis is 
far more difficult in this case. 

To ensure that we have a last player, our games must end. We assume that they 
satisfy the ending condition: that there is no infinite sequence of options. Notice 
that this condition prohibits all infinite sequences, not merely those in which Left 
and Right make alternate moves. In order to give values to our games, we need to 
consider the possibility of several consecutive moves by the same player. This can 
occur in the play of the sum of two or more games, as we shall see. 

A game that does not satisfy the ending condition is called a loopy game. Its 
digraph will contain a directed circuit or an infinite directed path. The outcome 
may be a draw: note that we distinguish between a tied game and one drawn out 
by infinite play. Chess exhibits both kinds of outcome: stalemate is a tie, but
perpetual check, repetition of moves, or insufficient mating material are
equivalent to draws.


4. The formal definition of a game 
This is deceptively simple: each game is an ordered pair of sets of games:

$$G = \{\{G^{L_1}, G^{L_2}, \ldots\} \mid \{G^{R_1}, G^{R_2}, \ldots\}\}$$

To avoid proliferation of braces, we write this more compactly as

$$G = \{G^L \mid G^R\}$$

where we must remember that $G^L$ and $G^R$ are sets of Left and Right options,
which may, for example, be infinite, or empty. Indeed the definition is inductive,
and the empty set is the basis for the induction, which starts with the Endgame

$$\{\emptyset \mid \emptyset\} = \{\ \mid\ \}$$

in which neither player has an option, and which we will denote by 0 (zero).

Here, and from now on, we use familiar symbols, with the strong implication 
that we can manipulate games in the same way that we manipulate numbers in 
ordinary arithmetic. Some games behave like numbers and we call them numbers, 
but to justify the manipulations takes more space than we have here, so turn to 
(ONAG, pp. 71-96) if you would like more detail and further examples. 

It is helpful to attach ordinal numbers, or birthdays, to games, and to introduce
the idea of simplicity (WW, pp. 23-27). When a move is made in a game, it 
becomes simpler in the sense that we arrive at a position with an earlier birthday. 
All definitions and proofs are inductive in that they are assumed to have been 
made for all simpler games. The basis is the simplest game of all, the Endgame, 
born on day zero. 

On day one we have two sets, the empty set and the set {0} consisting of the
Endgame, so that we can visualize 2^2 = 4 games. Their game trees (in which Left
moves slope up to the left and Right moves slope up to the right) together with
their names are shown in fig. 3.


0 = { | }     1 = {0 | }     -1 = { | 0}     * = {0 | 0}


Figure 3. The four simplest games, born on days zero and one.


We quote from (ONAG, p. 72): 


“The simplest game of all is the Endgame, 0. I courteously offer you the first move in
this game, and call upon you to make it. You lose, of course, because 0 is defined as
the game in which it is never legal to make a move.

In the game 1 = {0 | }, there is a legal move for Left, which ends the game, but at
no time is there any legal move for Right. If I play Left, and you Right, and you have
first move again (only fair, as you lost the previous game) you will lose again, being
unable to move even from the initial position. To demonstrate my skill, I shall now
start from the same position, make my legal move to 0, and call upon you to make
yours.

Of course you are now beginning to suspect that Left always wins, so for our next
game, -1, you may play as Left and I as Right! For the last of our examples, the new
game * = {0 | 0}, you may play whichever role you wish, provided that for this
privilege you allow me to play first.”
In summary:
The Endgame is the prototype of games in which the next player loses, since no
option is available: a second-player win.
The game 1 is a Left win, no matter who starts: if Louise starts, she goes to
{ | } = 0 and Richard has no option and loses; if Richard starts, he has no option
and loses even more quickly.
The game -1 is a Right win, no matter who starts.
The game {0|0} = * (“Star”) is the simplest game which is not a number (WW,
p. 40). It is a first-player win.


5. The four outcome classes 


If we adopt the normal play convention, every game belongs to just one of four 
outcome classes (ONAG, Theorem 50) which are exemplified by the four games
we have just seen. The terminology and notation are displayed in fig. 4. 


                                  If, in a game G, Right starts
                                  & L has a winning     & R has a winning
                                  strategy              strategy

Left starts
  & R has a winning strategy      ZERO                  NEGATIVE
                                  G = 0                 G < 0
                                  2nd wins              R wins

  & L has a winning strategy      POSITIVE              FUZZY
                                  G > 0                 G || 0
                                  L wins                1st wins


Figure 4. The four outcome classes.


It is convenient to combine these outcome classes and symbols in pairs.


If          Left       Right      Left       Right      has a winning strategy
provided    Right      Left       Left       Right      starts, then we
write       G ≥ 0      G ≤ 0      G |> 0     G <| 0     corresponding to
the         1st col    1st row    2nd row    2nd col    of fig. 4.
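
The classification above can be sketched in a few lines of code. The following is a minimal illustration, not taken from ONAG or WW: a game is modelled as a pair of frozensets of options, and the names `game`, `mover_wins` and `outcome` are hypothetical helpers introduced here. Normal play is assumed, so a player with no option loses.

```python
from functools import lru_cache

def game(left=(), right=()):
    """Build the game {left | right}; every option is itself a game."""
    return (frozenset(left), frozenset(right))

# The four simplest games from the text.
ZERO = game()                           # { | }
ONE = game(left=[ZERO])                 # {0 | }
MINUS_ONE = game(right=[ZERO])          # { | 0}
STAR = game(left=[ZERO], right=[ZERO])  # {0 | 0}

@lru_cache(maxsize=None)
def mover_wins(g, left_to_move):
    """True if the player about to move can force a win under normal play."""
    options = g[0] if left_to_move else g[1]
    # The mover wins iff some option leaves the opponent (now to move) losing.
    return any(not mover_wins(opt, not left_to_move) for opt in options)

def outcome(g):
    """Return the outcome class of g, using the names of fig. 4."""
    left_first = mover_wins(g, True)    # Left wins when Left starts
    right_first = mover_wins(g, False)  # Right wins when Right starts
    if left_first and right_first:
        return "FUZZY"       # first player wins
    if left_first:
        return "POSITIVE"    # Left wins whoever starts
    if right_first:
        return "NEGATIVE"    # Right wins whoever starts
    return "ZERO"            # second player wins

for name, g in [("0", ZERO), ("1", ONE), ("-1", MINUS_ONE), ("*", STAR)]:
    print(name, outcome(g))
```

The recursion terminates because of the ending condition: every option is a strictly simpler game.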
6. The negative of a game


A device to breathe new life into an otherwise one-sided contest is to allow a
novice opponent, when he feels he is losing, to turn the board around, to reverse
the roles of the two players, to handicap his more skilled adversary, by asking her
to defend what appears to him to be an inferior position. This replaces the game
by its negative. Formally, the negative of G,

$$-G = \{-G^R \mid -G^L\}$$

is defined inductively (WW, p. 35). Remember that $-G^R$, for example, is short
for the set $\{-G^{R_1}, -G^{R_2}, \ldots\}$, whose members are simpler games than $-G$, and
have been defined earlier.


7. Sums of games 


There are many ways of playing two or more games simultaneously, but often the 
most natural is what we call the sum, or disjunctive compound (ONAG, p. 75, 
WW, p. 33). Nim, for example, is the sum of a number of games of one-heap 
Nim. In the sum of two or more component games, the player whose turn it is to 
move selects one component and makes a legal move in it: 


$$G + H = \{G^L + H,\ G + H^L \mid G^R + H,\ G + H^R\}$$

Once again this is an inductive definition: $G^L + H$, for example, represents the set
of options $\{G^{L_1} + H, G^{L_2} + H, \ldots\}$ each of which is a simpler game than $G + H$,
so that addition there is already defined.

It is not hard to see that sums are commutative and associative, that G + 0 = G,
and (ONAG, Theorem 51) that G + (-G) = 0. In that last sentence we have used 
zero in two quite different senses. In G+0=G we intended 0 to mean the 
Endgame, { | }. In G+ (—G) =0 we intended “= 0” to mean “is a zero game”, 
i.e., “belongs to the (very large!) equivalence class of games for which the second 
player has a winning strategy”. Check that 1+ (—1) =0 and * + * = 0, so that we 
can speak of the games 1 + (—1) and * + * as having the same value, 0, as that of 
the Endgame, even though their forms are different. 

More generally, we will say that two games are equivalent, and have the same 
value, and write G = H, if the game G + (—H) is a second-player win. With the 
above definitions of sum, negative and zero, games form a commutative group. 
Moreover, games are partially ordered, and we write G > H just if G - H > 0,
i.e., if Left can win the sum G + (-H), no matter who starts. Our notation is
justified by theorems such as the following, proved in (ONAG, p. 76). If G ≥ 0
and H ≥ 0, then G + H ≥ 0. If H is a zero game (second player wins), then G + H
has the same outcome as G. If H - K is a zero game, then G + H and G + K
have the same outcome.
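
The inductive definitions of negative and sum, and the test "G = H if G + (-H) is a second-player win", translate directly into recursion. The sketch below is illustrative only; the pair-of-frozensets representation and the helper names (`game`, `neg`, `add`, `is_zero`, `equivalent`) are assumptions introduced here, not notation from ONAG or WW.

```python
from functools import lru_cache

def game(left=(), right=()):
    """Build the game {left | right} from iterables of option games."""
    return (frozenset(left), frozenset(right))

@lru_cache(maxsize=None)
def neg(g):
    # -G = { -G^R | -G^L }
    return game((neg(r) for r in g[1]), (neg(l) for l in g[0]))

@lru_cache(maxsize=None)
def add(g, h):
    # G + H = { G^L + H, G + H^L | G^R + H, G + H^R }
    left = [add(gl, h) for gl in g[0]] + [add(g, hl) for hl in h[0]]
    right = [add(gr, h) for gr in g[1]] + [add(g, hr) for hr in h[1]]
    return game(left, right)

@lru_cache(maxsize=None)
def mover_wins(g, left_to_move):
    """True if the player about to move can force a win (normal play)."""
    options = g[0] if left_to_move else g[1]
    return any(not mover_wins(o, not left_to_move) for o in options)

def is_zero(g):
    """A zero game is a second-player win: whoever starts loses."""
    return not mover_wins(g, True) and not mover_wins(g, False)

def equivalent(g, h):
    """G = H (same value) iff G + (-H) is a zero game."""
    return is_zero(add(g, neg(h)))

ZERO = game()
ONE = game(left=[ZERO])
STAR = game(left=[ZERO], right=[ZERO])

print(add(ONE, neg(ONE)) == ZERO)          # compares forms, not values
print(equivalent(add(ONE, neg(ONE)), ZERO))
print(equivalent(add(STAR, STAR), ZERO))
```

The first comparison is on forms and the other two on values, which mirrors the two senses of "0" distinguished in the text.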
As day two dawns, we have four games to play with, and so 2^4 = 16 sets of games.
So there are 16 choices for Left’s options and 16 for Right's, giving a potential of
256 games on day two. However, things are not quite that complicated, in that,
for Left say, some options are clearly preferable to others. The four games born
on day one can be arranged in the lattice (in the poset sense of chapter 8, rather
than the geometrical sense of chapter 19) of fig. 5, in which Left’s preferences are
higher, and Right’s are lower.


Figure 5. The lattice of games born on day one.


The only set of options for which there is any doubt in either player’s mind
about the best move, is the incomparable pair {0, *}. So, for a player's options we
need consider only six possibilities: the empty set, the four singletons, and this
incomparable pair. Among the resulting 6^2 = 36 possibilities for games born on day
two, just 22 are inequivalent and 18 are new. They are shown in fig. 6, the four
quarters of which should be compared with those of fig. 4. These contain the zero
game; six negative games; six positive; and nine fuzzy ones. The six sets of Right
options run from left to right in increasing order of desirability from Right’s point
of view; Left’s correspondingly downwards.


Figure 6. The 22 games born on day two.
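
The count of 22 can be checked by brute force. The sketch below builds all 256 forms whose options are day-one games and groups them by value, using the standard recursive comparison (G ≤ H exactly if no Left option of G is ≥ H and no Right option of H is ≤ G). The representation and the names `game`, `le`, `subsets` and `outcome` are assumptions made for this illustration.

```python
from collections import Counter
from functools import lru_cache
from itertools import combinations

def game(left=(), right=()):
    """Build the game {left | right} from iterables of option games."""
    return (frozenset(left), frozenset(right))

@lru_cache(maxsize=None)
def le(g, h):
    """G <= H: no Left option of G is >= H and no Right option of H is <= G."""
    return (not any(le(h, gl) for gl in g[0])
            and not any(le(hr, g) for hr in h[1]))

def subsets(items):
    """All subsets of items, as tuples."""
    items = list(items)
    return [c for r in range(len(items) + 1) for c in combinations(items, r)]

ZERO = game()
day_one = [ZERO, game([ZERO]), game(right=[ZERO]), game([ZERO], [ZERO])]

# Every choice of Left set and Right set drawn from the day-one games.
forms = [game(l, r) for l in subsets(day_one) for r in subsets(day_one)]

# Keep one representative per value (G = H iff G <= H and H <= G).
classes = []
for g in forms:
    if not any(le(g, rep) and le(rep, g) for rep in classes):
        classes.append(g)

def outcome(g):
    """Outcome class of g, read off from its comparison with 0."""
    ge0, le0 = le(ZERO, g), le(g, ZERO)
    if ge0 and le0:
        return "zero"
    if ge0:
        return "positive"
    if le0:
        return "negative"
    return "fuzzy"

print(len(forms), len(classes))
print(Counter(outcome(g) for g in classes))
```

Grouping greedily against one representative per class is sound because equality of values is an equivalence relation.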


Figure 7. The game trees for the 22 games born on day two.


Figure 7 shows the game trees for these 22 day-two games: the lowest node in
each case is the root, arcs sloping up to the left are blue, those sloping up to the
right are red. The trees for Star and 16 of the day-two games have been
condensed into digraphs in fig. 8; arcs labelled E are green and represent options
available to both Left and Right.

The 22 games are exhibited as a lattice in fig. 9. If two games are connected by 
an arc, or, transitively by a path of arcs, then the higher game has a greater value 
than the lower, as in fig. 5. 


Examples and exercises. 1 + 1 = 2, ½ + ½ = 1, * + * = 0, ↑ = {0|*} = {0,*|*} > 0
(“Up is positive”), ↑* = {0,*|0} = ↑ + * (“Upstar”), 0 || *2 = {0,*|0,*} (“0 is
incomparable with Star-two”), I*¥={1|1}=14+*, {1]0} >, l*= (00, *} {0 
(‘‘Downstar is incomparable with zero”), {1|0}>f*, (1[0}>+*2, (1/O}> = 
{*|O}, (1]O}> pe, C1JO}>+1=f1]-1}, C1JOPO, (LOR, (lp *} >, 
CHley et, Cefp ite. EPO, ey > L, CEO, [IT 


Nie 
Figure 8. Digraphs for Star and 16 day-two games.


Two important ways of classifying games are as hot or cold and as partizan or 
impartial. We will shortly make a brief attempt to distinguish hot from cold. 
Impartial games are those in which the two sets of Left and Right options are the 
same; in partizan games the two sets are different, in general. As illustrative 
examples, we describe two partizan games: Domineering is a hot game; Blue-Red
Hackenbush is a cold one.


9. Domineering 


This is also called Crosscram (ONAG, pp. 74-75 and 120-121, WW, pp. 117-120
and 137-140). Left and Right alternately place dominoes so that they exactly
cover two squares of a checker-board. Left orients her dominoes North-South
and Right puts his East-West. Dominoes must not overlap each other or the edge
of the board. A player loses who can find no appropriately oriented space for a
domino. After a while the available space may separate into disconnected regions,
and the game becomes the sum of smaller games. Many of the games we have
already seen are realized by small Domineering “boards”. Check the values in fig.
10.


Figure 9. The lattice of day-two games.
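
For small boards the Domineering rules can be played out exhaustively. The following sketch computes only the outcome class of a board, not its value; the board is a set of free cells, and `board`, `moves`, `mover_wins` and `outcome` are hypothetical names introduced for this illustration.

```python
from functools import lru_cache

def board(rows, cols):
    """A full rectangular board, as a frozenset of free (row, col) cells."""
    return frozenset((r, c) for r in range(rows) for c in range(cols))

def moves(cells, left_to_move):
    """Yield every board reachable by one legal domino placement."""
    # Left places North-South (two cells in a column), Right East-West.
    dr, dc = (1, 0) if left_to_move else (0, 1)
    for (r, c) in cells:
        if (r + dr, c + dc) in cells:
            yield cells - {(r, c), (r + dr, c + dc)}

@lru_cache(maxsize=None)
def mover_wins(cells, left_to_move):
    """True if the player about to move can force a win (normal play)."""
    return any(not mover_wins(b, not left_to_move)
               for b in moves(cells, left_to_move))

def outcome(cells):
    """Outcome class of a position, using the names of fig. 4."""
    left_first = mover_wins(cells, True)
    right_first = mover_wins(cells, False)
    if left_first and right_first:
        return "FUZZY"
    if left_first:
        return "POSITIVE"
    if right_first:
        return "NEGATIVE"
    return "ZERO"

for rows, cols in [(1, 1), (2, 1), (1, 2), (2, 2)]:
    print(rows, "x", cols, outcome(board(rows, cols)))
```

A region need not be rectangular: any frozenset of cells is a valid position, so disconnected regions are handled without extra code, although treating them as a sum of smaller games would be far more efficient.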


10. Hot and cold games


Domineering is an example of a hot game. These are the interesting games in
which there is an advantage in having the move: the first player wins. If
G = {G^L | G^R}, then the various differences G^L - G and G - G^R are the (Left
and Right) incentives of G. These are always negative if G is a number. Numbers
are cold games and the Number Avoidance Theorem tells you:


Never move in a 
Number, unless there is 
Nothing else to do. 
Figure 10. Small Domineering boards realize all 4 day-one games (top left corner) and 13 day-two
games.


An earnest of the theory of hot games can be found in the work of Milnor 
(1953) and Hanner (1959). For recent developments, see (WW, pp. 141-182) and 
Berlekamp (1988), who is currently generalizing the theory of ‘overheating’ and 
making inroads into the difficult theory of the game of Go. 

A good example of a cold game is given in the next section. 


11. Blue—Red Hackenbush 


This is perhaps best played on a blackboard, using an eraser. Start with a picture, 
for example fig. 11, which is a graph, some of whose nodes are on the ground (the 
dotted line), and whose edges are either blue or red (ONAG, pp. 86-91, WW, pp. 
3-8). A Left or Right move is to delete a blue or red edge, respectively, together 
with any edges of either color which are no longer connected to the ground. 

If Right deletes the dog’s neck, for example, the head disappears as well. If 
Left deletes the body, no other edges disappear, but the picture breaks into the 
sum of two separate pictures. The aim, as usual, is to be the last player, the 
person whose move leaves no edges of the opponent’s color. 


Figure 11. A Blue-Red Hackenbush picture.
12. When is a game a number?


Although the values of Blue-Red Hackenbush pictures may be hard to calculate
(in the technical sense; WW, pp. 210-212), they are all numbers. A game is a
number (ONAG, p. 81) exactly if all its options are numbers and no Left option
is ≥ any Right option. The game ±1 = {1|-1} is not a number, since 1 > -1;
Star is not a number, because 0 ≥ 0; and ↑ = {0|*} is not a number, because * is
not. Examples of numbers are:


0 born on day 0,
1 and -1 born on day 1,
½, -½, 2 and -2 born on day 2,
¼, -¼, ¾, -¾, 1½, -1½, 3 and -3 born on day 3.


On day ω all the remaining real numbers are born, as well as the first infinite
ordinals, ω = {0, 1, 2, ... | } and -ω = { | 0, -1, -2, ...} and infinitesimals
such as 1/ω = {0 | 1, ½, ¼, ...}.
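
The criterion for a game to be a number is itself recursive and can be tested mechanically on finite games. The sketch below is an illustration under stated assumptions: games are pairs of frozensets, `le` is the standard recursive comparison, and the names `game`, `le` and `is_number` are introduced here rather than taken from ONAG.

```python
from functools import lru_cache

def game(left=(), right=()):
    """Build the game {left | right} from iterables of option games."""
    return (frozenset(left), frozenset(right))

@lru_cache(maxsize=None)
def le(g, h):
    """G <= H: no Left option of G is >= H and no Right option of H is <= G."""
    return (not any(le(h, gl) for gl in g[0])
            and not any(le(hr, g) for hr in h[1]))

@lru_cache(maxsize=None)
def is_number(g):
    """All options are numbers and no Left option is >= any Right option."""
    options_ok = all(is_number(x) for x in g[0] | g[1])
    no_clash = not any(le(gr, gl) for gl in g[0] for gr in g[1])
    return options_ok and no_clash

ZERO = game()
ONE = game([ZERO])
MINUS_ONE = game(right=[ZERO])
STAR = game([ZERO], [ZERO])
HALF = game([ZERO], [ONE])                  # {0 | 1}
PLUS_MINUS_ONE = game([ONE], [MINUS_ONE])   # {1 | -1}
UP = game([ZERO], [STAR])                   # {0 | *}

for name, g in [("0", ZERO), ("1/2", HALF), ("*", STAR),
                ("+-1", PLUS_MINUS_ONE), ("up", UP)]:
    print(name, is_number(g))
```

Note that the test is applied to the form as given; a game that merely equals a number but is written with non-number options would be rejected by this check.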

Values of games may be thought of as “number of moves advantage to Left’’. 
For example, -2 is two moves advantage to Right. The first four Blue-Red
Hackenbush values in fig. 12 are clear. Deletion of the blue edge in the fifth
reduces the picture to 0, while deletion of the red edge leaves 1, so the fifth value
is {0|1} = ½.

Check that if you play a game comprising two copies of this and a single 
separate red edge, fig. 13, then the second player wins. (Although if Left starts, 
Right can make a bad reply!) Check the remaining values in fig. 12. 


Figure 12. Values of some Blue-Red Hackenbush strings.


Figure 13. 
Not only is the value of every Blue-Red Hackenbush picture a number, but
every number can be represented by a Hackenbush string! To see this, work
backwards from Berlekamp’s Rule for evaluating Hackenbush strings (Berlekamp
1974):


“The sign is determined by the color of the edge touching the ground (+ for blue, -
for red). Move up the string until there is a change in edge color. This first pair of
differently colored consecutive edges represents the binary point. The number of
edges below this pair gives the integer part of the number. Above the pair, label each
edge with a binary digit, 1 or 0, according as its color agrees or disagrees with the
ground color, and adjoin an extra digit 1 if the string is finite.”


The rule is illustrated in fig. 14; also use it to check the values in fig. 12. 
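
For finite strings the rule quoted above is easy to mechanize. The sketch below is one reading of it: a string is given as a sequence of 'B' and 'R' edges listed from the ground upwards, and the function name `hackenbush_string_value` is hypothetical. The quoted rule does not spell out the case of a string with no color change; the code assumes such a string of n edges has value +n or -n.

```python
from fractions import Fraction

def hackenbush_string_value(edges):
    """Value of a finite Blue-Red string, edges given from the ground up."""
    if not edges:
        return Fraction(0)
    ground = edges[0]
    sign = 1 if ground == "B" else -1   # + for blue, - for red
    # Index of the first edge whose color differs from the edge below it.
    change = next((i for i in range(1, len(edges))
                   if edges[i] != edges[i - 1]), None)
    if change is None:
        # Assumption: a one-color string of n edges is worth n moves.
        return Fraction(sign * len(edges))
    # Edges below the differently colored pair give the integer part.
    integer_part = change - 1
    # Above the pair: 1 if the color agrees with the ground color, else 0,
    # followed by the extra digit 1 for a finite string.
    digits = [1 if e == ground else 0 for e in edges[change + 1:]] + [1]
    fraction = sum(Fraction(d, 2 ** (k + 1)) for k, d in enumerate(digits))
    return sign * (integer_part + fraction)

for s in ["B", "BR", "BBR", "BRR", "BRB", "RB"]:
    print(s, hackenbush_string_value(s))
```

The string "BR" gives ½, in agreement with the value {0|1} = ½ worked out in the text for a blue edge carrying a red edge.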


Infinitesimals and infinite ordinals can also be represented by (infinite) Hacken- 
bush strings (WW, pp. 309-313). 


Figure 14. Berlekamp’s Rule for Hackenbush strings. 


13. Simplifying games 


We may be able to simplify a game, in either of two ways (ONAG, pp. 109-112, 
WW, pp. 62-64): 


by deleting dominated options or by replacing reversible options. 


We used the former implicitly when we made our catalog of day-two games. If
in a game G = {A, B, C, ... | Z, Y, X, ...} we have B ≥ A (respectively Y ≤ Z),
then B dominates A (Y dominates Z) and A (Z) may be deleted, provided B (Y)
is retained.

Replacing reversible options is more subtle. A Right option X is reversible if
Left has a reply X^L which is at least as good for her as the original game, that is,
if X^L ≥ G. Then X may be replaced by the list of all the Right options,
X^{LR} = {X^{LR_1}, X^{LR_2}, ...} of X^L. Similarly, the Left option C is reversible if
Right has a reply C^R at least as good for him as G is, that is if C^R ≤ G, and C may
be replaced by the list of all the Left options C^{RL} of C^R.
For example, in the game

G = {0,*|*}

which is a more precise description of the shape labelled “↑” in fig. 10, neither Left
option dominates the other, since * || 0, but the Left option * is reversible, because
*^R = 0 ≤ G. (To see that G ≥ 0, note that if Right starts, his only option is
* = {0|0} and Left plays to 0 and wins.) So the Left option * may be replaced by
all the Left options of *^R = 0 = { | }. That is, it may be replaced by all the
members of the empty set, i.e., it may be deleted, and G simplifies to {0|*} = ↑.


Figure 15. Examining {↑|↑} for reversibility. (Annotations recoverable from the
figure: the Left option ↑ may be replaced by all the Left options of ↑^R = *, i.e., by 0;
the Right option ↑ is not reversible, because if Left starts in H, she goes to ↑ and
wins; hence H = {0|↑}.)

For another example, consider the game H = {↑|↑} and examine each option
for reversibility (fig. 15). Check that H satisfies the upstart equality (ONAG, p.
77, WW, p. 73),


{0|↑} = ↑ + ↑ + * = ⇑* (“doubleup star”).


14. Impartial games (ONAG, pp. 112-130, WW, pp. 81-116)


Remember that an impartial game is one where the set of Left options is the same
as the set of Right options. The impartial games that we have seen so far are


{ | } = *0 = 0,   {0|0} = *1 = *   and   {0,*|0,*} = *2.

On day n, the game


*n = {*0, *1, *2, ..., *(n-1) | *0, *1, *2, ..., *(n-1)}

is born. In fact any game of the type

{*a, *b, *c, ... | *a, *b, *c, ...}

has value *m, where m = mex{a, b, c, ...}, the least non-negative integer not in
the set {a, b, c, ...}. To see this, notice that any option *n with n > m is
reversible, because the option *m of *n is = *m and so *n may be deleted in favor
of the options of *m, namely 0, *, *2, ..., *(m-1).

This is the inductive step which proves the Sprague-Grundy theorem (Sprague
1935-36, Grundy 1939), which states that every (position in an) impartial game is
equivalent to a nim-heap.
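The mex rule is easy to try out by machine. The following is a minimal Python sketch, not taken from ONAG or WW; the names `mex` and `nimber_of` are our own, and an impartial game is represented simply by the list of nim-values of its options.

```python
def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


def nimber_of(option_values):
    """Nim-value m of the impartial game {*a, *b, ... | *a, *b, ...}."""
    return mex(option_values)


# Day n: the options *0, *1, ..., *(n-1) give the nimber *n.
print([nimber_of(range(n)) for n in range(6)])   # [0, 1, 2, 3, 4, 5]

# Options *0, *1, *2, *5: the reversible option *5 does not change the value.
print(nimber_of([0, 1, 2, 5]))                   # 3
```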

As the Left and Right options are the same, $*n$ is its own negative, $*n + *n = 0$.
Also, we need only write one set of options, and may define the nimber

$$*n = \{0, *, *2, \ldots, *(n-1)\}.$$


This exactly parallels von Neumann’s (1923) definition of ordinal numbers. 

Recall that the game of Nim is played with several heaps of beans. A move is to 
select a heap, and to remove any positive number of beans from it, possibly the 
whole heap. Any position in Nim is therefore the sum of several one-heap Nim 
games. The value of a single heap of n beans is *n. 


15. Nim-addition (ONAG, pp. 50-51, WW, pp. 60-61) 


We know that $*n + 0 = *n$ and $*n + *n = 0$. Let us calculate
$*2 + * = \{0, *\} + \{0\} = \{0 + *,\ * + *,\ *2 + 0\} = \{*, 0, *2\} = *3$.
Add $*$, or $*2$, to each side and obtain $*2 = *3 + *$ and $* = *3 + *2$. In
general,

$$\begin{aligned}
*a + *b &= \{0, *, *2, \ldots, *(a-1)\} + \{0, *, *2, \ldots, *(b-1)\} \\
        &= \{0 + *b,\ * + *b,\ \ldots,\ *(a-1) + *b,\ *a + 0,\ *a + *,\ \ldots,\ *a + *(b-1)\}
\end{aligned}$$

and we can build a nim-addition table (table 1) by noting that the options of an
entry are just the earlier entries in the same row and the earlier entries in the
same column. Each entry in table 1 is the least non-negative integer not appearing
as an earlier entry in the same row or column. For instance, $*5 + *6 = *3$,
because 3 is the first number not in the set $\{5, 4, 7, 6, 1, 0, 6, 7, 4, 5, 2\}$,
i.e., the first six entries in row 5 and the first five entries in column 6. In
the usual language, 3 is the nim-sum of 5 and 6, which is sometimes written
$5 \overset{*}{+} 6 = 3$.

Contrast the two equations $*5 + *6 = *3$ and $5 \overset{*}{+} 6 = 3$. In the
first, the summands are nimbers, i.e., values of impartial games, and the addition
is a game sum. In the second, the summands are nim-values and the addition is
nim-addition.


Nim-addition is perhaps better known as addition without carry in base 2,
vector or coordinatewise addition over GF(2), and XOR (exclusive or): it is
reassuring that it also follows from the more general idea of game sum.

Table 1
Nim-addition table. The stars have been omitted, i.e., the entries are nim-values instead of nimbers

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
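The agreement between the game-sum description and XOR can be checked directly. Here is a hedged Python sketch (the function names are ours): `nim_sum` follows the recursive rule that the options of an entry are the earlier entries in its row and column, and the result is compared with Python's bitwise `^`.

```python
from functools import lru_cache


def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


@lru_cache(maxsize=None)
def nim_sum(a, b):
    """Nim-sum of a and b, computed from the options of *a + *b."""
    options = [nim_sum(x, b) for x in range(a)]    # earlier entries in the column
    options += [nim_sum(a, y) for y in range(b)]   # earlier entries in the row
    return mex(options)


def nim_addition_table(size):
    """Rows 0..size-1 of the nim-addition table, as lists of nim-values."""
    return [[nim_sum(r, c) for c in range(size)] for r in range(size)]


# Print the 16 x 16 table; every entry coincides with r ^ c.
for row in nim_addition_table(16):
    print(" ".join(f"{v:2d}" for v in row))
```

Building the table from the mex rule is slow compared with XOR, but it uses nothing beyond the definition of the sum of two games.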

To summarize the Sprague-Grundy theory: the nim-value of the sum of two
impartial games is the nim-sum of their separate nim-values. Impartial games
belong to one of only two outcome classes: all positions are either

P-positions: previous-player-winning, nim-value zero, or

N-positions: next-player-winning, nonzero nim-value.

In the literature, P-positions are sometimes called “safe” or “good” or “winning”
without indicating which player enjoys this happy situation.

Table 2
Nim-sequences and periods for subtraction games

Subtraction game    Nim-sequence                      (ultimate) Period
S(1)                0101010101...                     2
S(2)                00110011001100...                 4
S(3)                000111000111000111...             6
S(1, 2)             0120120120120...                  3
S(1, 2, 3)          01230123012301230...              4
S(1, 2, 3, 4)       012340123401234012340...          5
S(2, 4, 7)          0011220310210210210...            3
S(2, 5, 6)          001102130210011021302100...       11
S(4, 10, 12)        00001111002211330022110000...     22
S(1, 2, 3, 4, ...)  0123456789...                     none

16. Subtraction games (WW, pp. 83-86 and 487-498) 


These are very simple examples of impartial games, played, like Nim, with heaps
of beans. A move in the game $S(s_1, s_2, s_3, \ldots)$ is to take a number of
beans from a heap, provided that number is a member of the subtraction-set,
$\{s_1, s_2, s_3, \ldots\}$. Analysis of these and other heap games is
conveniently recorded by a nim-sequence:

$$n_0 n_1 n_2 n_3 \cdots,$$

meaning that the nim-value of a heap of $h$ beans is $n_h$, $h = 0, 1, 2, \ldots$.
In this section, and often later, to avoid printing stars, we say that the
nim-value of a position is $n$, meaning that its value is the nimber $*n$.

Table 2 shows some examples: the first is a manifestation of She-Loves-Me- 
She-Loves-Me-Not; the last is Nim. If the subtraction-set is finite, the nim- 
sequence is (ultimately) periodic. But little is known about the length of the 
period vis-a-vis the membership of the subtraction set. 
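Nim-sequences of subtraction games are quickly generated by the mex rule. The following Python sketch is an illustration only; the function names `mex` and `nim_sequence` are ours.

```python
def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


def nim_sequence(subtraction_set, length):
    """Nim-values n_0, n_1, ..., n_(length-1) of the game S(subtraction_set)."""
    values = []
    for h in range(length):
        # From a heap of h beans we may move to h - s for each allowed s <= h.
        options = {values[h - s] for s in subtraction_set if s <= h}
        values.append(mex(options))
    return values


for s in [(1,), (1, 2, 3), (2, 4, 7), (2, 5, 6), (4, 10, 12)]:
    print(s, "".join(str(v) for v in nim_sequence(s, 26)))
```

Since every nim-value here is a single digit, joining the digits reproduces the compact form used for nim-sequences in this section.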

In subtraction games the nim-values 0 and 1 are remarkably related by
Ferguson’s Pairing Property (Ferguson 1974, WW, pp. 86 and 422): if $s_1$ is the
least member of the subtraction-set, then

$$G(n) = 1 \quad \text{just if} \quad G(n - s_1) = 0.$$

Here and later, $G(n) = v$ means that the nim-value of a heap of $n$ beans is $v$.
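The pairing property can be spot-checked numerically. The sketch below (Python; the helper names are ours, and a finite check is of course no substitute for Ferguson's proof) tests the equivalence for every heap size from $s_1$ up to a chosen bound.

```python
def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


def nim_sequence(subtraction_set, length):
    """Nim-values of heaps 0..length-1 in the subtraction game."""
    values = []
    for h in range(length):
        values.append(mex({values[h - s] for s in subtraction_set if s <= h}))
    return values


def ferguson_pairing_holds(subtraction_set, length):
    """Check G(n) = 1 just if G(n - s_1) = 0, for s_1 <= n < length."""
    values = nim_sequence(subtraction_set, length)
    s1 = min(subtraction_set)
    return all((values[n] == 1) == (values[n - s1] == 0)
               for n in range(s1, length))


print(ferguson_pairing_holds((2, 5, 6), 500))
```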


17. Take-and-break games (WW, pp. 81-106) 


Guy and Smith (1956) devised a code classifying a broad range of impartial games
played with heaps or rows. If the binary expansion of the kth code digit, $d_k$,
in the game with code

$$d_0 \cdot d_1 d_2 d_3 \cdots$$

is

$$d_k = 2^{a_k} + 2^{b_k} + 2^{c_k} + \cdots,$$

where $0 \le a_k < b_k < c_k < \cdots$, then it is legal to remove k beans from a
heap, provided that the rest of the heap is left in exactly $a_k$ or $b_k$ or
$c_k$ or ... non-empty heaps.

In order that the game should satisfy the ending condition, $d_0$ must be
divisible by 4, i.e., $a_0 \ge 2$.

Subtraction games are the special case $d_s = 3$ when s is in the
subtraction-set, and $d_k = 0$ otherwise.

Octal games are those with code digits $d_k \le 7$ for all k. Guy and Smith showed
that an octal game is ultimately periodic with period p, i.e., $G(n + p) = G(n)$
for all $n \ge n_0 = 2e + p + t$, provided that $G(n + p) = G(n)$ for $n < n_0$
apart from some exceptional values of n, of which e is the largest, and $d_k = 0$
for $k > t$, i.e., the maximum number of beans that may be taken from a heap is t.
Whether all such finite octal games are ultimately periodic remains a difficult
open question. They cannot be arithmetically periodic: that is, there is no period
p and saltus s > 0, such that G(n + p) = G(n) + s for all large enough n (WW,
p. 114).
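To experiment with octal games, one can decode the digits bit by bit. The Python sketch below is our own illustration and makes two assumptions: $d_0 = 0$, and every digit is at most 7, so that a heap is left in 0, 1 or 2 non-empty parts. The list `code_digits` holds $d_1, d_2, \ldots$.

```python
def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


def octal_nim_values(code_digits, length):
    """Nim-values G(0..length-1) of the octal game .d1 d2 d3 ... (d0 = 0 assumed)."""
    values = []
    for n in range(length):
        options = set()
        for k, d in enumerate(code_digits, start=1):
            if k > n:
                break
            rest = n - k
            if d & 1 and rest == 0:        # bit 2^0: remove a whole heap of size k
                options.add(0)
            if d & 2 and rest >= 1:        # bit 2^1: leave one non-empty heap
                options.add(values[rest])
            if d & 4 and rest >= 2:        # bit 2^2: leave two non-empty heaps
                for a in range(1, rest // 2 + 1):
                    options.add(values[a] ^ values[rest - a])
        values.append(mex(options))
    return values


kayles = octal_nim_values([7, 7], 160)          # the game .77
dawsons_kayles = octal_nim_values([0, 7], 40)   # the game .07
print(kayles[:24])
print(dawsons_kayles[:24])
```

Comparing `kayles[n]` with `kayles[n + 12]` for n beyond 70 exhibits the period of Kayles; a finite comparison of this kind only becomes a proof through the Guy and Smith criterion quoted above.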

Table 3 exhibits a dozen specimen games, of which the last three are
hexadecimal games with $d_k \le 15 = \mathrm{F}$. Such games may be arithmetically
periodic.

Recently, Gangolli and Plambeck (1989) have established the ultimate periodicity
of four octal games for which it was previously unknown. The game ·16 has
period 149459 (a prime!), the last exceptional value being G(105350) = 16. The
game ·56 has period 144 and last exceptional value G(326639) = 26. The games
·127 and ·376 each have period 4 (with cycles of values 4, 7, 2, 1 and 17, 33, 16,
32) and last exceptional values G(46577) = 11 and G(2268247) = 42, respectively.

Grundy’s Game (Grundy 1939, WW, p. 111), split a heap into two unequal 
heaps, continues to defy complete analysis, despite Mike Guy’s calculation of the 
first ten million nim-values. Among these, 


0,1,6,7, 10, 11, 12, 13, 18, 19, 20, 21, 24,... 


occur quite rarely. When written in binary, these values contain an even number
of ones after deleting the last digit. These rare values form a closed space (the
sparse space) under nim-addition, while the complement forms the common coset:

$$\text{rare} \overset{*}{+} \text{rare} = \text{rare} = \text{common} \overset{*}{+} \text{common},$$
$$\text{rare} \overset{*}{+} \text{common} = \text{common} = \text{common} \overset{*}{+} \text{rare}.$$


Table 3
Some sample take-and-break games

Code: ·77
Game: Kayles. Knock down 1 skittle, or 2 contiguous skittles, from a row
(Dudeney 1907, Loyd 1914).
Nim-sequence: Ultimate period p = 12, 412814721827, except that for n = 0, 3, 6,
9, 11, 15, 18, 21, 22, 28, 34, 39, 57, 70, the nim-value is 0, 3, 3, 4, 6, 7, 3,
4, 6, 5, 6, 3, 4, 6, respectively.

Code: ·137
Game: Dawson’s Chess. 3×n board. White and Black pawns on ranks 1 and 3.
Capturing obligatory. Looks partizan but is not (Dawson 1934, 1935).
Nim-sequence: 8112031103322445593301130211045374, except 0 for n = 0, 14, 34 and
2 for n = 16, 17, 31, 51. p = 34.

Code: ·07
Game: Dawson’s Kayles. Knock down 2 skittles, but only if they are contiguous.
Nim-sequence: As for ·137, but shifted one term: 0011203... in place of
011203....

Code: ·6
Game: Officers. Take 1 counter from any longer row (Blanche Descartes 1953).
Nim-sequence: No period found, G(10342) = 256.

Code: ·007
Game: Treblecross. One-dimensional tic-tac-toe (WW, pp. 93-94).
Nim-sequence: No pattern yet found.

Code: ·077
Game: Duplicate Kayles. Knock down 2 or 3 contiguous skittles (Guy and Smith
1956).
Nim-sequence: p = 24. Kayles with each nim-value repeated, 00112233114433....

Code: ·7777
Game: Double Kayles. Take up to 4 beans from a heap; leave rest in at most 2
heaps (Guy and Smith 1956, WW, p. 98).
Nim-sequence: p = 24. Kayles with each nim-value g replaced by the pair 2g,
2g + 1 or 2g + 1, 2g (according to a certain rule), 0123456732897654328945....

Code: ·156
Game: See Kenyon (1967a,b).
Nim-sequence: p = 349.

Code: ·165
Game: See Austin (1976).
Nim-sequence: p = 1550.

Code: ·8
Game: (First cousin of) Triplicate Nim. Take 1 from heap, rest left in exactly 3
non-empty heaps.
Nim-sequence: Arithmetically periodic, p = 3, saltus = 1. 0000111222333444....

Code: ·3F (F = 15)
Game: Kenyon’s Game. Take 1 from heap or take 2 and leave rest in any number of
heaps up to 3 (Kenyon 1967a,b).
Nim-sequence: p = 6, s = 3. 0120123453456786789....

Code: ·E (E = 14)
Game: Take 1, leave rest in just 1, 2 or 3 heaps.
Nim-sequence: 001234153215826514.... G(246) = 128. No known pattern.

If the nim-values in a sequence begin to cluster in a suitable common coset, this
clustering is likely to persist. In Kayles the rare and common values are evil and
odious numbers, respectively, with an even and odd number of ones in their
binary expansions. On the other hand, Dawson’s Kayles does not exhibit this
sparse-space phenomenon. In Grundy’s Game only 1273 rare values have
appeared; the only one in the range $36184 < n < 10^7$ is G(82860) = 108. If the
rare values have indeed died out, then Grundy’s Game will ultimately be periodic,
but the period may be astronomical.
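The first few nim-values of Grundy’s Game, and the split into rare and common values, can be explored with a short program. This Python sketch is ours; the quadratic-time loop is adequate for a few thousand heap sizes but is far from the kind of computation needed for ten million values.

```python
def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


def grundy_game_values(length):
    """Nim-values of single heaps 0..length-1 in Grundy's Game."""
    values = []
    for n in range(length):
        # Split n into two unequal heaps a < n - a, with a >= 1.
        options = {values[a] ^ values[n - a] for a in range(1, (n + 1) // 2)}
        values.append(mex(options))
    return values


def is_rare(value):
    """Even number of ones in binary after deleting the last digit."""
    return bin(value >> 1).count("1") % 2 == 0


values = grundy_game_values(2000)
rare_heaps = [n for n, v in enumerate(values) if is_rare(v)]
print(values[:30])
print(len(rare_heaps), "heaps below 2000 have rare nim-values")
```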

Amongst the comparative chaos, John Conway and Mike Guy have noted a
remarkable structure in the nim-values for Grundy’s Game, related to the number
59. The probability that $G(n + d) = G(n)$ is often as high as 1/2
if $d$ is near $59k$ and $d \equiv k \pmod{3}$.


Examples of these pseudo-periods are 58, 61, 116, 119, 122, 290, 293, 296, 360,
412, 580, 583, 586, 589, 647, 650, 882, 952 and 1172, where the last four
correspond to k = 11, 15, 16 and 20.


18. Green Hackenbush (ONAG, pp. 165-172, WW, pp. 183-190) 


This is played on a picture, as is Blue-Red Hackenbush, but now all the edges are
grEen, and may be chopped by Either player, making it an impartial game. Every
Green Hackenbush picture has a nim-value: for example (fig. 18) the value of a
string of six green edges is *6. We will see how to evaluate Green Hackenbush
trees by the Colon Principle and how to reduce any picture to a forest by the
Fusion Principle.

Green Hackenbush trees are examples of the ordinal sum (WW, p. 214) which
can be defined for any two games:

$$G : H = \{G^{L},\ G : H^{L} \mid G^{R},\ G : H^{R}\},$$

where any move in $G$ annihilates $H$, while a move in $H$ leaves $G$ unaffected.
The Colon Principle (WW, pp. 184-185) states that $H \ge K$ implies
$G : H \ge G : K$, and in particular, that $H = K$ implies $G : H = G : K$. That
is, $G : H$ depends only on the value of $H$ and not on its form. It may depend on
the form of $G$, because there are games $G_1 = G_2$ for which
$G_1 : H \ne G_2 : H$.

The Colon Principle applies at branch points of Green Hackenbush trees,
allowing us to do nim-addition up in the air. For example, in fig. 16 at a,
*3 + *2 = *; at b, * + * = 0; and at c, * + *2 = *3, so the value is the same as
that of fig. 17, where, at d, *2 + *2 + * + *4 = *5, and the tree is worth *6.
Notice the interplay of ordinary addition along strings, with nim-addition at
branch points.
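The interplay of the two additions turns into a short recursion for trees. In the Python sketch below, a tree is represented, by our own convention, as the list of subtrees growing from a node, so `[]` is a bare node; the sample tree is only our reading of the description of fig. 17 (one edge up to d, carrying strings of 2, 2, 1 and 4 edges), since the figure itself is not reproduced here.

```python
def string_of(edges):
    """A single stalk with the given number of green edges."""
    tree = []
    for _ in range(edges):
        tree = [tree]
    return tree


def green_tree_value(tree):
    """Nim-value of a rooted Green Hackenbush tree.

    Each branch contributes one more than the value above it (ordinary
    addition along the edge); branches meeting at a node are nim-added.
    """
    value = 0
    for subtree in tree:
        value ^= 1 + green_tree_value(subtree)
    return value


six_string = string_of(6)
d_node = string_of(2) + string_of(2) + string_of(1) + string_of(4)
tree_like_fig_17 = [d_node]      # one edge from the ground up to d

print(green_tree_value(six_string))        # 6
print(green_tree_value(d_node))            # 5
print(green_tree_value(tree_like_fig_17))  # 6
```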

Green Hackenbush pictures involving circuits can be evaluated by the Fusion
Principle (WW, pp. 186-188): the value of such a picture is unaltered if you
identify the nodes of a circuit. The edges of the circuit then become loops, which
may be replaced by twigs (compare figs. 20 and 21). Check that the value of fig.
19 is *8. In this way, every component of a Green Hackenbush picture can be
reduced to a tree, and hence to a string, and the strings are combined by
nim-addition.

Figure 16. Figure 17. Figure 18.
Figure 19. Figure 20. Figure 21.


19. Welter’s Game (Sprague 1947, Welter 1952, 1954, Berlekamp 1972, 
ONAG, pp. 153-165, WW, pp. 472-481) 


This is another game whose analysis involves the interplay of nim-addition and
ordinary addition. It is a form of Nim with unequal heaps, but in order to keep
track of empty heaps, only one of which is allowed, it is better to play it with
coins on a strip of squares, numbered 0, 1, 2, ..., with at most one coin on a
square. A move is to shift a coin leftwards to any unoccupied square, possibly
passing over other coins. The game ends when the k coins are on the left-most
squares 0, 1, ..., k - 1. Figure 22 shows a position with k = 7.

Figure 22. The position {1, 2, 3, 5, 8, 13, 21} in Welter’s Game.


To calculate the nim-value, or Welter function, $[a|b|c|\ldots]$, of the position
with k coins on squares a, b, c, ..., first note that for k = 1, $[a] = a$, and
that for k = 2, $[a|b]$ is one less than the nim-sum of a and b: e.g.,
$[1|3] = 1$, $[5|6] = 2$. For more than two coins, mate the pair that is
congruent modulo the highest power of 2 (it does not matter that this pair may
not be unique). Remove the mated pair and find the best mated pair among the
remaining k - 2 coins. Continue until all coins are mated, except, when k is odd,
for one coin, the spinster, s. Then, if (a, b), (c, d), ... are the mates,
$[a|b|c|d|\ldots]$ is the nim-sum

$$[a|b] \overset{*}{+} [c|d] \overset{*}{+} \cdots\ (\overset{*}{+}\ s),$$

where the last term is included just if k is odd.

In fig. 22 the best mates are (5, 21), then (1, 13), then (2, 8), and 3 is the
spinster, so the nim-value is

$$[1|2|3|5|8|13|21] = [5|21] \overset{*}{+} [1|13] \overset{*}{+} [2|8] \overset{*}{+} 3
 = 15 \overset{*}{+} 11 \overset{*}{+} 9 \overset{*}{+} 3 = 14.$$
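The mating method translates directly into a program. The following Python sketch is our own rendering of the rule just described, with hypothetical names `welter` and `shared_low_bits`; ties between equally good pairs are broken arbitrarily, which the text says is harmless.

```python
from itertools import combinations


def shared_low_bits(a, b):
    """Largest e such that a and b are congruent modulo 2**e (a != b)."""
    diff = a ^ b
    return (diff & -diff).bit_length() - 1


def welter(squares):
    """Welter function [a|b|c|...] of coins on distinct squares."""
    coins = list(squares)
    if len(set(coins)) != len(coins):
        raise ValueError("at most one coin per square")
    value = 0
    while len(coins) >= 2:
        # Mate the pair congruent modulo the highest power of 2.
        a, b = max(combinations(coins, 2),
                   key=lambda pair: shared_low_bits(pair[0], pair[1]))
        value ^= (a ^ b) - 1          # [a|b] is one less than the nim-sum
        coins.remove(a)
        coins.remove(b)
    if coins:                         # the spinster, when k is odd
        value ^= coins[0]
    return value


print(welter([1, 2, 3, 5, 8, 13, 21]))   # 14
```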


It turns out that $[a|b|c|d] = 0$ just if the nim-sum
$a \overset{*}{+} b \overset{*}{+} c \overset{*}{+} d = 0$, so Welter’s
Game with four coins can be played with a Nim-like strategy. To play with three
coins, imagine a fourth coin on an extra square -1, and add 1 to the numbers
labelling the squares while you calculate your move. For example, {2, 5, 8} is
like {0, 3, 6, 9}, where the winning move would be to {0, 3, 5, 6}, so move to
{2, 4, 5}.

The mating method makes it easy to calculate the nim-value of a Welter
position, but it is not so easy to find the good moves which make the nim-value 0.
However, there is a remarkable connexion with frieze patterns (Conway and 
Coxeter 1973, WW, pp. 475-480), which work for nim-addition as well as for 
multiplication and ordinary addition, and which allow you (or your computer) 
both to calculate the value of the Welter function and to invert it. 

Start with a row of zeros above the Welter position that you want to evaluate,
and manufacture a frieze pattern (so called because, when it is extended to the
right, it eventually repeats periodically) by completing diamonds,

$$\begin{matrix} & b & \\ a & & d \\ & c & \end{matrix}
\qquad \text{using the rule} \quad a \overset{*}{+} d = (b \overset{*}{+} c) + 1,$$

so that $c = [a|d] \overset{*}{+} b$, where the sums are still nim-sums. Lo and
behold (fig. 23) the value of the Welter function appears at the foot of the
pattern, as follows from a formula on page 159 of ONAG.


Figure 23. Calculating the Welter function from a frieze pattern. 


If you want to change the value $n = [a|b|c|\ldots]$ to some $n' \ne n$, then
there are unique $a' \ne a$, $b' \ne b$, $c' \ne c$, ... such that

$$[a'|b|c|\ldots] = n' = [a|b'|c|\ldots] = [a|b|c'|\ldots] = \cdots$$

Then $[a|b|c|\ldots] = n$ remains true if we replace any even number of
$a, b, c, \ldots, n$ by the corresponding primed letters. This Even Alteration
Theorem (ONAG, pp. 160-162; WW, p. 477) may be written

$$\left[\begin{matrix} a & b & c & \cdots \\ a' & b' & c' & \cdots \end{matrix}\right]
 = \begin{matrix} n \\ n' \end{matrix}$$

To find a', b', c', ... corresponding to a given n', continue the bottom row of
the frieze pattern, n, n', n, n', n, ... alternately, then extend the pattern to
the right, using the same diamond rule. You will discover that the defining row,
a, b, c, ... continues a', b', c', ....

In fig. 24 we find the good moves in the position {1, 2, 3, 5, 8, 13, 21} by
choosing n' = 0 and extending the pattern of fig. 23. If you extend it even
further to the right, you will see why it is called a frieze pattern. If you
believe the algorithm, and read the second row of fig. 24,

$$\left[\begin{matrix} 1 & 2 & 3 & 5 & 8 & 13 & 21 \\ 15 & 0 & 37 & 35 & 10 & 11 & 19 \end{matrix}\right]
 = \begin{matrix} 14 \\ 0 \end{matrix}$$

Figure 24. Inverting the Welter function using a frieze pattern.


Check that each move leads to a P-position. Some of the suggested moves, e.g., 1
to 15, 3 to 37, are not legal, but, provided n' < n, you will always find one that
is; in fact there is always an odd number of legal good moves. Here there are
three good moves:

2 to 0, 13 to 11, and 21 to 19.

We can even give you a strategy for the misère form (last player losing) of
Welter’s Game, if you are willing to learn about Abacus Positions (WW, pp.
478-481).


20. Coin-turning games (WW, pp. 429-456) 


Several of the impartial games we have already mentioned, and a wide range of 
new games, can be realized by an idea of Hendrik Lenstra. Turning Turtles was 
originally played with turtles, but it is less cruel to play it with a row of coins (fig. 
25). A move is to turn a head to a tail, with the additional option of turning at 
most one other coin, to the left of it. This second coin may go from head to tail, 
or from tail to head. The game is over when all coins show tails, and the last
player wins.

Figure 25. A Turning Turtles position, with coins 3, 4, 6, and 7 showing heads.

We leave you to verify that this is a disguise for Nim: if you number the coins 1,
2, 3, ... from the left, then the value of coin n is *n if it is a head, 0 if it
is a tail. The value of a general position is the nim-sum of the heads. For
example, the good moves in fig. 25 are to turn coin 6 to a tail; or to turn 7 to a
tail and 1 to a head; or to turn 4 to a tail and 2 to a head.
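The search for good moves is the usual Nim calculation. A small Python sketch follows (the function name and the `(coin, other)` return convention are ours): for each head we ask whether turning it can be combined with a coin further left so that the nim-sum of the heads becomes zero.

```python
from functools import reduce
from operator import xor


def turning_turtles_good_moves(heads):
    """Good moves as pairs (coin turned to tail, other coin turned or None)."""
    total = reduce(xor, heads, 0)          # nim-sum of the heads
    moves = []
    for coin in sorted(heads):
        target = coin ^ total              # value the coin must be replaced by
        if target < coin:                  # the second coin must lie to the left
            moves.append((coin, target if target > 0 else None))
    return moves


# Coins 3, 4, 6 and 7 show heads, as in the caption of fig. 25.
print(turning_turtles_good_moves([3, 4, 6, 7]))
# [(4, 2), (6, None), (7, 1)]
```

If the other coin is currently a head it becomes a tail, and vice versa; in either case the nim-sum of the heads changes by its number.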

Mock Turtles is played in the same way, but a move may turn one, two or three
coins, provided the rightmost turned goes from head to tail (this is to make the
game satisfy the ending condition). We now number the coins from zero (the
Mock Turtle) and find the nim-value (or Grundy function), G(n), of the nth coin,
when head up, to be

n    = 0 1 2 3 4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 ...,
G(n) = 1 2 4 7 8 11 13 14 16 19 21 22 25 26 28 31 32 35 37 ....

These are the odious numbers which we met as common values in Kayles.

G(n) = 2n or 2n + 1.

To find which, write n in binary and append a check digit, 0 or 1, to make an odd
number of 1-digits.
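Both descriptions of G(n) can be compared by machine. In this Python sketch (names ours), `mock_turtle_values` works from the rules of the game, using the fact that the value of a position is the nim-sum of the values of its heads, while `odious_from_check_digit` applies the check-digit rule.

```python
def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


def mock_turtle_values(count):
    """G(0..count-1) for Mock Turtles, from the rules of the game."""
    values = []
    for n in range(count):
        options = {0}                      # turn coin n alone
        options.update(values)             # coin n and one coin to its left
        options.update(values[i] ^ values[j]          # coin n and two others
                       for i in range(n) for j in range(i + 1, n))
        values.append(mex(options))
    return values


def odious_from_check_digit(n):
    """Append a binary check digit to n to make an odd number of 1-digits."""
    check = 1 if bin(n).count("1") % 2 == 0 else 0
    return 2 * n + check


print(mock_turtle_values(19))
print([odious_from_check_digit(n) for n in range(19)])
```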

Moebius, Mogul, and Moidores are the corresponding games in which a move
turns up to t coins, where t = 5, 7, and 9, respectively. We consider only odd
values of t, because the Mock Turtle Theorem gives us the results for even values
of t:

Every nim-value for the t = 2m + 1 game
is an odious number.

The corresponding value for the t = 2m game
is got by dropping the final binary digit.


The nim-values for coins 0 to 17 (when head-up) in Moebius are shown in table
4. The structure of the P-positions in 18-coin Moebius is revealed on replacing
the coin numbers by the labels in the third row.

Table 4
Eighteen-coin Moebius gives the game its name

Coin number  0  1  2  3   4   5   6   7    8    9   10   11   12   13   14   15   16   17
Nim-value    1  2  4  8  16  31  32  64  103  128  171  213  256  301  342  439  475  494
Label        ∞  1  4  0  -4  -1   5   6   -8    2   -3   -5    8    3   -7    7   -6   -2

Coins 0 to 5, with labels ∞, 0, ±1, ±4, clearly form a P-position (whichever
ones you turn over, I will turn over the rest). Starting from this, or any
P-position, we can find others by operating on the labels with any Möbius
transformation (modulo 17):

$$x \to \frac{ax + b}{cx + d} \quad \text{with } ad - bc = 1.$$

There are 1 + 102 + 153 + 153 + 102 + 1 P-positions
with      0     6     8    10    12   18 heads, respectively.

They correspond to the $2^9$ codewords in the [18, 9, 6] extended quadratic
residue code. If we drop the Mock Turtle (at ∞) we have the t = 4 game on 17
coins, whose P-positions correspond to codewords in the [17, 9, 5] quadratic
residue code.


Similarly if we play 24-coin Mogul (t = 7, turn up to 7 coins) we find

1 + 759 + 2576 + 759 + 1 P-positions
with 0     8     12    16   24 heads, respectively,

coinciding with the $2^{12}$ codewords of the extended [24, 12, 8] Golay code.
With t = 6 and 23 coins we have the perfect [23, 12, 7] Golay code (Curtis 1976,
1977).

In the Ruler Game any number of contiguous coins may be turned (rightmost
always going from head to tail). If the coins are numbered from 1, the nim-value,
G(n), is the highest power of 2 that divides n.

In Turnips (or Ternups) a move turns three equally spaced coins. Number the
coins from 0 and write n in ternary (base 3). G(n) is the kth odious number if the
last 2-digit in the ternary expansion is in the kth place from the right, or
G(n) = 0 if there is no 2-digit.

There is a plethora of such coin-turning games. They can also be played on a
two-dimensional array of coins. For example, we can play the Cartesian product,
A × B, of two one-dimensional games, in which a move is to turn all coins with
coordinates $(a_i, b_j)$, where $\{a_i\}$ and $\{b_j\}$ are sets of coins
constituting legal moves in games A and B, respectively. To satisfy the ending
condition, the “most northeasterly” coin turned must go from head to tail
(fig. 26).


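Figure 26. A typical move in Moebius × Turnips.

The nim-value of a (head-up) coin in such a game is given by the Tartan
Theorem:

The nim-values for the game A × B are the
nim products of those for A and B:

$$G_{A \times B}(a, b) = G_A(a) \times G_B(b),$$

where $G_A(a)$ is the nim-value of coin number a in game A, etc., and × denotes
nim-multiplication. Nim-multiplication (ONAG, pp. 52-53, Lenstra 1977) may be
defined from the field laws (e.g., associativity and distributivity over
nim-addition), together with the rule

$$n \times N = nN \ \text{(the ordinary product) for } n < N, \qquad N \times N = 3N/2,$$

where N is any Fermat power of 2 ($2, 4, 16, \ldots, 2^{2^k}, \ldots$). For
example, 2 × 2 = 3, because 2 is a Fermat power, while
$2 \times 3 = 2 \times (2 \overset{*}{+} 1) = 3 \overset{*}{+} 2 = 1$. And

$$\begin{aligned}
13 \times 7 &= (4 \times 3 \overset{*}{+} 1) \times 7
  = (4 \times (2 \overset{*}{+} 1)) \times (4 \overset{*}{+} 2 \overset{*}{+} 1) \overset{*}{+} 7 \\
 &= 4 \times 4 \times (2 \overset{*}{+} 1) \overset{*}{+} 4 \times 2 \times (2 \overset{*}{+} 1)
    \overset{*}{+} 4 \times (2 \overset{*}{+} 1) \overset{*}{+} 7 \\
 &= 6 \times (2 \overset{*}{+} 1) \overset{*}{+} 4 \times (3 \overset{*}{+} 2)
    \overset{*}{+} 4 \times 3 \overset{*}{+} 7 \\
 &= (4 \overset{*}{+} 2) \times (2 \overset{*}{+} 1) \overset{*}{+} 4 \times 1
    \overset{*}{+} 12 \overset{*}{+} 7 \\
 &= 4 \times 2 \overset{*}{+} 2 \times 2 \overset{*}{+} 4 \overset{*}{+} 2
    \overset{*}{+} 4 \overset{*}{+} 11
  = 8 \overset{*}{+} 3 \overset{*}{+} 9 = 2.
\end{aligned}$$

The assiduous reader will verify that the nim cube roots of 1 are 1, 2, and 3, and
the nim fifth roots are 1, 8, 10, 13, 14.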
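Readers who prefer to let a machine be assiduous can use the following Python sketch. It does not use the Fermat-power rule above; instead it assumes the equivalent mex definition of the nim-product given in ONAG, namely that $a \times b$ is the least number not of the form $(a' \times b) \overset{*}{+} (a \times b') \overset{*}{+} (a' \times b')$ with $a' < a$ and $b' < b$. The function names are ours, and the method is only practical for small arguments.

```python
from functools import lru_cache


def mex(values):
    """Return the least non-negative integer not in `values`."""
    seen = set(values)
    m = 0
    while m in seen:
        m += 1
    return m


@lru_cache(maxsize=None)
def nim_mul(a, b):
    """Nim-product of a and b from the mex definition (small arguments only)."""
    return mex(nim_mul(x, b) ^ nim_mul(a, y) ^ nim_mul(x, y)
               for x in range(a) for y in range(b))


def nim_power(x, exponent):
    """x nim-multiplied by itself `exponent` times (exponent >= 1)."""
    result = x
    for _ in range(exponent - 1):
        result = nim_mul(result, x)
    return result


print(nim_mul(2, 2), nim_mul(2, 3), nim_mul(4, 4), nim_mul(13, 7))
print([x for x in range(1, 16) if nim_power(x, 3) == 1])
print([x for x in range(1, 16) if nim_power(x, 5) == 1])
```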


21. Lexicodes 


Conway and Sloane (1986) have noticed a striking connexion between the
calculation of nim-values, involving, as it does, the mex of a set of non-negative
integers, which is the lexicographically first number not in the set; and the
construction of error-correcting codes by successively choosing the
lexicographically first codeword to satisfy the required distance, weight, and
other conditions. Both processes use the greedy algorithm, which leads to some
not-always-expected isomorphisms.
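The greedy construction is short enough to try directly. The Python sketch below is our own illustration (the names `lexicode` and `heads_nim_sum` are not from Conway and Sloane): it builds the binary lexicode of length 7 and minimum distance 3, and compares it with the Turning Turtles positions on 7 coins whose heads have nim-sum zero, reading bit i of a word as coin i + 1.

```python
def lexicode(length, min_distance):
    """Greedy binary lexicode: scan words in order, keep those far enough away."""
    code = []
    for word in range(2 ** length):
        if all(bin(word ^ c).count("1") >= min_distance for c in code):
            code.append(word)
    return code


def heads_nim_sum(word):
    """Nim-sum of the coin numbers (1, 2, 3, ...) whose bits are set in `word`."""
    total = 0
    position = 1
    while word:
        if word & 1:
            total ^= position
        word >>= 1
        position += 1
    return total


code = lexicode(7, 3)
print(code)
print(all(heads_nim_sum(w) == 0 for w in code))
```

The code has 16 words and is closed under XOR, as expected of the [7, 4, 3] Hamming code.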

Unrestricted binary lexicodes are linear (Marc Best 1975, unpublished). If the
base is a power of 2, unrestricted lexicodes are closed under nim-addition. If the
base is a Fermat power of 2, i.e., $2^{2^k}$, then unrestricted lexicodes are also
closed under nim-multiplication. Table 5 gives a glimpse of the numerous
connexions between impartial games and error-correcting codes, and with several
other branches of combinatorics.

Table 5
Close connexions between coin-turning games and lexicodes

Games (WW, chapter 14): Turning Turtles on 7 coins; Nim with heaps of up to 7
beans.
Codes (Conway and Sloane 1986): The [7, 4, 3] Hamming code.

Games: Mock Turtles on 8 coins.
Codes: The [8, 4, 4] extended Hamming code.

Games: The Mock Turtle Theorem.
Codes: Extending codes by a parity check digit.

Games: Moebius on 17 (18) coins.
Codes: The (extended) [17, 9, 5] ([18, 9, 6]) quadratic residue code.

Games: Mogul on 23 (24) coins; automorphism group $M_{23}$ ($M_{24}$); the MOG
(Curtis 1976, 1977).
Codes: The (extended) [23, 12, 7] ([24, 12, 8]) Golay code. (The Steiner system
S(5, 8, 24). The Mathieu groups $M_{23}$, $M_{24}$. The Leech lattice.)

Games: Moidores (t = 9) on 27 (31) coins.
Codes: Binary [27, 9, 10] ([31, 12, 10]) lexicode.

Games: Welter’s Game; Nim with unequal heaps; connexion with frieze patterns.
Codes: Constant-weight binary lexicodes with distance 4.

Games: Mathematical Blackjack; Mathieu’s Vingt-et-un (Curtis 1984).
Codes: The [12, 4] lexicodes of constant weight 6 with Ryba’s restriction (sum of
digits at least 21). (Steiner system S(5, 6, 12). The Mathieu group $M_{12}$.)


22. The remoteness function (WW, pp. 259-266) 


Steinhaus (1925) gave an early analysis of impartial games, using what Smith
(1966) has since called the remoteness function. This is useful for games where the
idea of sum does not arise naturally, or does not arise at all (if a move may affect
more than one component). We will later see how it serves to analyse joins of
games in which moves must simultaneously be made in all components. It can
apply to partizan games (where we define Left and Right remotenesses) and in
places where we may want to amend the rules: for example, in misère play (where
the last player loses), or where the ending condition does not hold (so that some
remotenesses may be infinite).

Intuitively, the remoteness is the number of turns that the game lasts when the
winner is trying to win as quickly as possible, while the loser tries to postpone
defeat as long as possible. In normal play the last player wins, so she wants to
move to a terminal position (remoteness 0), and, more generally, to a position of
even remoteness, forcing her opponent always to move to positions of odd
remoteness. Terminal positions have remoteness 0; positions with an option of
remoteness 0 have remoteness 1; positions all of whose options have remoteness
1 are of remoteness 2; and so on.


Rules for calculating the remoteness

If there is an option of even remoteness:  1 + least even remoteness    (N-position);
If all options have odd remoteness:        1 + greatest odd remoteness  (P-position);
If there are no options:                   0                            (terminal position).
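For a game that does satisfy the ending condition, the three rules become a direct recursion. The Python sketch below is ours and assumes a finite game without loops, described by a function `options` returning the positions reachable in one move; it does not handle infinite remoteness. The subtraction game S(1, 2) serves as a small example.

```python
from functools import lru_cache


def make_remoteness(options):
    """Build a remoteness function for a loop-free game given by `options`."""

    @lru_cache(maxsize=None)
    def remoteness(position):
        option_values = [remoteness(p) for p in options(position)]
        if not option_values:
            return 0                              # terminal position
        even = [r for r in option_values if r % 2 == 0]
        if even:
            return 1 + min(even)                  # win as quickly as possible
        return 1 + max(option_values)             # postpone defeat as long as possible

    return remoteness


def subtraction_options(heap):
    """Moves in S(1, 2): take 1 or 2 beans from a single heap."""
    return tuple(heap - s for s in (1, 2) if s <= heap)


remoteness = make_remoteness(subtraction_options)
print([remoteness(h) for h in range(10)])   # [0, 1, 1, 2, 3, 3, 4, 5, 5, 6]
```

The heaps of even remoteness are the multiples of 3, in agreement with the nim-sequence 012012... for S(1, 2).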


The P-positions are those of even remoteness; the N-positions those of odd
remoteness. If the game does not satisfy the ending condition, it may not always
be possible to continue assigning remotenesses according to these rules. The
remaining positions are O-positions (open positions, for which neither player has a
winning strategy), to which we assign infinite remoteness.

Epstein’s (1967) Put-or-Take-a-Square game (WW, pp. 484-486 and 501-502)
is played with a single heap of beans. A move is to add or take away the largest
perfect square number of beans in the heap. The object is to take the last bean, so
the empty heap has remoteness 0, and a positive perfect square number of beans
has remoteness 1. Figure 27 shows some heap sizes with legal moves indicated by
arrows. We can next assign remoteness 2 to heaps of 5 and 20, because both
options have remoteness 1. Then 11, 14, 21, 30, 41, 54 have remoteness 3, since in
each case there is an option of remoteness 2 (and no option of remoteness 0). A
heap of 29 has remoteness 4 because both options (4 and 54) have odd
remoteness, and the larger is 3.

Each of 2 and 3 is an option of the other, and sensible players will not choose
the other options, 1 and 4, because these have (odd) remoteness 1 and lose
(immediately). So best play goes 2, 3, 2, 3, 2, ... and the remotenesses are
infinite. Table 6 shows a few P- and N-positions, but the great majority of
positions,

2, 3, 6, 7, 8, 10, 12, 13, 15, 17, 18, 19, 22, 23, ...,

are O-positions with infinite remoteness.

The game of Fair Shares and Varied Pairs is played with heaps (WW, pp. 359,
390). A move either divides a heap into two or more equal-sized heaps or
combines two different-sized heaps into a single heap. The terminal positions, of
remoteness 0, are those with all heaps of size 1: i.e., the bottom row in table 7,
where exponents indicate repetitions of heap size. The next row up shows
positions with just one splittable heap (of size > 1): these have remoteness 1.

A dramatic change in the game occurs when we play with 11 beans. There is
one P-position, $1^{11}$; ten N-positions, $m \cdot 1^{11-m}$
($m = 2, 3, \ldots, 11$), of remoteness 1; and the other 45 partitions of 11, all
those with two or more splittable heaps, are all O-positions.


Figure 27. Assigning remotenesses in Epstein’s game.

Simon Norton’s game of Tribulations (WW, pp. 486 and 501-502) is similar to
Epstein’s Put-or-Take-a-Square game, but with triangular numbers, 1, 3, 6, 10,
15, ..., used in place of squares. The largest possible triangular number is taken
from or added to the heap. Norton conjectures that there are no O-positions, and
that the N-positions outnumber the P-positions in golden ratio,
$\tau = (1 + \sqrt{5})/2 = 1.618\ldots$; these conjectures have been verified for
heap sizes of less than 5000.

For Mike Guy’s game of Fibulations (similar to Simon Norton’s, but using the
Fibonacci numbers plus one, 1, 2, 3, 4, 6, 9, 14, 22, ...) the corresponding
assertions can be proved, and indeed there is a complete theory (WW, pp. 486
and 501).

John Isbell’s game of Beanstalk (Guy 1986) starts with a positive integer, $n_0$.
Moves to successive positions, $n_1, n_2, \ldots$ are given by

$$n_{i+1} = 3n_i \pm 1 \quad (n_i \text{ odd}), \qquad n_{i+1} = \tfrac{1}{2} n_i \quad (n_i \text{ even}).$$

For John Conway’s game of Beans-Don’t-Talk, the rule is
$n_{i+1} = (3n_i \pm 1)/2^{*}$, where $2^{*}$ is the largest power of 2 which will
divide the numerator. The object in both games is to move to 1. It is not known if
there are any O-positions in either game. The first few remotenesses are shown in
table 8.

Table 8
Remotenesses for Beanstalk and Beans-Don’t-Talk

r(B)    0  1  7  2  5  8 65  3 11  6 63  9  9 66 17  4 61 12 69  7  7 64 15 ...
n       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ...
r(BDT)  0  3  1  5  1  6  2  5  3  3  1  6  4  9  7 24  5 10  3  8  1 13  6 ...


23. A dozen ways for simultaneous plays (Smith 1966, ONAG, pp. 173-187) 


So far we have combined games by their (disjunctive) sum, and usually assumed 
normal play, with the last player winning in the last component to end. But there 
are other ways to play several games simultaneously. 

When it is your turn to move, instead of playing in just ONE component, you 
might move in ALL of them, or in SOME (maybe all, but at least one). We will 
call these alternative combinations joins or unions, respectively, to distinguish 
them from sums. 

How do games end? For sums and unions, it is natural to continue so long as 
there is a component with a legal move. For joins, it is more natural to stop as 
soon as play ends in any component, since it is no longer possible to move in 
every component. However, it is possible to consider the opposite state of affairs 
in each case. These possibilities are summarized in table 9, with mnemonic hints, 
using the initial letters of the names of the games, for the methods of analysis. 


Table 9
Six ways to combine and end games

Name          Compound                 Moves made in      Finish determined when      Analysis
                                       ... component(s)   ... component(s) finished
Proper sum    Disjunctive              ONE                ALL (long)                  Plain nim-value
Quick sum     Diminished Disjunctive   ONE                FIRST (short)               Queer, or foreclosed nim-value (ONAG, pp. 178-179)
Rapid join    Conjunctive              ALL                FIRST (short)               Remoteness
Slow join     Continued Conjunctive    ALL                ALL (long)                  Suspense number
Tardy union   Selective                SOME               ALL (long)                  Tally (WW, p. 281) (toll and timer)
Urgent union  Severed Selective        SOME               FIRST (short)               Unrestricted tally


And who is the winner? In normal play the last person to play wins: you lose if
you are unable to move. But in misère play the last player to move loses. These
two outcomes, combined with the six possibilities of table 9, provide a dozen
methods of conducting simultaneous play. We will do no more than comment
briefly on a few of these. Indeed, we have no theory for misère unions of partizan
games, and misère sums, for even quite simple impartial games, quickly get
beyond our grasp.


24. Misère Nim and an awful warning (ONAG, pp. 136-152, WW,
pp. 393-426, Grundy and Smith 1956)


Misère play of ordinary sums is very complicated, even in the impartial case.
When Bouton (1901-02) analyzed (and named) Nim, he noted that his analysis
also applied to the misère form.

Play always to P-positions (nim-sum zero) until there is just one heap with
more than one bean. Then take all, or all but one, of that heap, to leave an even
number of singleton heaps in normal play, or an odd number of singletons in
misère play.
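
Bouton's rule can be sketched in a few lines of Python. This is only an
illustration under our own conventions: the function name nim_move and the
(heap index, new size) return value are assumptions, not notation from the
text. It returns a winning move when one exists and None otherwise.

```python
from functools import reduce
from operator import xor


def nim_move(heaps, misere=False):
    """Return (heap_index, new_size) for a winning move, or None if there is none."""
    big = [i for i, h in enumerate(heaps) if h > 1]   # heaps with more than one bean
    singles = sum(1 for h in heaps if h == 1)         # singleton heaps
    target = 1 if misere else 0                       # parity of singletons to leave

    if len(big) == 1:
        # Just one big heap: take all of it, or all but one bean.
        leave = 0 if singles % 2 == target else 1
        return big[0], leave

    if not big:
        # Only singletons remain: the only possible move removes one of them.
        if singles > 0 and (singles - 1) % 2 == target:
            return heaps.index(1), 0
        return None

    # Two or more big heaps: play to nim-sum zero, as in normal play.
    nim_sum = reduce(xor, heaps)
    for i, h in enumerate(heaps):
        if (h ^ nim_sum) < h:
            return i, h ^ nim_sum
    return None
```

For example, nim_move([3, 1, 1], misere=True) reduces the heap of 3 to a single
bean, leaving three singletons, whereas in normal play the same heap is removed
entirely.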

This simple rule has misled several writers into thinking that a similar device
can be used in any impartial game. This is true for very few games. For
example, the known single-heap P-positions in Grundy’s Game contain 0, 1, 2, 4,
7, 10, 20, 23, 26, 50, 53, 270, 273, 276, 282, 285, 288, 316, 334, 337, 340, 346, 359,
362, 365, 386, 389, 392, 566, 630, 633, 636, 639, 673, 676, 682, 685, 923, 926, 929,
932, or 1222 beans, and there are probably no others; those in the misère version
contain 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 50, or 94 beans
(Allemang 1984), and there may be others.

Complete analyses of specific misére games are only known for tame games 
(WW, pp. 407-410). In a few sporadic cases it has been possible to exhibit a 
winning strategy without having to give a complete analysis. Notable examples are 
Lucasta (WW, pp. 556-563), Welter’s Game (WW, pp. 480-481), and Sibert’s 
recent analysis (Sibert and Conway 1992) of misére Kayles. 


25. Joins (WW, pp. 257-278) 


We can analyze (rapid) joins, partizan or impartial, normal or misére, using 
Steinhaus’s remoteness function. If the game is partizan, calculate the Left and 
Right remotenesses separately, using the remotenesses of the Right and Left 
options, respectively. For misére play, reverse the parity of the normal remote- 
ness rules (table 10). For impartial joins, omit all references to Left and Right, 
giving a single normal (or misére) remoteness function. Since a rapid join of 
games ends when the first component ends, the remoteness (Left or Right or 
impartial, normal or misére) of the RAPID JOIN is the 


LEAST REMOTENESS of any component. 


To win a rapid join, move to a position in which your opponent’s remoteness is
EVEN in NORMAL PLAY, ODD in MISERE PLAY.


Table 10
Remoteness rules for rapid joins

In each component, to calculate the
    LEFT NORMAL    RIGHT NORMAL    LEFT MISERE    RIGHT MISERE
remoteness, take it as ZERO if there is no option for
    LEFT           RIGHT           LEFT           RIGHT.
Otherwise ADD ONE to the LEAST
    EVEN RIGHT     EVEN LEFT       ODD RIGHT      ODD LEFT
remoteness of any
    LEFT           RIGHT           LEFT           RIGHT
option which has such a remoteness, else ADD ONE to the GREATEST
    ODD RIGHT      ODD LEFT        EVEN RIGHT     EVEN LEFT
remoteness of any
    LEFT           RIGHT           LEFT           RIGHT
option which has such a remoteness.
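
As a small illustration of the impartial form of these rules, the following
Python sketch computes remotenesses for a toy component. The component used
here (a heap from which 1 or 3 beans may be taken) and all function names are
our own assumptions, chosen only to make the example self-contained.

```python
from functools import lru_cache

SUBTRACTION_SET = (1, 3)  # hypothetical example: take 1 or 3 beans from a heap


def options(heap):
    """Heap sizes reachable in one move in the toy component."""
    return [heap - s for s in SUBTRACTION_SET if s <= heap]


@lru_cache(maxsize=None)
def remoteness(heap, misere=False):
    """Impartial remoteness: EVEN is sought in normal play, ODD in misere play."""
    opts = [remoteness(o, misere) for o in options(heap)]
    if not opts:
        return 0                      # no option: remoteness ZERO
    wanted = 1 if misere else 0
    good = [r for r in opts if r % 2 == wanted]
    if good:
        return 1 + min(good)          # ADD ONE to the LEAST wanted-parity remoteness
    return 1 + max(opts)              # else ADD ONE to the GREATEST of the others


def rapid_join_remoteness(heaps, misere=False):
    """A rapid join ends with its first component: take the LEAST remoteness."""
    return min(remoteness(h, misere) for h in heaps)


def mover_wins_rapid_join(heaps, misere=False):
    """The player to move wins if his own remoteness is odd (normal) or even (misere)."""
    r = rapid_join_remoteness(heaps, misere)
    return r % 2 == (0 if misere else 1)
```

In this toy game heaps of 6 and 7 have normal remotenesses 4 and 3, so their
rapid join has remoteness 3 and the player to move wins.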


To play slow joins, in which play continues in the remaining components even 
though some have already ended, use the suspense number (WW, pp. 266-272), 
which has the opposite philosophy to that of remoteness: use cat-and-mouse 
tactics; if you are winning, spin the game out as long as possible; if you are losing, 
aim to get it over quickly. Convert table 10 to give suspense rules by interchanging 
GREATEST and LEAST throughout. The suspense of a SLOW JOIN is then the 


GREATEST SUSPENSE of any component, 


and, to win a slow join, again move to a position where your opponent’s suspense 
number is EVEN or ODD according as play is NORMAL or MISERE. 
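
The interchange of GREATEST and LEAST can be seen in the following Python
sketch of impartial suspense numbers. As before, the toy component (take 1 or
3 beans from a heap) and the function names are assumptions made for
illustration only.

```python
from functools import lru_cache

SUBTRACTION_SET = (1, 3)  # hypothetical example: take 1 or 3 beans from a heap


def options(heap):
    """Heap sizes reachable in one move in the toy component."""
    return [heap - s for s in SUBTRACTION_SET if s <= heap]


@lru_cache(maxsize=None)
def suspense(heap, misere=False):
    """Impartial suspense number: winners spin the game out, losers hurry."""
    opts = [suspense(o, misere) for o in options(heap)]
    if not opts:
        return 0
    wanted = 1 if misere else 0       # EVEN in normal play, ODD in misere play
    good = [s for s in opts if s % 2 == wanted]
    if good:
        return 1 + max(good)          # ADD ONE to the GREATEST wanted-parity suspense
    return 1 + min(opts)              # else ADD ONE to the LEAST of the others


def slow_join_suspense(heaps, misere=False):
    """A slow join lasts until its last component ends: take the GREATEST suspense."""
    return max(suspense(h, misere) for h in heaps)
```

Here a heap of 3 has suspense 3 although its remoteness is only 1: the winner
prefers the longer route through a heap of 2 to taking all three beans at once.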

We illustrate these ideas with an analysis of several variants of the game All the 
King’s Horses, taken, with kind permission of Academic Press and the authors, 
from chapter 9 of WW. This is played as a join: each player moves every horse in 
one or other of the two ways indicated in fig. 28. There can be arbitrarily many 
horses on a square and they are all moved by both players. A player is unable to 
move if there is any one horse he cannot move. Under normal play he then loses, 
but under misére play he wins. Left and Right remotenesses are calculated as 
shown in fig. 29. Table 11 shows the Left and Right remotenesses in (a) normal 
play, and (b) misére play. 

If we play All the King’s Horses as a slow join, so that the last horse to reach
home determines the outcome, we must allow a player to pass for a horse he
cannot move: the game ends when all the horses reach home. Best play is guided
by suspense numbers, shown in table 12.

We can also play this version as an impartial game: each player moves every
horse whenever possible, using any of the four moves shown in fig. 28. Play is
guided by a single suspense number, as in table 13.

Figure 29. How remote is a horse?

Another variant, a compromise between the rapid join and the slow join, is to
allow a player to pass for a horse he cannot move, provided his opponent can
move it. Then, the first horse home determines the outcome, not the first horse
stuck.

In fig. 30, the misère remotenesses are being calculated. Left can move to a
position of Right remoteness 3, so his remoteness is 3 + 1 = 4. Right has no move,
but can pass to the same position with Left to move, so the Right remoteness is
4 + 1 = 5. Table 14 shows the remotenesses for this variant, and table 15 gives the
(single) remotenesses when it is played impartially, each player having up to four
possible moves for each horse.




Table 11
Left and Right remotenesses in (a) normal, and (b) misère play (A = 10, B = 11, C = 12)


(a) First horse stuck loses (b) First horse stuck wins 


Table 12 
Left and Right suspense numbers for a slow join: (a) normal play, (b) misère play


(a) Last horse home wins (b) Last horse home loses. 


26. Unions (WW, pp. 279-306) 


Before we analyze unions, in which you are allowed to move in any positive 
number of components, remind yourself of the distinction between hot and cold 
games, which we illustrated with the hot game of Domineering and the cold game 
of Blue—Red Hackenbush. The following simple example, in which Left and 
Right each have a single option, a number, will suffice: 

If x < y, then the game {x | y} is cold, i.e., a number.

If x > y, then the game {x | y} is hot.

If x = y, then the game {x | y} is tepid, in fact it is x + * or x*.


Table 13
Impartial suspense numbers in (a) normal, and (b) misère play


2.2 «35:3 I § 3 2 3 3 
De 2 ar 2 1 2 2 4 3 
3.93 3 3 242 44 4 
33.5 5 4323 65 
44 5 5 225 4 3 4 
44 5 5 434365 
SS. 852 2S: 463 6 5 6 
5 5 5 6 4 5 45 6 7 
(a) Last horse home wins (b) Last horse home loses


Figure 30. Right is a bit more remote because he is stuck.


Table 14
Left and Right remotenesses for a not so rapid join: (a) with normal play, (b) with misère play
(A = 10, B = 11)


12 45 56 56 56 
3434 «34 6778, 
42 42 56 56 56 
44 54 64 «74°«7 
45 66 16 86 76 
46 67 66 98 AB 
47 68 89 88 BO 


77 67 8A 9B AA 


(a) First horse home wins (b) First horse home loses.

The best strategy in playing a union of several components is to move in all the
hot ones, and in none of the cold. The first phase of the play is like a join of the
hot components. Players will want to continue so long as there is a hot component
left, so it is a slow join. When all components are cold, i.e., numbers, players will
be reluctant to play and will move in only one component, the one where least
harm is done. The union becomes like an ordinary sum.


Table 15
Impartial remotenesses for the not so rapid join: (a) with normal play, (b) with misère play


lf de adh 36 43 1 ft 3 4 3 3 
Eb 2 2) Br 3 2) Si 2? 2: Ae $ 
Too) 3: 23) 3303 222 44 4 
13 3 33 5 2.3 4 5 6 5 
393.4 4 555 ae i ae ae 
3.93 4 4 5 § 45 4 7 67 
2°39 VS “35 465 6 7 6 
3 5 5 S$ 5 6 45 676 7 
(a) First horse home wins (b) First horse home loses


The selective theory of unions combines 
the disjunctive theory of sums with 
the conjunctive theory of slow joins. 


We indicate the state of play by a pair of (Left and Right) tallies (WW, pp.
280-306). A tally consists of a toll, and a timer, written as a subscript. If Left (or
Right) starts, the Left (or Right) toll is the (numerical) value that the component
will acquire when the hot phase is over. The Left (or Right) timer is the number
of moves that the hot phase lasts. Think of the timers as suspense numbers
(though care will be needed in their calculation for tepid games, see later).

To find the Left (or Right) tally for the union of several components, add the 
Left (or Right) tolls and take the greatest of the Left (or Right) timers. 

The best Left option in a component is one with greatest Right toll, and, among
those, one with largest even timer (or the least if they are all odd). The best Right
option is one with least Left toll, and, among those, make the same choice of
timer (largest even, else least odd).
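
The two preceding rules are simple enough to state as code. In the Python
sketch below a tally is represented as a (toll, timer) pair; this
representation and the function names are our own assumptions, introduced only
for illustration.

```python
from fractions import Fraction


def union_tally(tallies):
    """Tally of a union: add the tolls and take the greatest of the timers."""
    toll = sum(Fraction(t) for t, _ in tallies)
    timer = max(timer for _, timer in tallies)
    return toll, timer


def preferred_timer(timers):
    """Largest even timer, or the least timer if they are all odd."""
    evens = [t for t in timers if t % 2 == 0]
    return max(evens) if evens else min(timers)


def best_left_option(right_tallies):
    """Among the Right tallies of Left's options: greatest toll, then preferred timer."""
    toll = max(t for t, _ in right_tallies)
    return toll, preferred_timer([tm for t, tm in right_tallies if t == toll])


def best_right_option(left_tallies):
    """Among the Left tallies of Right's options: least toll, then preferred timer."""
    toll = min(t for t, _ in left_tallies)
    return toll, preferred_timer([tm for t, tm in left_tallies if t == toll])
```

For instance, best_left_option([(1, 1), (1, 3), (0, 2)]) returns (1, 1): the
greatest toll is 1, and since neither timer attached to it is even the least
odd one is chosen.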

To find the tallies for a component, suppose that the best Left option has Right
tally x_a, and the best Right option has Left tally y_b.

If x > y, the game is hot, and the tallies are x_{a+1} y_{b+1}.

If x < y, the game has become cold, with tallies z_0 z_0, where z is the simplest
number (with earliest birthday) in the interval

    x < z < y    if a and b are both even,
    x ≤ z ≤ y    if a and b are both odd,
    x < z ≤ y    if a is even and b is odd,
    x ≤ z < y    if a is odd and b is even.

That is, even timers exclude their tolls, and odd timers admit their tolls, as
candidates for the simplest number.

If x = y, the game is tepid. Try x_{a+1} y_{b+1}: this is right if a + 1, b + 1 are both
odd, but if they are both even, replace them by 0 (the game has become cold). If
just one of a + 1, b + 1 is even, increase the other (if necessary) by just enough to
make it a larger odd number.
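
The three cases can be put together in a short Python sketch. It is only an
illustration: the function names, the (toll, timer) pairs and the use of
Fraction for tolls are our assumptions, and simplest_number relies on the
standard description of the simplest number in an interval (zero if it is
available, otherwise the integer nearest zero, otherwise the fraction with the
smallest power of 2 as denominator).

```python
from fractions import Fraction
from math import ceil


def contains(z, lo, hi, lo_closed, hi_closed):
    """Is z in the interval from lo to hi, with the stated endpoints admitted?"""
    above = z > lo or (lo_closed and z == lo)
    below = z < hi or (hi_closed and z == hi)
    return above and below


def simplest_number(lo, hi, lo_closed, hi_closed):
    """Simplest dyadic rational in a non-empty interval (lo < hi)."""
    lo, hi = Fraction(lo), Fraction(hi)
    if contains(0, lo, hi, lo_closed, hi_closed):
        return Fraction(0)
    if lo < 0:  # interval lies below zero: reflect it
        return -simplest_number(-hi, -lo, hi_closed, lo_closed)
    denom = 1
    while True:  # try denominators 1, 2, 4, 8, ...
        numer = ceil(lo * denom)
        if Fraction(numer, denom) == lo and not lo_closed:
            numer += 1
        z = Fraction(numer, denom)
        if contains(z, lo, hi, lo_closed, hi_closed):
            return z
        denom *= 2


def component_tallies(x, a, y, b):
    """Left and Right tallies from the best options x_a (for Left) and y_b (for Right)."""
    x, y = Fraction(x), Fraction(y)
    if x > y:                                   # hot
        return (x, a + 1), (y, b + 1)
    if x < y:                                   # cold: odd timers admit their tolls
        z = simplest_number(x, y, a % 2 == 1, b % 2 == 1)
        return (z, 0), (z, 0)
    left, right = a + 1, b + 1                  # tepid
    if left % 2 == 0 and right % 2 == 0:
        left = right = 0
    elif left % 2 == 0 and right < left:
        right = left + 1                        # least odd number exceeding left
    elif right % 2 == 0 and left % 2 == 1 and left < right:
        left = right + 1                        # least odd number exceeding right
    return (x, left), (y, right)
```

With best options 1_1 and 1_2, component_tallies(1, 1, 1, 2) gives the tallies
1_2 1_3: one timer is even, and the other is odd and greater.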

There is no room to prove these rules, but we illustrate them with the game
One for Left, Two for Right, Free for All, a modification of ·007 (Treblecross:
knock down three contiguous skittles from a row), made partizan (and more tidy)
by allowing Left to knock down rows with just one skittle, and Right rows of two,
and made selective by allowing players to attack as many rows as they like in a
single move.

In contrast to Treblecross, we can give a complete analysis of this game though 
you are forgiven if you do not immediately see the pattern in table 16. 


Table 16 
Tallies for One for Left, Two for Right, Free for All 
k Row of 
3k +1 3k +2 3k +3 

0 1 -1 0,0, 

I 1,1, 2,-1, 0,0, 
2 1,-2, 2,-1, 0 

3 1,1, 25.13 0,0, 
4 11, 24-1, 0,0, 
5 1,1, ach, 0,0, 
6 1,1, 2,-1, 0.0, 
7 1,1, 2,-1, 0,0, 
8 1,1, 21, 0.0, 
9 1,1, 24 A, 0,0, 
10 1.1, 2,-1, 0,0, 
1 1.1, 2,-1, 0,0, 
12 1.1, 2,—t,y 0,0, 
13 1.1, 2,1, 0,0, 


We calculate the tallies for a row of 16 skittles, which a legal move may reduce
to rows of


1,12 2,11 3,10 4,9 5,8 6,7


v 


tallies (earlier entries in table 16) 


‘4,2, 71, 0,0,,1,1, 1,1,,0 2,~71,,2, -1, 0,9,,1, -2, 
are the tallies with best Right toll from Left’s point of view. They have no even
timer: the least odd is 1. The best Left tolls for Right are


ie ie i ¢ P45 1, tS 


and the greatest even timer is 2. Our rules show that {..1_1 | 1_2..} is tepid and tell
us to try 1_2 1_3. This is correct: although there is an even timer, the other is odd
and greater.

The general pattern for the tolls is easy to see: the only exceptions are the Left 
toll for two skittles and the Right toll for seven. The pattern for the timers is 
complicated, and further obscured by 22 exceptions. Timers increase at three 
different rates, one twice another, the third only as the logarithm (to base 2) of 
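the first. If l_i, r_i are the Left and Right timers for a row of 3k + i skittles, i = 1, 2,
3, then

\[
\begin{aligned}
l_1 &= 2\lfloor (k+3)/6 \rfloor + 1, & k &\ge 10, &\qquad r_1 &= 2\lfloor \log_2 (k+2) \rfloor - 1, & k &\ge 3, \\
l_2 &= 2\lfloor \log_2 (k+4) \rfloor - 2, & k &\ge 2, &\qquad r_2 &= 2\lfloor (k+1)/3 \rfloor, & k &\ge 5, \\
l_3 &= 2\lfloor (k+3)/3 \rfloor - 1, & k &\ge 3, &\qquad r_3 &= 2\lfloor (k+5)/6 \rfloor, & k &\ge 11.
\end{aligned}
\]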


We do not have a misére theory for tardy unions, but urgent unions can be 
dealt with, both for normal and for misére play, by introducing two new kinds of 
move: an overriding move which wins immediately; and a suiciding one which 
loses immediately; and we allow infinite tolls for such options. For precise details 
see WW (pp. 292-306).


27. Conclusion 


We have only scratched the surface of just a few of many aspects of combinatorial 
games. There is much more for the student to learn, and a great deal for the 
researcher yet to discover: a few of the many unsolved problems are listed in the 
Appendix. 


Appendix: Unsolved problems in combinatorial games 


1. Subtraction games are known to be periodic. Investigate the relationship 
between the subtraction set and the length and structure of the period. 
(Subtraction games are played with heaps of beans. A move is to take a 
number of beans from a heap, provided that number is a member of the 
subtraction-set. See Section 16, or WW, pp. 83-86 and 487-498.) 

2. Are all finite octal games ultimately periodic? Resolve any number of
outstanding particular cases, e.g., ·6, ·14, ·36, ·64, ·74, ·76, ·004, ·005, ·006,
·007 (One-dimensional tic-tac-toe, Treblecross), ·016, ·104, ·106, ·114, ·135,
·136, ·142, ·143, ·146, ·162, ·163, ·172, ·324, ·336, ·342, ·362, ·371, ·374, ·404,
·414, ·416, ·444, ·454, ·564, ·604, ·606, ·644, ·744, ·764, ·774, ·776, and
Grundy’s Game (split a heap into two unequal heaps). Find a game with a
period longer than 149459. Explain the structure of the periods of games
known to be periodic. If the binary expansion of the kth code digit in the
game with code d_0·d_1 d_2 d_3 ... is d_k = 2^{a_k} + 2^{b_k} + 2^{c_k} + ..., where
0 ≤ a_k < b_k < c_k < ..., then it is legal to remove k chips from a heap, provided
that the rest of the heap is left in exactly a_k or b_k or c_k or ... non-empty heaps.
(See WW, pp. 81-115 and Gangolli and Plambeck 1989.)

3. Examine some hexadecimal games. Obtain conditions for arithmetic
periodicity. (Hexadecimal games are those with code digits d_k in the interval
from 0 to 15. See WW, pp. 115-116.)

4. Extend the analysis of Domineering to larger boards. For a modest
beginning, find the values of 4 x 4 and 4 x 5 boards. See Berlekamp (1988), WW
(pp. 495-498), and section 9. (Left and Right take turns to place dominoes on
a checker-board. Left orients his dominoes North-South and Right
East-West. Each domino exactly covers two squares of the board and no two
dominoes overlap. A player unable to play loses.)

5. Analyze positions in the game of Go (compare Berlekamp 1988).

6. Is Go-Moku (Five-in-a-Row, Go-Bang, Pegotty) a win for the first player?


7. Complete the analysis of impartial Eatcakes (WW, pp. 269, 271, 276-277).
(Played with a number of rectangles, m_i x n_i; a move is to remove a strip
1 x n_i or m_i x 1 from each rectangle, either splitting it into two rectangles, or
reducing the length or breadth by one. Winner removes the last strip.)

8. Complete the analysis of Hotcakes (WW, pp. 279-282). (Also played with
integer-sided rectangles. Left cuts as many rectangles vertically along an
integer line as she wishes, and then rotates one of each pair of resulting
rectangles through a right angle. Right cuts as many rectangles as he wishes,
horizontally into pairs of integer-sided rectangles and rotates one rectangle
from each pair through a right angle.)


9. Develop a misère theory for unions of partizan games. (In a union of two or
more games, you move in as many component games as you wish. In misère
play, the last player loses.)

10. Extend the analysis of Squares Off (WW, p. 299). (Played with heaps of
beans. Move is to take a perfect square (>1) number of beans from any
number of heaps. Heaps of 0, 1, 2 or 3 cannot be further reduced. A move
leaving a heap of 0 is an overriding win for the player making it. A move
leaving 1 is an overriding win for Right, and one leaving 2 is an overriding
win for Left. A move leaving 3 does not end the game unless all other heaps
are of size 3, in which case the last player wins.)


11. Extend the analysis of Top Entails (WW, pp. 376-377). (Played with stacks
of coins. Either split a stack into two smaller ones, or remove the top coin
from a stack. In the latter case your opponent’s move must use the same
stack. Last player wins. Do not leave a stack of 1 on the board, since your
opponent must take it and win!)

12. Extend the analysis of All Square (WW, p. 385). (Played with heaps of
beans. A move splits a heap into two smaller ones. If both heap sizes are
perfect squares, the player must move again: if he cannot he loses!)

13. Extend the misère analysis of various octal games, e.g., Officers, Dawson’s
Chess, ..., and of Grundy’s Game; see Allemang (1984) and WW (pp.
411-421). William L. Sibert has completed the analysis of misère Kayles; see
Sibert and Conway (1992).

14. Moebius, when played on 18 coins, has a remarkable pattern. Is there any
trace of pattern for larger numbers of coins? Can any estimate be made for
the rate of growth of the nim-values? (See section 20, and WW, pp. 432-435.
Played with a row of coins. A move turns 1, 2, 3, 4 or 5 coins, of which the
rightmost must go from heads to tails. Winner makes all coins tails.)

15. Mogul has an even more striking pattern when played on 24 coins, which has
some echoes when played on 40, 56, or 64 coins. Thereafter, is there
complete chaos? (See references for problem 14. A move turns 1, 2, ..., 7
coins.)

16. Find an analysis of Antonim with four or more coins (WW, pp. 459-462).
(Played with coins on a strip of squares. A move moves a coin from one
square to a smaller-numbered square. Only one coin to a square, except that
square zero can have any number of coins.)


17. Extend the analysis of Kotzig’s Nim (WW, pp. 481-483). Is the game
eventually periodic in the length of the circle for every finite move set?
Analyse the misère version of Kotzig’s Nim. (Players alternately place coins
on a circular strip, at most one coin on a square. Each coin must be placed m
squares clockwise from the previously placed coin, provided m is in the given
move set. Complete analysis is only known for a few small move sets.)

18. Obtain asymptotic estimates for the proportions of N-, O- and P-positions in
Epstein’s Put-or-Take-a-Square game (WW, pp. 484-486). (Played with one
heap of beans. At each turn there are just two options, to add or take away
the largest perfect square number of beans that there is in the heap.)

19. Simon Norton’s game of Tribulations is similar to Epstein’s game but squares
are replaced by triangular numbers. Norton conjectures that there are no
O-positions, and that the N-positions outnumber the P-positions in golden
ratio. This is true up to 5000 beans.

20. Complete the analysis of D.U.D.E.N.E.Y. (Played with a single heap of
beans. Either player may take any number of beans from 1 to Y, except that
the immediately previous move must not be repeated. When you cannot move
you lose. Analysis is easy for Y even, and is known (WW, pp. 487-489) for
53/64 of the odd values of Y.)

21. Schuhstrings is the same as D.U.D.E.N.E.Y., except that deduction of zero is
also allowed, but cannot be immediately repeated (WW, pp. 489-490).
22. Extend the analysis of The Princess and the Roses (WW, pp. 490-494).
(Played with heaps of beans. Take one bean, or two beans, one from each of
two heaps.)

23. Extend the analysis of Conway’s and Paterson’s game of Sprouts with seven
or more spots, or the misère form with five or more spots (WW, pp. 564-568).
(A move joins two spots, or a spot to itself by a curve which does not meet any
other spot or previously drawn curve. When a curve is drawn, a new spot must be
placed on it. The valence of any spot must not exceed three.)

24. Extend the analysis of Sylver Coinage (WW, pp. 575-597). (Players alter- 
nately name different positive integers, but may not name a number which is 
the sum of previously named ones, with repetitions allowed. Whoever names 
1 loses.) 

25. Extend the analysis of Chomp (WW, pp. 598-599). (Players alternately name 
divisors of N, which may not be multiples of previously named numbers. 
Whoever names 1 loses.) 

26. Extend Uléhla’s or Berlekamp’s analysis of von Neumann’s game from 
diforests to directed acyclic graphs (WW, pp. 570-572, Uléhla 1980). 
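
The code-digit rule quoted in problem 2 translates directly into a computation
of nim-values, and experiments of this kind are the natural starting point for
the periodicity questions in problems 1 and 2. The following Python sketch is
an illustration only: the function name is ours, the code is passed as the
string of digits after the dot, and we assume d_0 = 0 and digits no larger than
7 (so that a move leaves at most two heaps).

```python
def octal_nim_values(code_digits, limit):
    """Nim-values G(0), ..., G(limit) of the octal game with code ·d_1 d_2 d_3 ..."""
    digits = [int(c) for c in code_digits]
    values = [0] * (limit + 1)
    for n in range(1, limit + 1):
        reachable = set()
        for k in range(1, min(n, len(digits)) + 1):
            d = digits[k - 1]
            rest = n - k                       # chips left after removing k
            if d & 1 and rest == 0:            # 2^0: rest left as no heap at all
                reachable.add(0)
            if d & 2 and rest >= 1:            # 2^1: rest left as one heap
                reachable.add(values[rest])
            if d & 4 and rest >= 2:            # 2^2: rest split into two heaps
                for i in range(1, rest // 2 + 1):
                    reachable.add(values[i] ^ values[rest - i])
        g = 0
        while g in reachable:                  # least value not reachable (mex)
            g += 1
        values[n] = g
    return values
```

For example, octal_nim_values("77", 7) gives the first nim-values of Kayles,
and a subtraction game in which 1 or 2 beans may be taken is the octal game
·33, whose values 0, 1, 2 repeat with period 3.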


Note added in proof 


The subject of combinatorial games is a young one, and rapid advances are being 
made. Since this chapter was first drafted, an A.M.S. Short Course was held in 
Columbus OH in August 1990, and an M.S.R.I. Workshop in Berkeley CA in 
July 1994. Serious students of the subject should consult 


Combinatorial Games, Proc. Symp. Appl. Math. 43 (1991), Amer. Math. Soc., 
Providence R.I. 


and the Proceedings of the Workshop, also to be published by the A.M.S. They
should also know that Aviezri S. Fraenkel maintains an up-to-date bibliography
which is obtainable from him at The Weizmann Institute of Science, Rehovot
76100, Israel.

We list some recent advances. David Wolfe has found the values of 4 x 4
and 4 x 5 Domineering boards. He and Berlekamp have made significant
progress with the analysis of Go endgames: see Mathematical Go: Chilling Gets 
the Last Point, A.K. Peters, 1994. Allis, van den Herik and Huntjens have 
shown that Go-Moku is a win for the first player. Julian West found loony 
positions of 2403 coins, 2505 coins, and 33243 coins in the game of Top 
Entails. Thane Plambeck has applied Sibert’s method to obtain the misére 
analysis of some more octal games. Fraenkel, Jaffray, Kotzig and Sabidussi 
have a paper on Kotzig’s Nim. Marc Wallace and Alan Jaffray have made 
progress with the game D.U.D.E.N.E.Y. Daniel Sleator has pushed the normal 
analysis of Sprouts to 10 spots and the misére analysis to 8. For references, see 
Fraenkel’s Bibliography. 
Introduction


The history and development of combinatorics cannot be covered completely in a 
single chapter, partly because the explosive growth of the subject in the last forty 
years makes it impossible to give a definitive account of recent developments. 
Fortunately, the other chapters of this Handbook contain many historical remarks 
and from them a fairly clear idea of the main events in the recent history of the 
subject can be pieced together. Our aim here is to survey the field, tracing the 
development of major themes from the earliest times, and showing how current 
research has evolved from older problems. Inevitably some topics have been
partly or completely overlooked, and some mathematicians have been slighted by 
the omission of their contributions. Nevertheless, it is hoped that an overview of 
the entire subject, from a historical point of view, will add some new insights to 
the story so comprehensively described in the rest of this Handbook. 


1. Combinatorics in antiquity 


It is strange that there is almost no material relevant to combinatorics in the 
literature of the classical Western civilizations. All the evidence points to the fact 
that the originators of the subject came from the East. The Chinese have a minor 
claim, through their interest in magic squares, but the main stimulus came from 
the Hindus. 

The study of ancient Hindu texts is a difficult subject. In many cases it is 
impossible to assign firm dates, or to separate the original text from later 
commentaries and embellishments, and consequently some modern Indian 
historians have made exaggerated claims for the priority of the Hindus in 
developing various branches of higher mathematics. However, it does seem clear
that the basic ideas of choosing and arranging were so intimately related to Hindu 
culture that a gradual mathematical development of these topics was inevitable. 
For example, the formula

\[ n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 \]

for the number of permutations of an n-set, and the formula

\[ \frac{n(n-1) \cdots (n-k+1)}{k(k-1) \cdots 2 \cdot 1} \]


for the number of k-subsets of an n-set, were known to Bhaskara around 1150 and 
probably to earlier mathematicians such as Brahmagupta (sixth century). Special 
cases of these formulae may be found in texts dating back to the second century 
BC; further details are given by Biggs (1979). 

The magic square of order three, see fig. 1, may be reliably traced to Chinese
writings of the first century AD, but claims that it was known in 2200 BC are
unjustified. Its compelling fascination in times when even the simplest arithmetic
was a matter for wonderment can easily be imagined. There is no evidence that
the Chinese made any further progress in the general study of magic squares until
the period 900 to 1300 AD. During this time, the scholars of both China and
Islam made extensive studies of magic squares, and were able to construct squares
of any given order by a variety of methods. During the 13th century in particular,
there was exchange of knowledge between these two cultures; this is exemplified
by the discovery in 1956, on the outskirts of the Chinese city of Xi’an, of some
iron tablets bearing 6 x 6 magic squares inscribed in East Arabic numerals [see Li
and Du (1987) for an illustration of one of them]. The mystical overtones
persisted, and survived the transmission of this knowledge to the West, through
the Byzantine Greek mathematician Moschopolous, about 1315.

[Figure 1: the magic square of order three.]

Another fascinating combinatorial object which seems to have filtered
westwards around this time is the triangle of binomial numbers, see fig. 2. These
numbers occur naturally in two contexts: as the number of subsets of a set (known
to the Hindus, as already mentioned), and as the binomial coefficients which
occur in the Hindu method for the extraction of roots. Thus the triangle itself may
well have been known to the Hindus, although the earliest definite instances occur
in 13th century works. Hughes (1989) has pointed out that Jordanus de Nemore
(fl. AD 1225) discussed its construction and use in his De Arithmetica (Book IX,
Proposition 70). It also occurs in Arabic works, such as that of al-Tusi in 1265
(Ahmedev and Rosenfeld 1963). The same arrangement appears in Chinese texts
around 1300, some of which indicate that it derives from writings (now lost) of Jia
Xian (= Chia Hsien) circa 1100 (see Li and Du 1987).

Pascal’s famous Traité (1665) on what has come to be known as Pascal’s
triangle was thus by no means the earliest work on the subject; a detailed history
of the triangle is given by Edwards (1987). Pascal’s treatise is distinguished by the
fact that it gives a “modern” deductive treatment of a range of topics related to
the binomial numbers, and uses a form of the principle of mathematical induction.
One notable stimulus for Pascal’s work was the problem of predicting results of
games of chance. Indeed, such problems provide an important link between the
scholars of mediaeval times and modern mathematicians, and this link is
especially relevant in the field of combinatorics. An excellent account may be
found in a book by David (1962).


        1
       1 1
      1 2 1
     1 3 3 1
    1 4 6 4 1


Figure 2.


2. Origins of modern combinatorics 


Pascal’s Traité (1665) may be said to mark the beginning of combinatorics as we
know it today. To a lesser extent Leibniz should also be given some credit for
originating several of the main strands of the subject. His Dissertatio de Arte
Combinatoria (Leibniz 1666) deals only marginally with combinatorics, and is
based on work of Lull and others (Knobloch 1979). But we owe to him the
suggestion (in a letter to Johann Bernoulli) that it would be rewarding to study
the partitions of integers, and although he published little on this subject there are
many unpublished manuscripts of his which deal with it (Knobloch 1974). This
question was later taken up by Euler, who used some of Leibniz’s ideas; another
letter of Leibniz, written to Huygens in 1679, contains a rather vague reference to
a “geometry of position”. When Euler solved the problem of the Konigsberg
bridges (section 5), his friend Ehler (see Sachs et al. 1988) pointed out to him that
his work was relevant to the ideas of Leibniz, and Euler subsequently mentioned
this in his paper (Euler 1736). In 1833, Gauss referred to the geometry of position
as a neglected subject, to which only Euler and Vandermonde had given a “feeble
glance” (Gauss 1867, p. 605).

It was during the 17th century that advances in algebraic notation led to a 
clearer understanding of a fundamental link between algebra and combinatorics: 
the observation that the expansion and collection of terms in a product of 
algebraic expressions corresponds to the listing of combinatorial objects of a 
certain kind. (For example, as already noted in section 1, the binomial expansion 
can be interpreted as a rule for finding the number of ways of choosing k objects 
from a set of size n.) This idea was known to Pascal and Leibniz, and a version of 
it has been ascribed to the English mathematician Harriot, around 1600. De 
Moivre (1697) carried the idea a step further when he proved the multinomial 
theorem, giving the rule for finding the coefficients in the expansion of

\[ (x_1 + x_2 + \cdots + x_k)^n. \]

Another of his discoveries was a form of the principle of inclusion and exclusion,
which he used (de Moivre 1718) to derive the formula

\[ D_n = n! \sum_{r=0}^{n} \frac{(-1)^r}{r!} \]

for the number D_n of derangements of n objects. This result had been obtained
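
(The formula is easily checked for small n. The following Python sketch, whose
function names are our own and purely illustrative, compares it with a direct
count of the permutations that leave no object in its original place.)

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def derangements_formula(n):
    """D_n = n! * sum_{r=0}^{n} (-1)^r / r!, evaluated exactly."""
    total = sum(Fraction((-1) ** r, factorial(r)) for r in range(n + 1))
    return int(factorial(n) * total)


def derangements_brute(n):
    """Count the permutations of n objects with no fixed point."""
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))
```

(Both functions give 1, 0, 1, 2, 9, 44, 265 for n = 0, 1, ..., 6.)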
previously in other ways by Nikolaus Bernoulli and Montmort; see Takacs (1981) 
for a detailed discussion. An account of the life and work of de Moivre is given by 
Schneider (1968-9). 

Of course, the contributions of Euler overshadow everything else in the 18th 
century. His seminal ideas in graph theory will be noted in section 5, but he also 
worked in the areas of combinatorics related to partitions and latin squares. In the
first of these areas he made remarkable progress, using the algebraic technique
mentioned in the previous paragraph. In his book (Euler 1748) he considered the
product

$$(1 + x^{\alpha} z)(1 + x^{\beta} z)(1 + x^{\gamma} z)(1 + x^{\delta} z) \cdots,$$

where $A = \{\alpha, \beta, \gamma, \delta, \ldots\}$ is a set of distinct positive integers. Each choice of m
integers from the set A summing to n is a partition of n into m distinct parts, each
of which is in A. Hence the coefficient of $x^n z^m$ in the product is just the number
of such partitions. Similarly, if $\alpha$ may occur more than once as a part, the factor
$(1 + x^{\alpha} z)$ is replaced by $(1 - x^{\alpha} z)^{-1}$, since

$$(1 - x^{\alpha} z)^{-1} = 1 + x^{\alpha} z + (x^{\alpha})^2 z^2 + (x^{\alpha})^3 z^3 + \cdots,$$

and the term $(x^{\alpha})^r z^r$ corresponds to r occurrences of $\alpha$ in the partition. In
particular, this leads to the formula now known as the partition-generating
function:

$$\sum_{n=0}^{\infty} p(n) x^n = \prod_{n=1}^{\infty} (1 - x^n)^{-1},$$
where p(n) is the total number of partitions of n (see section 4). In this way Euler 
obtained many interesting identities concerning infinite products and infinite 
series; another of his examples is given in section 4. He also studied the 
relationship between partitions and symmetric functions. 
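
Euler's product technique translates directly into a computation on truncated power series. The sketch below is a modern illustration under stated assumptions (the function names and the dictionary representation of a two-variable series are choices made here, not Euler's): it expands the infinite product for p(n) up to a chosen degree, and expands the finite product of factors (1 + x^a z) to count partitions into distinct parts taken from a given set.

```python
def partition_counts(n_max: int) -> list[int]:
    """Coefficients p(0..n_max) of prod_{k>=1} (1 - x^k)^{-1}, truncated at x^n_max."""
    p = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        # Multiplying by (1 - x^k)^{-1} = 1 + x^k + x^{2k} + ... is a running sum.
        for n in range(k, n_max + 1):
            p[n] += p[n - k]
    return p


def distinct_part_counts(parts: list[int], n_max: int) -> dict[tuple[int, int], int]:
    """Coefficient of x^n z^m in prod_{a in parts} (1 + x^a z), for n <= n_max."""
    coeffs = {(0, 0): 1}
    for a in parts:
        updated = dict(coeffs)
        for (n, m), c in coeffs.items():
            if n + a <= n_max:
                key = (n + a, m + 1)
                updated[key] = updated.get(key, 0) + c
        coeffs = updated
    return coeffs


print(partition_counts(10))
# prints [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42]

c = distinct_part_counts([1, 2, 3, 4, 5], 10)
print(c[(6, 2)], c[(6, 3)])
# prints 2 1  (6 = 1+5 = 2+4, and 6 = 1+2+3)
```

The exponent of z records the number of parts, exactly as in Euler's product; dropping z (as in `partition_counts`) gives the total number of partitions.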

Euler’s interest in latin squares was more transient. In a famous paper (Euler 
1782), he posed the following problem: 


“If there are 36 officers, one of each of six ranks from each of six 
different regiments, can they be arranged in a square in such a way 
that each row and column contains exactly one officer of each rank 
and one from each regiment?” 


In modern terminology the problem is concerned with the existence of a pair of
orthogonal latin squares of order 6 (see section 6). Euler was unable to find a
solution and he conjectured that no solution exists, not only in the case n = 6 but
generally when n ≡ 2 (mod 4). A solution for n = 4 corresponds to a well-known
arrangement of the sixteen “court” cards in a standard pack of playing cards, and 
had been published much earlier, in, for example, Ozanam’s Mathematical 
Recreations (1725). Euler generalized this arrangement and constructed solutions 
for many other values of n. (See section 6 for later progress on this topic.) 

The formal methods which Euler introduced in the study of partitions were
taken further by Hindenburg (1796) and his collaborators. They used a notation for
dealing with symmetric functions and related topics which was so complicated that 
their work has not been much studied by later scholars. Around the same time, 
practical mathematicians began to use combinatorial ideas in everyday problems. 
For example, Peter Nicholson (architect, carpenter, builder and “private teacher 
of the mathematics”) published 250 pages of Essays on the Combinatorial
Analysis (1818), and this is probably the first book in English on the subject. 

A more profound influence on the mathematical development of combinatorics 
was the study of permutations and their algebraic properties. In describing the 
properties of what are now called groups of permutations, Lagrange, Galois and 
Cauchy opened up the way for the eventual integration of the subject into the 
mainstream of modern mathematics. For example, Cauchy (1815) was probably 
the first to prove formally that exactly half of the permutations of n objects can be 
expressed as the product of an even number of transpositions. But special cases of 
this result had been known to English church bellringers for well over a century 
before that, because the curious rules governing the ringing of changes inevitably 
lead to a study of transpositions (White 1983). 


3. Formal methods of enumeration 


Problems of enumeration go back to antiquity, but many were solved by ad hoc 
methods (if they were solved at all), and it was not until about the end of the 17th 
century that systematic methods of solution began to be developed. In his work 
on the multinomial theorem (see section 2), de Moivre (1697) also discussed the 
reversion (compositional inversion) of series (see below). Later (de Moivre 1718), 
he used generating functions to solve what are now called homogeneous difference 
equations with constant coefficients, i.e., relations of the form 


$$a_n = \alpha_1 a_{n-1} + \alpha_2 a_{n-2} + \cdots + \alpha_r a_{n-r},$$

where the $\alpha_i$ are constants. The term recurrence is also due to de Moivre (1722),
and to him must be ascribed the first general study of the subject. Of course, 
special cases had been studied before, for example by Leonardo of Pisa 
(Fibonacci) in his Liber Abaci (1202). Fibonacci’s problem concerned the 
breeding of pairs of rabbits, and this led him to a sequence in which each 
successive term is the sum of the two previous terms. He remarked that this rule 
enables the sequence to be continued indefinitely but, not surprisingly, he did not 
give an expression for the general term in the sequence. The terms in the 
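sequence are now called Fibonacci numbers. If the nth term is denoted by $f_n$, the
rules for forming the terms are (in modern notation)

$$f_1 = 1, \quad f_2 = 1, \quad f_n = f_{n-1} + f_{n-2}, \quad n \geq 3.$$

De Moivre (1730) gave the explicit solution

$$f_n = \left\{ \left( \tfrac{1}{2} + \tfrac{1}{2}\sqrt{5} \right)^n - \left( \tfrac{1}{2} - \tfrac{1}{2}\sqrt{5} \right)^n \right\} / \sqrt{5}.$$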
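
The agreement between the recurrence and the explicit solution can be checked without any rounding error. In the sketch below (an illustration; the pair representation of numbers p + q√5 and the function names are assumptions made here), powers of ½ + ½√5 are computed exactly with rational coefficients, and the closed form is compared with the terms produced by the recurrence.

```python
from fractions import Fraction


def fib_recurrence(n: int) -> int:
    """f_1 = f_2 = 1, f_n = f_{n-1} + f_{n-2} for n >= 3."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def fib_closed_form(n: int) -> int:
    """Evaluate {(1/2 + sqrt5/2)^n - (1/2 - sqrt5/2)^n} / sqrt5 exactly.

    Numbers p + q*sqrt(5) are stored as pairs (p, q) of Fractions.
    """
    half = Fraction(1, 2)
    p, q = Fraction(1), Fraction(0)          # (1/2 + sqrt5/2)^0
    for _ in range(n):
        # (p + q*sqrt5) * (1/2 + sqrt5/2) = (p + 5q)/2 + ((p + q)/2)*sqrt5
        p, q = half * (p + 5 * q), half * (p + q)
    # The conjugate power is p - q*sqrt5, so the difference is 2*q*sqrt5.
    value = 2 * q
    assert value.denominator == 1
    return int(value)


print([fib_recurrence(n) for n in range(1, 11)])
# prints [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
assert all(fib_recurrence(n) == fib_closed_form(n) for n in range(1, 40))
```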


The use of generating functions in enumeration requires the manipulation of 
power series, and many results in the 18th century were obtained by purely formal 
methods. It was taken for granted that a formal power series defines a function, 
and the series were manipulated without regard to questions of convergence. 
Some of the techniques are still remembered today, but others have now been
largely forgotten, being swamped in the wake of classical analysis as it gathered 
momentum in the 19th century. For example, a problem of great importance in 
manipulating series is that of reverting power series, that is, given the equation 


$$z = t + V_2 t^2 + V_3 t^3 + V_4 t^4 + \cdots,$$

it is required to express t in the form

$$t = z + W_2 z^2 + W_3 z^3 + W_4 z^4 + \cdots.$$


There are several ways to do this; de Moivre (1698) obtained results on this 
problem and there is also much interesting material in Arbogast’s book (1800). 
The latter’s material was used by some later mathematicians, including Cayley
and Glaisher, but it is barely remembered today. One method which is still used is
the Lagrange inversion formula (Lagrange 1770) which states that if the
coefficients $U_{n,k}$ are defined by

$$(1 + V_2 t + V_3 t^2 + \cdots)^{-n} = U_{n,0} + U_{n,1} t + U_{n,2} t^2 + \cdots,$$

then the coefficients $W_n$ are related to them by

$$W_n = U_{n,n-1} / n.$$
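
The Lagrange inversion formula can be tried out on truncated power series. The following sketch is illustrative only: the truncation order N, the list representation of a series and all function names are assumptions made for this example. It reverts z = t + t^2 by the rule W_n = U_{n,n-1}/n and then substitutes the result back to confirm that the composition reduces to z.

```python
from fractions import Fraction

N = 7  # work modulo t^N


def mul(a, b):
    """Product of two power series (coefficient lists), truncated at degree N - 1."""
    out = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out


def inverse(a):
    """Reciprocal of a power series with constant term 1."""
    inv = [Fraction(1)] + [Fraction(0)] * (N - 1)
    for n in range(1, N):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv


def power(a, n):
    """n-th power of a power series, by repeated multiplication."""
    out = [Fraction(1)] + [Fraction(0)] * (N - 1)
    for _ in range(n):
        out = mul(out, a)
    return out


def revert(V):
    """Given z = t + V_2 t^2 + ..., return [W_1, W_2, ...] with t = z + W_2 z^2 + ...

    V is the list [1, V_2, V_3, ...], i.e. the coefficients of z / t.
    """
    g_inv = inverse(V)                       # (1 + V_2 t + V_3 t^2 + ...)^{-1}
    W = []
    for n in range(1, N):
        U_n = power(g_inv, n)                # U_{n,0} + U_{n,1} t + ...
        W.append(U_n[n - 1] / n)             # W_n = U_{n,n-1} / n
    return W


def compose_check(V, W):
    """Substitute t = sum W_n z^n into t + V_2 t^2 + ... and return the series in z."""
    t = [Fraction(0)] + W
    total = [Fraction(0)] * N
    t_power = [Fraction(1)] + [Fraction(0)] * (N - 1)
    for k in range(N - 1):
        t_power = mul(t_power, t)            # t^(k+1)
        total = [x + V[k] * y for x, y in zip(total, t_power)]
    return total


V = [Fraction(c) for c in (1, 1, 0, 0, 0, 0, 0)]   # z = t + t^2
W = revert(V)
print([int(w) for w in W])
# prints [1, -1, 2, -5, 14, -42]
assert compose_check(V, W) == [0, 1, 0, 0, 0, 0, 0]
```

For this particular series the reverted coefficients are the Catalan numbers with alternating signs, which agrees with solving the quadratic t^2 + t - z = 0 directly.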

In a series of papers in the 19th century, Blissard (a country clergyman) used 
techniques quite alien to the thinking of analysts. These methods included 
expanding series and then at suitable points replacing powers by subscripts; 
similar techniques were also used by Lucas (1877). The Blissard (or umbral) 
calculus, as it was later called, was largely ignored except by a few devotees, 
including Bell and Riordan, but in recent times Rota and his school have been 
putting the subject on a more rigorous basis; see, for example, Roman (1984).
Bell (1938) gave an account of the method, together with details of (and
references to) Blissard’s life and work.

The functions now known as permanents first appeared in the literature in 
papers of Binet (1812) and Cauchy (1812); a detailed account of the subject, 
including its history, is given by Minc (1978, 1983). Many of the early papers 
involved identities between determinants and permanents, but eventually it was 
realized that a number of enumeration problems can be stated in terms of
evaluating the permanents of various (0, 1)-matrices. For example, if
$F = (A_1, \ldots, A_n)$ is a finite family of subsets of the finite set $S = \{s_1, \ldots, s_n\}$, then
the number of transversals of F (see section 7) is equal to the permanent of the
incidence matrix $P = (p_{ij})$ of the system, where $p_{ij} = 1$ if $s_i \in A_j$, and $p_{ij} = 0$
otherwise. Unfortunately, the evaluation of permanents is a hard problem (see
chapter 29), so some of this work is only of theoretical interest.
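
The link between transversals and permanents is easy to test on a small family. The sketch below is an illustration under stated assumptions: the example family and the function names are hypothetical, and the permanent is computed by the naive expansion over all permutations, which is only practical for small matrices. A transversal is counted here as a choice of distinct representatives, one from each set.

```python
from itertools import permutations, product


def permanent(P):
    """Permanent of a square matrix: the determinant's expansion without signs."""
    n = len(P)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= P[i][sigma[i]]
        total += term
    return total


def incidence_matrix(S, family):
    """p_ij = 1 if s_i is in A_j, and 0 otherwise."""
    return [[1 if s in A else 0 for A in family] for s in S]


def count_distinct_representatives(family):
    """Count the choices of one element from each set with all choices distinct."""
    return sum(len(set(choice)) == len(choice) for choice in product(*family))


S = [1, 2, 3]
family = [{1, 2}, {2, 3}, {1, 3}]
print(permanent(incidence_matrix(S, family)), count_distinct_representatives(family))
# prints 2 2
```

The naive expansion has n! terms, which is consistent with the remark that evaluating permanents is hard in general.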

Another type of problem which can be expressed in terms of permanents is that 
in which it is desired to find the number of rearrangements of an ordered set of
elements, subject to certain specified restrictions as to which elements may go in
which positions. Special cases of such problems may be solved without reference
to permanents, and include the derangement problem (problème des rencontres)
mentioned in section 2, and the problème des ménages. The history of the latter
problem is discussed by Dutka (1986). As mentioned in section 2, one technique
for solving such problems is the principle of inclusion and exclusion, which is an
example of a sieve method: a technique in which the elements of a set larger than
the one of interest are first listed or counted, and then various subsets are either
deleted or added until the set required, or the number of elements in it, is
obtained. One such method is used in number theory and employs a Möbius
function defined on the positive integers. In the mid-1930s the concept of a
Möbius function was extended to other posets independently by Weisner and P.
Hall and, a few years later, generalized by Ward. The idea was taken up
enthusiastically by Rota (1964); see chapter 21.

In the late 19th and early 20th centuries, major contributions were made to 
enumeration by MacMahon. There is insufficient room here to describe all his
contributions but further details may be found in his books (MacMahon 1915-16)
and collected papers (MacMahon 1978/86) as well as in the commentaries
provided by Andrews in the latter works. One result which deserves mention,
however, is the master theorem.


MacMahon’s Master Theorem. Let $x = (x_1, x_2, \ldots, x_n)^T$ and $y = (y_1, y_2, \ldots, y_n)^T$
be column vectors connected by the matrix equation y = Ax, where A is an n × n
matrix. Also let $\Delta = \det(I - AX)$, where X is the diagonal matrix
$\mathrm{diag}(x_1, x_2, \ldots, x_n)$. Then the coefficient of

$$x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$$

in the expansion of

$$y_1^{k_1} y_2^{k_2} \cdots y_n^{k_n}$$

is equal to the coefficient of

$$x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$$

in the expansion of $\Delta^{-1}$.


MacMahon applied this theorem to various problems such as counting permutations
of multisets in which no element remains in its original position; see also
Cartier and Foata (1969).

There are many enumeration problems in which the equivalence of structures is 
defined in terms of the action of a finite group of symmetries. The history of the 
following orbit-counting theorem, sometimes incorrectly known as Burnside’s
lemma or, more appropriately, as the Cauchy–Frobenius lemma, is discussed in
some detail by Neumann (1979). The theorem lies at the basis of many results on 
enumeration under group action. 
Orbit-Counting Theorem. If the elements of a finite group G act as permutations of
the elements of a finite set D, then the number of orbits under the action is given by

$$\frac{1}{|G|} \sum_{g \in G} |\mathrm{fix}(g)|,$$

where fix(g) is the subset of elements in D which are fixed by the action of g, and
|X| denotes the number of elements in X.
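
A small instance shows the theorem at work. In the sketch below (illustrative; the choice of the rotation group acting on 2-colourings of six positions, and all function names, are assumptions made for the example), the number of orbits is computed once from the average number of fixed colourings and once by collecting the orbits directly.

```python
from itertools import product


def rotations(n):
    """The cyclic group of order n, each element given as a permutation of positions."""
    return [tuple((i + s) % n for i in range(n)) for s in range(n)]


def act(g, colouring):
    """Permute the positions of a colouring by the group element g."""
    return tuple(colouring[g[i]] for i in range(len(g)))


def orbits_by_theorem(group, D):
    """(1/|G|) * sum over g of |fix(g)|."""
    fixed_total = sum(sum(act(g, d) == d for d in D) for g in group)
    assert fixed_total % len(group) == 0
    return fixed_total // len(group)


def orbits_by_listing(group, D):
    """Count orbits directly by collecting each element's orbit."""
    seen, count = set(), 0
    for d in D:
        if d not in seen:
            count += 1
            seen.update(act(g, d) for g in group)
    return count


D = list(product(range(2), repeat=6))       # 2-colourings of 6 positions
G = rotations(6)
print(orbits_by_theorem(G, D), orbits_by_listing(G, D))
# prints 14 14
```

Here the six rotations fix 64, 2, 4, 8, 4 and 2 colourings respectively, and (64 + 2 + 4 + 8 + 4 + 2)/6 = 14.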


The theorem was used by Redfield, who, inspired by some of MacMahon’s 
results, did important work in this area. Unfortunately, his remarkable paper 
(Redfield 1927) remained almost unnoticed until about 1960, but since then it has 
come to be regarded as an outstanding contribution to the subject. Redfield
introduced some polynomials which he called group reduction functions, and
performed various operations upon them in order to solve a number of counting
problems. He stated a theorem, now known as the Redfield–Read superposition
theorem [since it was discovered independently by Read (1959)], which can be
used in a great variety of enumeration problems. In the 1980s, valuable
unpublished work by Redfield came to light; see Lloyd (1988) for further
information. Biographical details of Redfield are given by Lloyd (1984).

In the late 19th century, both chemists and mathematicians began to study 
problems of counting chemical isomers (see chapter 38). In particular, Cayley, in 
his work on trees (see section 5) considered the alkane series $C_n H_{2n+2}$. The early
methods for isomer enumeration were rather cumbersome and many errors 
appeared in the early work, but in the 1920s mathematical tools were developed 
which led to significant progress. Had it been read and understood at the time, 
Redfield’s paper (1927) might well have been a major influence in this area, but 
only in the 1980s did chemists begin to realize the usefulness of his methods. 
More influential were the works of Lunn and Senior (1929) and Pólya. Pólya’s
ideas were set out in a number of papers in the mid-1930s, culminating in a
lengthy and famous paper (Pólya 1937) translated into English half a century later
(Pólya and Read 1987). Pólya combined the use of generating functions with the
Orbit-Counting Theorem and he applied his methods to counting various graphs
and chemical compounds. Some of his ideas had already appeared in Redfield’s
paper (in particular, Pólya’s cycle index is Redfield’s group reduction function),
and these ideas were further developed by de Bruijn. A detailed discussion of the
interrelationships of work in this area is given by Read (1968); see also Read’s
chapter in Pólya and Read (1987).

The present state of the art of enumeration is described in chapters 21 and 22. 


4. Partitions and symmetric functions 


After Euler’s work on partitions and that of Hindenburg and his colleagues, both 
mentioned in section 2, there was little further progress until the 1840s, when a 
number of mathematicians began to look at the subject. In particular, Warburton 
sought a method for determining the number of partitions of a given number, and
he communicated some of his results to De Morgan. The latter presented an
account of Warburton’s work to the Royal Society in 1847 and, soon afterwards,
Warburton himself published a paper (1842-9) on the subject. Although the
paper did not contain outstanding results, according to MacMahon (1896-7) it did
have the effect of bringing the subject to the attention of mathematicians such as
Herschel (1850). There are numerous recurrence relations between the numbers
$P_r(n)$ of partitions of n into r parts, and De Morgan, Warburton and Herschel
attempted to solve such relations. Herschel expressed $P_r(n)$ in terms of certain
functions known as circulating functions.

Circulating functions were also used by Cayley and Sylvester, who (unlike 
Herschel) started with generating functions and sought an expression for the 
coefficient of the general term. Sylvester’s work on partitions spanned many 
years, during which time he was often sidetracked by other researches, but by 
making use of Cauchy’s work on the theory of residues, he obtained and 
published an expression for the coefficient of $x^n$ in an arbitrary rational function
(Sylvester 1855-7), although it was many years before he published a proof. This
work was described by MacMahon (1896-7) as “incomparably the finest contribution
that has ever been made to combinatory analysis”. Glaisher (1875, 1909) also
worked on partitions rather intermittently, but his approach was rather different 
from earlier writers, and he made use of the methods of Arbogast (1800) for 
calculating coefficients in series expansions. 

In contrast to the analytical methods of Sylvester, elegant proofs were obtained 
by using diagrams first published by Sylvester, but attributed by him to Ferrers. If 
the parts $\lambda_i$ of a partition are arranged in non-increasing order
($\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k > 0$), then the Ferrers diagram is obtained by
placing $\lambda_r$ marks, such as dots or ones, in the rth row of the diagram. For
example, the diagram for the partition 3+3+2+1 is shown in fig. 3.

A rather remarkable result on partitions states that 


$$\prod_{n=1}^{\infty} (1 - x^n) = \sum_{n=-\infty}^{\infty} (-1)^n x^{n(3n+1)/2}.$$


This is equivalent to the statement that except for numbers m of the form
$\tfrac{1}{2}n(3n + 1)$, with n an integer, the partitions of m into an even number of
distinct parts are equinumerous with those into an odd number of distinct parts,
and in the exceptional cases the two numbers differ by 1. This result was observed
by Euler at an early stage in his work on partitions, but he was unable to prove it
until some years later (Euler 1751). Franklin (1881) gave a very neat proof of it
using Ferrers diagrams: his proof involves moving some of the dots in the diagrams
so that an explicit bijection is obtained between certain classes of partitions.
Durfee (1882-3) also used Ferrers diagrams: he expressed each diagram as the union
of the largest square of dots in the diagram, now known as the Durfee square, and
two other partitions. This idea enables combinatorial proofs to be given of certain
partition-generating function identities.

    • • •
    • • •
    • •
    •

Figure 3.
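
Euler's identity can be checked numerically in its combinatorial form, where the partitions in question are those into distinct parts. The sketch below is an illustration (the function name and the range of m tested are choices made here): it counts partitions of m into an even and into an odd number of distinct parts, and compares the difference with the coefficients predicted by the numbers n(3n + 1)/2.

```python
def distinct_partitions_by_parity(m_max):
    """even[m], odd[m]: partitions of m into an even / odd number of distinct parts."""
    even = [1] + [0] * m_max   # the empty partition of 0 has 0 parts
    odd = [0] * (m_max + 1)
    for part in range(1, m_max + 1):
        # Go downwards so that each part is used at most once.
        for m in range(m_max, part - 1, -1):
            even[m], odd[m] = even[m] + odd[m - part], odd[m] + even[m - part]
    return even, odd


m_max = 40
even, odd = distinct_partitions_by_parity(m_max)

expected = [0] * (m_max + 1)
for n in range(-6, 7):
    m = n * (3 * n + 1) // 2            # 0, 2, 1, 7, 5, 15, 12, ...
    if m <= m_max:
        expected[m] = 1 if n % 2 == 0 else -1

assert [e - o for e, o in zip(even, odd)] == expected
print(even[5], odd[5])
# prints 2 1  (4+1 and 3+2 against 5)
```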

The number p(n) of partitions of n increases rapidly with n. Values were
calculated by Euler to at least n = 69, but a longer table, going as far as
p(200) = 3 972 999 029 388, was calculated by MacMahon and included in the truly
remarkable paper of Hardy and Ramanujan (1918). In that paper, the authors
obtained an almost unbelievable result which expresses p(n) as the nearest integer
to an expression involving an assortment of square roots, derivatives, exponentials
and (24q)th roots of unity. Studying MacMahon’s table, Ramanujan (1919)
was able to prove a number of congruence properties of p(n). For example, he
showed that p(5n + 4) ≡ 0 (mod 5) and p(7n + 5) ≡ 0 (mod 7). These are special
cases of a more general conjecture which he made but which eventually proved
not to be correct. The reader is referred to Andrews (1976) for further details.
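
The value of p(200) and the two congruences are easy to confirm by machine. The sketch below is a modern illustration, not a reconstruction of MacMahon's method: it assumes the recurrence obtained by multiplying the partition-generating function by Euler's product for the numbers k(3k ± 1)/2, and the function name is hypothetical.

```python
def partition_numbers(n_max):
    """p(0..n_max) via p(n) = sum_{k>=1} (-1)^(k+1) [p(n - k(3k-1)/2) + p(n - k(3k+1)/2)]."""
    p = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        total, k = 0, 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                total += sign * p[n - k * (3 * k + 1) // 2]
            k += 1
        p[n] = total
    return p


p = partition_numbers(200)
print(p[200])
# prints 3972999029388
assert all(p[5 * n + 4] % 5 == 0 for n in range(40))
assert all(p[7 * n + 5] % 7 == 0 for n in range(28))
```

Each value needs only about the square root of n earlier values, which is what makes a table as far as n = 200 feasible even by hand.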

The idea of a partition can be extended to higher dimensions and much work 
was done in this area by MacMahon; see, for example, MacMahon (1915-16). If
the dots in a Ferrers diagram are replaced by positive integers subject to the 
restriction that numbers are non-increasing along each row and also down each 
column, then the set of integers is a plane partition of their sum. For example, the 
diagram in fig. 4 is a plane partition of 26. Further details are given in chapter 21. 

Hammond (1882, 1883) introduced some differential operators which act on 
symmetric functions, and these were enthusiastically employed by MacMahon to 
process the symmetric functions upon which much of his extensive researches in 
combinatorics were based (MacMahon 1915-16, 1978/86). More recently, David 
et al. (1966) used Hammond operators to calculate tables of symmetric functions. 

In the first of a series of nine papers written over many years, Young (1901)
introduced the idea of a tableau. Tableaux correspond to special plane partitions
in which the parts are consecutive integers 1,2,...,m, for some m, and although 
Young introduced them in the context of invariant theory, Frobenius (1903) 
pointed out that Young’s methods are closely related to his own work on group 
representation theory. A problem considered in that subject is to obtain 
information about the representations of a given group in terms of representations 
of suitably chosen subgroups. For the symmetric group $S_n$, the appropriate
subgroups are those of the form

$$S_{\lambda_1} \times S_{\lambda_2} \times \cdots \times S_{\lambda_k}, \quad \text{where } \sum_i \lambda_i = n.$$
Figure 4.




By combining his ideas with contemporary work of Frobenius and Schur, Young 
obtained a complete classification of the irreducible representations of $S_n$ over the
complex numbers. Young’s papers are not easy to read, but in recent years the
influence of his work has been felt in a number of areas. His collected papers have 
been published (Young 1977) and in a review of that work, Andrews (1979) lists 
121 papers which cite Young and which were published within the previous fifteen 
years. 

An algorithm of Robinson (1938), rediscovered by Schensted (1961), establishes
bijections between certain sets of matrices and certain tableaux. This
algorithm enabled Schensted to produce extensions of a famous result of Erdős
and Szekeres (see section 7), such as the following.


Theorem. The number of permutations of 1, 2, ..., m, with longest increasing
subsequence of length c and longest decreasing subsequence of length r, is equal to
$\sum_{\mu} f_{\mu}^2$, where $f_{\mu}$ is the number of standard Young tableaux of shape $\mu$, and the
summation is over all partitions $\mu$ of m with c parts and largest part r.
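
For small m the theorem can be verified by listing all permutations. The sketch below is an illustration under stated assumptions: the number of standard Young tableaux is computed with the hook length formula (a later result, used here for convenience and not mentioned in the text above), and all function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def partitions(m, largest=None):
    """Yield the partitions of m as non-increasing tuples."""
    if largest is None:
        largest = m
    if m == 0:
        yield ()
        return
    for first in range(min(m, largest), 0, -1):
        for rest in partitions(m - first, first):
            yield (first,) + rest


def standard_tableaux(shape):
    """Number of standard Young tableaux of a shape, by the hook length formula."""
    conjugate = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            # arm + leg + 1 for the cell in row i, column j
            hooks *= (row - j - 1) + (conjugate[j] - i - 1) + 1
    return factorial(sum(shape)) // hooks


def longest_monotone(seq, increasing):
    """Length of the longest increasing (or decreasing) subsequence of distinct entries."""
    best = []
    for i, x in enumerate(seq):
        prev = [best[j] for j in range(i) if (seq[j] < x) == increasing]
        best.append(1 + max(prev, default=0))
    return max(best)


def check(m):
    """Compare the sum of f_mu^2 with a direct count, for every pair (c, r)."""
    by_shape = {}
    for mu in partitions(m):
        key = (len(mu), mu[0])               # (number of parts c, largest part r)
        by_shape[key] = by_shape.get(key, 0) + standard_tableaux(mu) ** 2
    by_listing = {}
    for perm in permutations(range(1, m + 1)):
        key = (longest_monotone(perm, True), longest_monotone(perm, False))
        by_listing[key] = by_listing.get(key, 0) + 1
    return by_shape == by_listing


assert all(check(m) for m in range(1, 7))
```

Summing over all shapes gives the identity that the squares of the numbers of standard tableaux of the partitions of m add up to m!.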


The invariant theory of binary forms is a subject which flourished in the 19th 
century and, in particular, Sylvester tried to link it with chemistry (see section 5).
Since then it has been pronounced dead on many occasions, but it persists in 
coming back to life and, in the words of Kung and Rota (1984), “the artillery of 
combinatorics” is now being aimed at the subject. The reader is referred to that 
paper for further information and references. 


5. The development of graph theory 


[A fuller account of this topic appears in the book Graph Theory 1736-1936 
(Biggs, Lloyd and Wilson 1976) which includes extracts (translated into English 
where necessary) from many of the works cited below. Such a work is annotated 
below in the form (BLW nX), meaning that an extract from it appears as extract 
X in chapter n of the book.]

The subject of graph theory originated with Euler’s solution of the Königsberg
bridge problem, which asks for a route crossing each of the seven bridges of
Königsberg just once. On 26 August 1735, Euler presented a paper to the St.
Petersburg Academy of Sciences, proving that the problem is impossible, and 
showing how his method can be extended to any number and arrangement of 
islands and bridges. In particular, he formulated necessary and sufficient con- 
ditions for the existence of a route crossing every bridge just once, but he seems 
to have considered it unnecessary to prove the sufficiency in general; the first 
valid proof of this was given by Hierholzer (BLW 1B). 
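
In modern graph-theoretic language, the condition is usually stated as follows: a connected multigraph has a route using every edge exactly once if and only if it has either no vertices of odd degree or exactly two. The sketch below applies this test to the seven bridges. It is an illustration only: the labels A to D for the four land masses and the function names are assumptions made here, and the statement of the condition is the modern one rather than a quotation from Euler.

```python
from collections import Counter, defaultdict

# The seven bridges, as pairs of land masses (A is the island, labels are illustrative).
KONIGSBERG = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]


def odd_vertices(edges):
    """Vertices met by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


def is_connected(edges):
    """True if every vertex can be reached from every other along the edges."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adjacency)


def has_euler_route(edges, closed=False):
    """Connected, with no odd vertices (closed route) or at most two (open route)."""
    allowed = (0,) if closed else (0, 2)
    return is_connected(edges) and len(odd_vertices(edges)) in allowed


print(odd_vertices(KONIGSBERG), has_euler_route(KONIGSBERG))
# prints ['A', 'B', 'C', 'D'] False
```

All four land masses are met by an odd number of bridges (5, 3, 3 and 3), so no such route exists.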

Euler’s paper on the Königsberg problem was written in 1736, and first
published in 1741 (BLW 1A), but initially it aroused little interest. Although his 
methods were essentially graph-theoretical, he did not use graphs as such, and the 
graph usually used to solve the problem seems not to have appeared until 150 
years later (Rouse Ball 1892). Indeed, the problem was not well known until the
end of the 19th century, when Lucas (1882) and Rouse Ball (1892) included it in
their books on recreational mathematics, although a French translation of Euler’s
paper had been published earlier by Coupy (1851).

A recreational puzzle related to the Königsberg bridges problem is that of
finding the smallest number of pen-strokes needed to draw a given diagram with
no line repeated. Poinsot (1810) showed that a single stroke is sufficient for an
odd number of points all joined in pairs (the complete graph $K_n$, n odd), but not
for an even number of points. Such problems were also discussed by Listing in his
Vorstudien zur Topologie (1847; BLW 1C), the first publication to use the word
topology, but the connection between them and the Königsberg problem seems to
have remained unnoticed until Lucas’s book. The term Eulerian graph for a graph
which can be drawn with a single pen-stroke is due to König, and appeared in his
pioneering book (König 1936). The origins of topology are discussed by Pont
(1974). 

Another type of traversal problem involves finding a cycle through every vertex 
of a graph, rather than a route passing along each edge, as above. An example of
such a problem is the knight’s tour problem which asks for a sequence of knight’s
moves on a chessboard, visiting every square and returning to the starting-point. 
Although solutions of this problem had been known since the 14th century, the 
problem was not subjected to mathematical analysis until 400 years later, with 
papers of Euler (1759) and Vandermonde (BLW 2A). The first general discussion 
of vertex-traversal problems was given by Kirkman (BLW 2B) who asked which 
polyhedra allow a cycle passing through each vertex, and described a general class
of polyhedra for which no such cycle exists. Graphs which allow such a cycle are 
now called Hamiltonian graphs. Hamilton became fascinated by paths and cycles 
on a dodecahedron (BLW 2C) as an offshoot of his own work on non-commuta- 
tive algebra, and of his icosian calculus in particular. Such considerations gave 
rise to a recreational puzzle, the Icosian Game, in which the object was to find
paths and cycles on a “flat dodecahedron”, satisfying certain specified conditions.
A version of this game played on a solid dodecahedron was later marketed under 
the name A Voyage round the World. 

Unlike the Eulerian problem, where necessary and sufficient conditions for the 
existence of a trail are easy to find, the general Hamiltonian problem has 
remained intractable, and the problem of determining whether a given graph is 
Hamiltonian has been shown to be NP-hard; see chapter 29. However, there are
several necessary conditions or sufficient conditions for a graph to be Hamilto- 
nian, such as the sufficient conditions of Dirac (1952), Ore (1960) and others, and 
a result of Fleischner (1974) that the square of any 2-connected graph is 
Hamiltonian. Further results on Hamiltonian graphs may be found in a survey by 
Bermond (1978) and in chapter 1. 

Whereas Eulerian and Hamiltonian graphs arose out of recreational puzzles, 
the study of trees emerged from a problem in the differential calculus. In 1857, 
Cayley (BLW 3A) expressed the problem in terms of rooted trees and used 
generating functions to determine the number of such trees with a given number 
of edges. Although the concept of a tree had been used implicitly ten years earlier
by von Staudt and by Kirchhoff, Cayley was the first to use the term in print. 

Cayley’s methods for enumerating trees were initially very cumbersome, but 
Jordan (BLW 3B) simplified the procedures substantially by introducing the 
concepts of centroid and bicentroid, and centre and bicentre, for a given tree.
Using them, Cayley (BLW 3C) was able to count trees by starting from their 
centre or centroid and working outwards. In this way he counted both unrooted 
trees and various types of chemical molecules, such as the family of alkanes 
$C_n H_{2n+2}$ (BLW 4B). Closely involved with Cayley’s work were Sylvester and
Clifford. Both made important contributions to the study of invariants, and 
Sylvester wrote a note (BLW 4C) and a long paper (Sylvester 1878) aiming to link 
invariant theory with chemistry by describing an analogy between binary quantics 
and chemical atoms. To represent this connection diagrammatically, Clifford 
introduced graphic notation, or graphs for short, and Sylvester’s note used the 
word graph (in the graph-theory sense) for the first time; see also chapter 38 for 
further historical remarks on this topic. 

Other tree-counting problems were solved later. In 1889, Cayley announced his 
$n^{n-2}$ formula for the number of labelled trees with n vertices, but verified it only
for n ≤ 5. A proof was later given by Prüfer (BLW 3D). The much-needed
breakthrough in enumerative techniques occurred in the 1920s and 1930s (see
section 3), with the work of Redfield (1927), Lunn and Senior (1929) and Pólya
(1937). In particular, Pólya’s enumerative theory was a milestone for the counting
of graphs and chemical molecules. Further information on the enumeration of 
isomers is given in chapter 38. 
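
Cayley's formula for labelled trees can be confirmed for small n in two ways. The sketch below is illustrative only (the function names are hypothetical, and the decoding procedure is the textbook form of the Prüfer correspondence rather than a transcription of Prüfer's paper): it counts the spanning trees of the complete graph by brute force, and it checks that distinct sequences of n - 2 labels decode to distinct trees.

```python
from itertools import combinations, product


def count_labelled_trees(n):
    """Count trees on n labelled vertices by testing every set of n - 1 edges."""
    all_edges = list(combinations(range(n), 2))
    count = 0
    for edges in combinations(all_edges, n - 1):
        parent = list(range(n))

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        acyclic = True
        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        count += acyclic          # n - 1 edges with no cycle form a tree
    return count


def prufer_decode(code, n):
    """Tree (as a frozenset of edges) for a sequence of n - 2 labels from 0..n-1."""
    degree = [1] * n
    for x in code:
        degree[x] += 1
    edges = []
    for x in code:
        leaf = min(v for v in range(n) if degree[v] == 1)
        edges.append((min(leaf, x), max(leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(n) if degree[w] == 1]
    edges.append((u, v))
    return frozenset(edges)


for n in range(2, 7):
    decoded = {prufer_decode(code, n) for code in product(range(n), repeat=n - 2)}
    assert count_labelled_trees(n) == len(decoded) == n ** (n - 2)

print([count_labelled_trees(n) for n in range(2, 7)])
# prints [1, 3, 16, 125, 1296]
```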

One of the most fundamental results in the study of polyhedra, and consequently
of graphs embedded in the plane, is Euler’s polyhedral formula
$v - e + f = 2$, relating the numbers of vertices, edges and faces of a polyhedron or
connected planar graph. This formula first appeared in a letter on polyhedra 
written by Euler to Goldbach in November 1750 (BLW 5A), but Euler failed to 
produce a valid proof of it. The result is sometimes incorrectly attributed to 
Descartes, who obtained an expression for the sum of the angles of all the faces of 
a polyhedron. Although Euler’s formula can be deduced from this expression, 
there is no evidence that Descartes did so. 

The first correct proof of Euler’s formula involved the metrical properties of 
spherical polygons, and was found by Legendre (1794). A topological (non- 
metrical) proof was given in 1813 by Cauchy (BLW 5B), who projected the
polyhedron onto a plane and used a triangulation method to derive the result for 
planar maps. Meanwhile, Lhuilier (BLW 5C) showed how Euler’s formula leads 
to a proof that there are only five regular polyhedra. He also investigated the 
modifications in the formula if the polyhedron has a hole, if its faces are not 
simply connected, or if the polyhedron is ring-shaped. In this last case, Lhuilier 
obtained the formula 


$$v - e + f = 2 - 2g$$


for a graph embedded on a sphere with g handles. 
This last result was the starting-point for an extensive investigation by Listing.
In Der Census räumlicher Complexe (Listing 1861-2) he studied simplicial
complexes, thereby laying the groundwork for Poincaré’s development of algebraic
topology at the turn of the century. Poincaré showed how complexes can be
constructed from basic “cells”, where a 0-cell is simply a vertex and a 1-cell is an
edge. In order to fit the cells together, he adapted a technique of Kirchhoff (BLW
8A), replacing systems of linear equations by the corresponding matrices. These
ideas were later developed by Veblen in a series of American Mathematical
Society Colloquium Lectures (BLW 8B).

It was already known in the 19th century that certain graphs cannot be 
embedded in the plane; for example, Möbius’s problem of the five princes, and
the gas, water and electricity problem (BLW, pp. 115-116 and 142), show that the
complete graph $K_5$ and the complete bipartite graph $K_{3,3}$ are both non-planar. In
1930 Kuratowski proved that these are the ‘basic’ non-planar graphs, in the
sense that every non-planar graph must contain a subdivision of at least one of
them (BLW 8C); a full discussion of the origins of this result is given in Kennedy
et al. (1985). This idea has more recently been developed by Glover et al. (1979),
who obtained a list of 103 “forbidden subgraphs” for graphs embedded in the 
projective plane. Furthermore, Robertson and Seymour (1985) have proved that 
there is a corresponding finite list of forbidden subgraphs for surfaces of any 
genus, although this list may be very large even for surfaces as simple as the torus. 
Further discussion of work in this area can be found in chapters 5 and 10. 

Another aspect of planarity was developed in the 1930s in a series of papers by 
Whitney. In the first of these (Whitney 1931), he showed that the relationship 
between a planar graph and its geometrical dual leads to a combinatorial
definition of duality which can be used to characterize planar graphs. Extending 
these ideas led him eventually to the idea of a matroid, or abstract independence 
structure, which generalizes ideas of independence in both vector spaces and 
graphs (Whitney 1935). Indeed, the duality of a matroid is a very natural concept 
which extends and clarifies the dual of a planar graph. Interest in matroids was 
slow to develop, but in 1959 Tutte obtained a Kuratowski-type criterion for a 
matroid to arise from a graph (Tutte 1959). His results paved the way for an 
explosion of interest in the 1960s and 1970s, fuelled by the discovery of Edmonds 
and Fulkerson (1965) that the partial transversals of a family of sets give rise to a 
natural matroid structure. For extensive treatments of matroid theory, see Welsh
(1976), Oxley (1992) and chapters 9-11 of this Handbook. 

No account of the history of graph theory would be complete without a 
discussion of the four-colour problem. This problem first arose in 1852 when 
Francis Guthrie noticed that only four colours are needed to colour a map of
England and wondered whether this is so for all maps. His brother Frederick
approached Augustus De Morgan, who communicated the problem to other
mathematicians, including Hamilton. It first appeared in print in 1860 in an
unsigned book review by De Morgan (see Biggs 1983).

The four-colour problem was revived in 1878 when Cayley asked at a London
Mathematical Society meeting whether it had been solved. In the following year,
Kempe produced his celebrated “proof” for the newly-founded American Journal
of Mathematics (BLW 6B). This “proof” contained some good ideas and was 
generally accepted until 1890 when Heawood found an error in it. He salvaged 
enough to prove a five-colour theorem, gave a formula for the number of colours 
needed for maps on a sphere with g handles, and justified his formula for maps on 
a torus (BLW 6D, 7A). Unfortunately, although this formula gives the number of 
colours sufficient for colouring a map on a surface, his proof that this number is 
necessary for some maps was deficient. Filling the gap proved a difficult task, 
involving twelve separate cases, but it was eventually completed in 1968. For a 
full account of this work see Ringel (1974). 

Meanwhile, progress on the four-colour problem was also slow and painful. 
Birkhoff (1913) showed that certain configurations in a map are reducible, in the 
sense that a four-colouring of the rest of the map can be extended to a colouring 
of the whole map; this idea of reducibility turned out to be crucial in the eventual 
proof of the theorem. Franklin (1922) used reducibility to prove the theorem for 
maps with at most 25 countries, and this number was increased over forty years to 
95. Finally Appel, Haken and Koch, using ideas of Heesch, proved the four- 
colour theorem in 1976. Although the main idea of the proof can be traced back 
to Kempe, the details were very complicated, involving the analysis of almost 
2000 configurations and the use of hundreds of hours of computing time. An 
account of their search for a proof is given in Appel and Haken (1977); see also 
Appel and Haken (1989). 

Chromatic graph theory has also developed in other directions: for example, 
chromatic polynomials were introduced by Birkhoff (1912-3) and critical graphs
by Dirac (1952). Other important topics include Brooks’s upper bound for the 
chromatic number of a graph (Brooks 1941), Hadwiger’s conjecture (Hadwiger 
1943) which has the four-colour theorem as a special case, and two papers of 
Vizing (1964, 1965) on edge-colourings of graphs. Further information about 
graph colourings appears in chapter 4. 

The origins of extremal graph theory can be traced back to a question of 
Mantel, solved in 1907 by Wythoff (1906-10), and to a paper of Erdős (1938);
they determined the maximum number of edges in a graph containing no 
complete graph K, or K,. The modern development of the subject dates from an 
important paper of Turán (1941), which solved the corresponding problem for
K_n. Probabilistic graph theory was also developed in Hungary, in a series of
papers by Erdős and Rényi; see, for example, Erdős and Rényi (1960). Their aim
was to see how the properties of graphs change as edges are added to a graph at 
random, and their influence is still strongly felt in the recent development of the 
subject. For full accounts of random graph theory and extremal graph theory, see
chapters 6 and 23. 

The above account deals exclusively with finite graphs, but some results can be 
extended to infinite graphs (see chapter 42) by using set-theoretic results of König
(1927) and Rado (1942). If the graph is countable, with finite vertex-degrees,
König’s lemma (in its graphical form) guarantees the existence of a one-way
infinite path from any vertex. For uncountable graphs one usually needs a
stronger result, known as Rado’s selection theorem, which is one of the most
important tools in infinite combinatorics. A discussion of these results can be 
found in Thomassen (1983). 


6. Configurations and designs 


The Chinese studies of magic squares (section 1), and Euler’s passing interest in 
orthogonal latin squares (section 2), may be considered as part of the prehistory 
of the modern subject of configurations and designs. But the catalyst for the 
foundation of the theory was geometry, and geometrical ideas pervade the subject 
to this day. 

In 1835, the geometer Plücker remarked that a general plane cubic curve has
nine points of inflexion, which lie in threes on twelve lines; furthermore, given
any two of the points, one of the twelve lines passes through both of them. In a
footnote, he remarked (wrongly) that a system S(n) of n points, arranged in
triples in such a way that any two points belong to just one triple, is possible only
when n ≡ 3 (mod 6). In 1839 Plücker corrected his error, pointing out that both
n ≡ 1 (mod 6) and n ≡ 3 (mod 6) are possible, and he made some remarks about
other systems of this kind; these references were brought to light by De Vries
(1984). It seems likely that Plücker’s comments were noted by Sylvester who
communicated them to Woolhouse, the editor of a curious English publication
known as the Lady’s and Gentleman’s Diary. What is certain is that Woolhouse
proposed the Prize Question for the readers of the Diary for 1844 in the following
terms:


“Determine the number of combinations that can be made out of n
symbols, p symbols in each; with this limitation, that no combination
of q symbols which may appear in any one of them shall be
repeated in any other.”


In modern terminology, the question asks for the number of blocks in a q-design
(usually called a t-design, provided repeated blocks are
not allowed) with parameters (n, p, 1); Plücker’s system S(n) corresponds to the
case p = 3, q = 2. The determination of the number of blocks is simple, but the
question of the existence of the design is not. For this reason, Kirkman’s paper
“On a problem in combinations” (1847), read to the Literary and Philosophical
Society of Manchester on 16 December 1846, is truly remarkable, for it showed in
effect how to construct the system S(n) whenever n ≡ 1 or 3 (mod 6). So, the
existence problem for S(n) was completely solved.

A little later Kirkman noticed that there is a system S(15) with the property
that its 35 triples can be partitioned into seven sets of 5 triples, in such a way that
each symbol occurs exactly once in each set of five. Thus was born the famous
fifteen schoolgirls problem, which appeared as Query VI in the Lady’s and
Gentleman’s Diary for 1850:


“Fifteen young ladies in a school walk out three abreast for seven
days in succession: it is required to arrange them daily so that no
two shall walk twice abreast.”


It is sad that Kirkman’s name should be remembered primarily for this trifle, 
because his mathematical papers entitle him to be regarded as the founding father 
of the theory of designs, rather than as the author of an amusing puzzle. In 
addition to his solution of the existence problem for S(n), he constructed
2-designs with parameters (r^2 + r + 1, r + 1, 1), now known as projective planes,
for every prime value of r, and he used cyclic difference sets to construct
projective planes with r = 4 and r = 8. He also found 3-designs with parameters
(2^n, 4, 1) and several other special kinds of design. A fuller discussion of
Kirkman’s life and work is given by Biggs (1981). 

Kirkman’s work went almost unnoticed at the time. Indeed, some years later 
the geometer Steiner (1853) revived Plücker’s question in an article in Crelle’s
Journal. This is why S(n) is usually known as a Steiner triple system, even though
the major work on it had been published six years before Steiner’s paper.

The remainder of the 19th century saw much work on variants and extensions 
of the schoolgirls problem. Both Cayley and Sylvester published papers in this 
area, and Cayley coined the name tactic for the general area of configurations and
designs. But it was not until the end of the century that the theory of designs was
once again the subject of a truly significant paper, the “Tactical memoranda” of
Moore (1896). Moore’s work is noteworthy for its systematic treatment of the
numerical conditions for the existence of designs, and for the use of finite fields to
construct various families of designs.

At the end of the 19th century, a new influx of geometrical ideas began to 
extend and revitalize the subject of ‘tactic’. The idea that geometry can be 
formulated within a system having only a finite number of points may be traced 
back at least to Von Staudt (1856-7). The notion was developed by the Italian
geometer Fano (1892), who described finite geometries of various dimensions
and, in particular, the finite plane with seven points which bears his name. Of
course, the Fano plane is just the Steiner triple system S(7), and is defined by the
axioms for a 2-design with parameters (7, 3, 1), or a projective plane of order 2.
The geometrical requirements that two points determine a unique line, and (in
plane projective geometry) that two lines meet in just one point, provide the link
between geometry and the theory of designs. 
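
As a small illustration of this link, the following sketch checks by direct enumeration that a set of seven triples satisfies both the design axiom (every pair of points lies in exactly one triple) and the projective axiom (every two lines meet in exactly one point). The labelling of the points by 0, ..., 6, the construction of the lines as translates of {0, 1, 3} modulo 7, and the function names are choices made for this example only; they are not taken from the papers cited above.

```python
from itertools import combinations

# Seven lines on the points 0..6, obtained as the translates of the
# set {0, 1, 3} modulo 7 (an assumed labelling, chosen for illustration).
FANO_BLOCKS = [frozenset(((0 + s) % 7, (1 + s) % 7, (3 + s) % 7)) for s in range(7)]


def is_steiner_triple_system(n, blocks):
    """Return True if every pair of the points 0..n-1 lies in exactly one triple."""
    if any(len(block) != 3 for block in blocks):
        return False
    for pair in combinations(range(n), 2):
        covering = sum(1 for block in blocks if set(pair) <= block)
        if covering != 1:
            return False
    return True


def lines_meet_once(blocks):
    """Return True if any two distinct blocks share exactly one point."""
    return all(len(a & b) == 1 for a, b in combinations(blocks, 2))


if __name__ == "__main__":
    print(is_steiner_triple_system(7, FANO_BLOCKS))
    print(lines_meet_once(FANO_BLOCKS))
```

Both checks are exhaustive, so they are practical only for very small systems; they are meant to make the two axioms concrete rather than to serve as a construction method.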

This link was the stimulus for American geometers to apply Moore’s “Tactical
memoranda” to questions of finite geometry. A paper of Veblen and Bussey 
(1906) continued Fano’s work, giving an axiomatic definition of a finite projective 
geometry in any number of dimensions. They showed that the Desargues theorem 
holds in any such geometry, except possibly in the plane case, and that the 
Desarguesian geometries can be coordinatized by the finite Galois fields GF(q). It 
follows that there are Desarguesian projective planes of order q for each prime
power q, based on GF(q), but there may also be non-Desarguesian planes. 
Almost immediately afterwards, Veblen and Wedderburn (1907) showed that 
certain skewfields constructed by Dickson can be used to coordinatize
non-Desarguesian planes, and they described in some detail a non-Desarguesian plane
of order 9. 

The existence of finite projective planes is also related to the orthogonal latin 
squares studied by Euler (see section 2), because a finite projective plane of order
n gives rise to a set of n − 1 mutually orthogonal latin squares (MOLS) of order
n. This result was apparently not stated explicitly until it was noticed
independently by Bose (1938) and Stevens (1939). Thus Euler’s famous conjecture,
that there is no pair of orthogonal latin squares of order n whenever n ≡ 2 (mod
4), would imply the weaker result that there are no projective planes with such an
order. The Euler conjecture for n = 6 was allegedly verified by Clausen around
1842, but his proof was never published, and the first convincing proof was given 
by Tarry (1900). On the other hand, the work of Moore, Veblen, and their 
colleagues showed that, when n is a prime power q, there are q − 1 MOLS of
order q. Macneish (1922) remarked that r MOLS of order a and r MOLS of order
b can be combined to give r MOLS of order ab; his proof is a reformulation of a
construction in Moore’s “Tactical memoranda”. It follows that, if the prime
factorization of n is p_1^{e_1} p_2^{e_2} \cdots p_s^{e_s}, then there are at least

    \min(p_1^{e_1}, p_2^{e_2}, \ldots, p_s^{e_s}) - 1

MOLS of order n. Macneish conjectured that this is the maximum number for any
n, extending the conjecture of Euler. But Macneish’s attempt to prove the Euler 
conjecture was fallacious, and no further progress was made at that time. 
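
The two facts just described can be illustrated computationally. The sketch below handles only prime orders p, using the squares with entries ai + j (mod p) for a = 1, ..., p − 1; this is a standard construction adopted here as an assumption, not a transcription of the method in the “Tactical memoranda”, and general prime powers would need finite-field arithmetic that is omitted. The helper names, and the function computing Macneish's lower bound from the prime factorization, are likewise illustrative.

```python
from itertools import combinations


def latin_squares_mod_p(p):
    """For a prime p, return the p - 1 squares with entries (a*i + j) mod p, a = 1..p-1."""
    return [[[(a * i + j) % p for j in range(p)] for i in range(p)]
            for a in range(1, p)]


def is_latin(square):
    """Check that every row and every column contains each symbol 0..n-1 once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """Two squares are orthogonal if superimposing them yields every ordered pair once."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


def macneish_bound(n):
    """Smallest prime-power factor in the factorization of n (n >= 2), minus 1."""
    smallest = None
    d = 2
    while d * d <= n:
        if n % d == 0:
            q = 1
            while n % d == 0:
                n //= d
                q *= d
            smallest = q if smallest is None else min(smallest, q)
        d += 1
    if n > 1:  # a single prime factor remains
        smallest = n if smallest is None else min(smallest, n)
    return smallest - 1


if __name__ == "__main__":
    squares = latin_squares_mod_p(5)
    print(all(is_latin(s) for s in squares))
    print(all(are_orthogonal(s, t) for s, t in combinations(squares, 2)))
    print(macneish_bound(10), macneish_bound(12))
```

For n = 10 the bound guarantees only one latin square, which is why the construction of a pair of MOLS of that order required new ideas.
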
Although the first known recorded use of a latin square in an experimental 
design was by Cretté de Palluel in 1788 (Street and Street 1988), it was not until 
the 1920s that interest in latin squares and designs was renewed, as a result of the 
work of Fisher and Yates on the design of agricultural experiments. They 
discussed not only the practical applications, but also the theory: for example, 
Fisher and Yates (1934) completely classified the latin squares of order 6, thereby 
verifying independently that no orthogonal pair exists. Yates (1936) discussed 
what are now called balanced incomplete block designs (BIBDs), i.e., 2-designs
with parameters (v, k, λ) satisfying v > k, and Fisher and Yates published a list of
known BIBDs in their book (Fisher and Yates 1938). The additional parameters r
(number of replications) and b (number of blocks) for a BIBD satisfy the
elementary conditions

    rv = bk, \qquad \lambda(v - 1) = r(k - 1),

but it is clear that these conditions are not sufficient for the existence of a BIBD,
since, for example, there is no projective plane of order 6 (v = b = 43, k = r = 7,
λ = 1). Fisher (1940) established the inequality b ≥ v, a surprisingly non-trivial
constraint. Around the same time Bose (1939) published a long paper giving an 
account of everything then known about the construction of BIBDs, and 
describing several new methods of construction. 
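
The elementary conditions are easy to test mechanically. The following sketch (the function name and the convention of returning None are assumptions made for this example) computes r and b from (v, k, λ) when integer solutions of rv = bk and λ(v − 1) = r(k − 1) exist. It is a check of necessary conditions only: the parameters of a projective plane of order 6 pass it, although no such plane exists.

```python
def bibd_replication_and_blocks(v, k, lam):
    """Return (r, b) solving lam*(v-1) = r*(k-1) and r*v = b*k in integers, else None."""
    if not (v > k >= 2):
        return None
    r, remainder = divmod(lam * (v - 1), k - 1)
    if remainder:
        return None
    b, remainder = divmod(r * v, k)
    if remainder:
        return None
    return r, b


if __name__ == "__main__":
    print(bibd_replication_and_blocks(7, 3, 1))    # (3, 7): the seven-point plane
    print(bibd_replication_and_blocks(43, 7, 1))   # (7, 43): passes, yet no plane of order 6 exists
    print(bibd_replication_and_blocks(8, 3, 1))    # None: 7 is not divisible by 2
    print(bibd_replication_and_blocks(16, 6, 1))   # (3, 8): integers, but b < v
```

The last example satisfies both divisibility conditions with b = 8 < v = 16, so it is ruled out only by Fisher's inequality, which shows that the inequality is a genuinely additional constraint.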

Yates (1936) gave a construction which, in present-day terminology, connects a 
complete set of MOLS with a projective plane. A major step forward was made
by Bruck and Ryser (1949), who explicitly considered projective planes as BIBDs
with v = b and λ = 1, and they applied Hasse–Minkowski theory to the incidence
matrix of the plane. In this way they proved the non-existence of projective
planes of order n for infinitely many values of n, and in particular when n = 2p,
where p is a prime with p ≡ 3 (mod 4). A little later Chowla and Ryser (1950)
applied similar methods to the more general case of symmetric designs (v = b,
λ ≥ 1). They also discussed the relationships between symmetric designs and
earlier studies of Hadamard matrices (Paley 1933, Todd 1933), and cyclic
difference sets (Singer 1938, Hall 1947). In general, it is an unsolved problem to
find a complete set of necessary and sufficient conditions for the existence of a
t-design with parameters (v, k, λ), although Wilson (1975) has shown that the
elementary conditions given above are sufficient for a BIBD to exist, provided
that v is large enough. In the same vein, Ray-Chaudhuri and Wilson (1971)
proved that, for each n ≡ 3 (mod 6), there is a Steiner triple system S(n) which
can be partitioned in the manner of Kirkman’s schoolgirls problem; similar results 
had been obtained independently by Lu by 1965 (see Wu et al. 1990). 
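
The Bruck and Ryser criterion for planes is simple to apply in its usual arithmetic form, which is assumed here rather than quoted from the text: if a projective plane of order n exists and n ≡ 1 or 2 (mod 4), then n is a sum of two integer squares. The sketch below (with illustrative function names) lists the orders that this criterion excludes.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Return True if n = a*a + b*b for some integers a, b >= 0."""
    for a in range(isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            return True
    return False


def excluded_by_bruck_ryser(n):
    """Return True if the criterion rules out a projective plane of order n."""
    return n % 4 in (1, 2) and not is_sum_of_two_squares(n)


if __name__ == "__main__":
    print([n for n in range(2, 30) if excluded_by_bruck_ryser(n)])  # [6, 14, 21, 22]
    print(excluded_by_bruck_ryser(10))  # False, since 10 = 1 + 9
```

The excluded values 6, 14 and 22 are of the form 2p with p ≡ 3 (mod 4). A result of False means only that the criterion is silent, not that a plane exists.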

Towards the end of the 1950s, the problem of constructing a pair of MOLS of 
order 10 once again became active. It is likely that the availability of electronic 
computers provided a stimulus for this work, although the first results were 
obtained without the use of computers. The breakthrough came when Parker 
(1959a) found two MOLS of order 21, thereby disproving Macneish’s extension of 
the Euler conjecture. Shortly afterwards, Bose and Shrikhande (1959) con- 
structed two MOLS of order 22, and Parker (1959b) found two MOLS of order 
10. Finally these three authors combined (Bose et al. 1960) to show that Euler’s
conjecture is false for all n ≡ 2 (mod 4) and n > 6. This achievement was
front-page news in the New York Times on 25 April 1959.

Another problem which has been studied by computational search techniques is 
the question of the existence of a projective plane of order 10. The non-existence 
of a plane of order 6 was proved explicitly by MacInnes (1907); it also follows 
from Tarry’s result on MOLS, and the Bruck–Ryser theorem. The order 10 is not
excluded by the Bruck–Ryser theorem, however, and the existence of a pair of
MOLS of order 10 is inconclusive. After much effort, it was shown that a
projective plane of order 10 can have no symmetry, so that its construction, if it
existed, would be complicated. An approach by means of the linear code
associated with such a plane was suggested by MacWilliams et al. (1973). As a
result of extensive computer searches the weight enumerator of the code (which
would exist if a plane existed) was completely determined by Lam et al. (1986).
Finally, at the end of 1988, Lam and his colleagues announced that a projective
plane of order 10 does not exist (see Lam et al. 1989 and Lam 1991).

A recurrent problem in research on designs has been the construction of 
t-designs for large values of t. Although Kirkman (1853) had constructed
3-designs with parameters (2^n, 4, 1), it soon became apparent that t-designs with
t > 3 are hard to find (at least in the case when both repeated blocks and all
k-subsets are forbidden). There are two 4-designs and two 5-designs associated
with the Mathieu groups; the simplest, a 4-design with parameters (11, 5, 1), was
given explicitly by Lea (1869). The others were known to Moore (1896) and were
studied extensively by Carmichael (1931) and Witt (1938), but it was not until the
1960s that more 4-designs and 5-designs were discovered, by Alltop (1969),
Assmus and Mattson (1969), and Denniston (1976). The suspicion that t = 5
might be an absolute limit was dispelled when Magliveras and Leavitt (1984)
found several 6-designs with parameters (33, 8, 36). Soon afterwards, Teirlinck
(1987) unfettered the combinatorial imagination by constructing t-designs
(incomplete, and without repeated blocks) for all values of t. Recent results on designs
are given by Jungnickel (1989). 


7. Combinatorial set theory 


If m+ 1 objects are distributed among m pigeonholes, then at least one of the 
pigeonholes must receive at least two objects. This elementary fact, often called 
the pigeonhole principle, has been extensively generalized, giving rise in particular 
to that branch of combinatorics now known as Ramsey theory. 

Although the pigeonhole principle is very well known, its origins are obscure. It 
appears in the literature as Dirichlet’s box principle, and it was certainly used by 
Dirichlet in his study of the approximation of irrational numbers by rationals 
(Dirichlet 1879). However, the use of the pigeonhole principle certainly pre-dates 
Dirichlet. For example, Gauss used it in his Disquisitiones Arithmeticae (Gauss 
1801), and it is likely that earlier uses of it occur in the literature. 

The pigeonhole principle can be generalized in various directions. For example, 
if we distribute m(k − 1) + 1 objects among m pigeonholes, then at least one
pigeonhole must receive at least k objects. A more profound and far-reaching
generalization was given by Ramsey (1930). Whilst working on a problem in
mathematical logic, he was led to the problem of distributing the r-subsets of an N-set
into m classes in such a way that at least one class must contain every r-subset of
some n-subset; this reduces to the generalized pigeonhole principle if r = 1, n = k,
and N = m(k − 1) + 1. Ramsey’s main achievement was to prove that, if N is
sufficiently large, then any such distribution has this property. More generally, given
integers q_1, ..., q_m, one can prove the existence of a number R_r(q_1, ..., q_m) such
that if N ≥ R_r(q_1, ..., q_m) and the r-subsets of an N-set are distributed in any
manner into classes C_1, ..., C_m, then some class C_i must contain every r-subset of
some q_i-subset; for example, R_1(k, ..., k) = m(k − 1) + 1. Ramsey also obtained
an infinite version which asserts that if one distributes the r-subsets of an infinite set
S into m classes, then some class must contain every r-subset of some infinite subset
of S.

Five years later the subject was given a geometrical flavour by Erdős and
Szekeres (1935). They observed that from any five points in general position in the
plane one can always select four points forming a convex quadrilateral, and they
generalized this by showing that, given enough points to start with, one can
similarly form a convex n-gon. Their first proof of this result invoked Ramsey’s theorem,
but they also gave a combinatorial proof which depends on the fact that any
sequence of mn + 1 distinct numbers must contain a decreasing sequence of length
m + 1 or an increasing sequence of length n + 1; an eight-line proof of this latter
result was later given by Seidenberg (1959).
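
The fact about sequences of mn + 1 distinct numbers can be explored with a short program. The sketch below follows the labelling idea usually associated with the short pigeonhole proof (an assumption about how best to present it, not a quotation of that proof): each term receives the pair (length of the longest increasing subsequence ending there, length of the longest decreasing subsequence ending there). Distinct terms receive distinct pairs, so mn + 1 terms cannot all have labels inside an n-by-m grid. The function name is illustrative.

```python
def monotone_labels(seq):
    """For each term of a sequence of distinct numbers, return the pair
    (longest increasing run ending here, longest decreasing run ending here)."""
    labels = []
    for i, x in enumerate(seq):
        inc = 1 + max((labels[j][0] for j in range(i) if seq[j] < x), default=0)
        dec = 1 + max((labels[j][1] for j in range(i) if seq[j] > x), default=0)
        labels.append((inc, dec))
    return labels


if __name__ == "__main__":
    # With m = n = 2, five terms force a monotone subsequence of length 3.
    print(monotone_labels([3, 1, 4, 0, 2]))
    # Four terms do not: every label below lies in the 2-by-2 grid.
    print(monotone_labels([2, 1, 4, 3]))
```

The second example shows that the number mn + 1 cannot be reduced: the four labels exhaust the 2-by-2 grid without producing a monotone subsequence of length 3.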

Another well-known special case of Ramsey’s theorem is to show that any 
gathering of six people must contain either three mutual acquaintances or three 
mutual non-acquaintances. More generally, we can ask for the minimum number
R_2(n, m) such that any gathering of R_2(n, m) people must contain n mutual
acquaintances or m mutual non-acquaintances. This can be expressed graphically by
letting G be the graph of order R_2(n, m) in which two vertices are joined by a red
edge if the corresponding people are acquainted, and by a blue edge if not; then
G must contain a red complete graph K_n or a blue complete graph K_m. In their
1935 paper, Erdős and Szekeres obtained the upper bound (2n − 2)!/((n − 1)!)^2
for R_2(n, n). Later, Erdős (1947) proved that R_2(n, n) ≥ 2^{n/2} and R_2(3, n) ≤
½n(n + 1). It is remarkable that these results have hardly been improved since then.
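
The statement about six people is small enough to verify exhaustively. The sketch below (function names are illustrative) runs through every colouring of the edges of a complete graph with two colours and looks for a triangle whose three edges have the same colour. It is a brute-force check under the stated encoding, feasible only for very small graphs, and says nothing about the general bounds.

```python
from itertools import combinations, product


def has_monochromatic_triangle(n, colouring):
    """colouring maps each edge (i, j) with i < j to a colour 0 or 1."""
    return any(
        colouring[(a, b)] == colouring[(a, c)] == colouring[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_colouring_has_triangle(n):
    """Check all 2-colourings of the edges of the complete graph on n vertices."""
    edges = list(combinations(range(n), 2))
    return all(
        has_monochromatic_triangle(n, dict(zip(edges, colours)))
        for colours in product((0, 1), repeat=len(edges))
    )


if __name__ == "__main__":
    print(every_colouring_has_triangle(5))  # False: five people are not enough
    print(every_colouring_has_triangle(6))  # True: six people always suffice
```

Together the two results show that six is the smallest gathering with this property.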

Two classical results which are also related to Ramsey’s theorem are Schur’s 
lemma and Van der Waerden’s theorem. Schur (1916) proved that, if the integers 
1, 2, ..., N are distributed into m classes, where N > m!e, then we can always
find three integers x, y, z in some class such that x − y = z. Analogues of this
result for other linear equations, and for systems of linear equations, were later
obtained by Rado (1943). An example of this is given by the system

    x_1 - x_2 = x_2 - x_3 = \cdots = x_{k-1} - x_k.

If the x_i are unequal, then a solution of these equations consists of k integers in
an arithmetic progression, and we deduce the result of Van der Waerden (1927)
that if the integers 1, 2, ..., N are distributed into m classes, where N is
sufficiently large compared with m, then at least one class must contain an
arithmetic progression of any given size. 

In recent years there has been an explosion of interest in Ramsey-type results, 
mainly through the influence of Erdős and his followers. An excellent source of
recent results in Ramsey theory is Graham et al. (1980); see also chapter 25. 

Another of the most influential results in combinatorial set theory is Hall’s 
theorem (P. Hall 1935), sometimes called the marriage theorem (Halmos and
Vaughan 1950), which gives a necessary and sufficient condition for a family
F = (A_1, ..., A_m) of subsets of a set S to have a transversal (or system of distinct
representatives); that is, a set of m distinct elements of S, one chosen from each of
the sets A_i. Hall’s theorem states that F has a transversal if and only if

    \left| \bigcup_{i \in T} A_i \right| \ge |T|

for any subset T of {1, 2, ..., m}. If S has a matroid structure defined on it, then
there is a corresponding condition, due to Rado (1942), for the existence of a
transversal which is independent in the matroid. Later, M. Hall (1945) used the
idea of a transversal to extend latin rectangles to latin squares.
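
Hall's condition and the existence of a transversal can be compared directly on small families. In the sketch below, the first function tests the condition by examining every subfamily, and the second searches for a transversal by the augmenting-path method for bipartite matching. The latter is a later algorithmic technique substituted here for illustration, not the argument of the 1935 paper, and the function names are assumptions.

```python
from itertools import combinations


def satisfies_hall_condition(family):
    """Check that every subfamily of size t has a union with at least t elements."""
    m = len(family)
    for size in range(1, m + 1):
        for indices in combinations(range(m), size):
            union = set().union(*(family[i] for i in indices))
            if len(union) < size:
                return False
    return True


def find_transversal(family):
    """Return distinct representatives, one per set in order, or None if none exist."""
    owner = {}  # element -> index of the set it currently represents

    def assign(i, seen):
        for x in family[i]:
            if x in seen:
                continue
            seen.add(x)
            # Take x if it is free, or if its current set can be moved elsewhere.
            if x not in owner or assign(owner[x], seen):
                owner[x] = i
                return True
        return False

    for i in range(len(family)):
        if not assign(i, set()):
            return None
    representative = {i: x for x, i in owner.items()}
    return [representative[i] for i in range(len(family))]


if __name__ == "__main__":
    good = [{1, 2}, {2, 3}, {1, 3}]
    bad = [{1}, {2}, {1, 2}]
    print(satisfies_hall_condition(good), find_transversal(good))
    print(satisfies_hall_condition(bad), find_transversal(bad))
```

The exhaustive test takes time exponential in the number of sets, whereas the matching search is polynomial; the theorem guarantees that the two always agree.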

Although Hall’s theorem is fundamental, other results related to it had been
proved a few years earlier. The first of these was due to Frobenius (1912), and
concerned the reducibility of the determinant of a matrix. A shorter proof of
Frobenius’s result, but in the language of bipartite graphs, was given by König
(1915). In the following year, König (1916) gave a necessary and sufficient
condition for a bipartite graph to contain a perfect matching (1-factor); this result
was later extended by Tutte (1947), who gave a corresponding condition for an
arbitrary graph to have a 1-factor. In 1917, Frobenius gave another proof of his
theorem, using a lemma which is equivalent to Hall’s theorem (Frobenius 1917).
An excellent survey of these early results appears in Lovász and Plummer (1986).
Matchings are discussed in chapter 3. 

The above results have given rise to a large number of minimax theorems in 
combinatorics, in which the minimum of one quantity equals the maximum of 
another. Celebrated amongst these are Menger’s theorem (Menger 1927), which 
states that the minimum number of vertices separating two given vertices in a 
graph is equal to the maximum number of vertex-disjoint paths between them, 
and König’s minimax theorem (König 1931), that the size of a largest matching in
a bipartite graph is equal to the size of a smallest set of vertices which together touch every
edge. Later, Ford and Fulkerson (1956), and independently Elias et al. (1956), 
proved the celebrated max-flow min-cut theorem for capacitated networks, which 
states that the maximum flow between two vertices is equal to the minimum 
capacity of a cut separating them. In another direction is Dilworth’s theorem for 
partially ordered sets (Dilworth 1950), that the minimum number of chains 
(totally ordered sets) which cover a partially ordered set is equal to the maximum 
size of an antichain (set of incomparable elements). All such minimax results are 
related to the duality theorem for linear programming, and surveys of them can be 
found in Woodall (1978) and Schrijver (1983). 

Another classical result of set theory is Sperner’s lemma (Sperner 1928). This 
states that, if S is an n-set and F is a family of subsets of S none of which contains 
another (a Sperner family), then F contains at most 


    \binom{n}{\lfloor n/2 \rfloor}


sets. This result has been extended by Lubell (1966) and others, who asserted that
if (A_1, ..., A_m) is a Sperner family, then

    \sum_{i=1}^{m} \binom{n}{|A_i|}^{-1} \le 1.


A related result is the above theorem of Dilworth, since if P is the lattice of 
subsets of S, then an antichain in P is a Sperner family of S, and we can thus 
partition P into 


    \binom{n}{\lfloor n/2 \rfloor}


chains. Also related is a result of Kleitman (1970), generalizing a problem of
Littlewood and Offord (1943), that if x_1, ..., x_m are vectors in R^n with length at
least 1, then there are at most

    \binom{m}{\lfloor m/2 \rfloor}

sums x_{i_1} + \cdots + x_{i_k} which differ in norm by less than 1.

Instead of considering Sperner families, we can study intersecting families, in 
which any two sets have non-empty intersection. It is easy to see that if S is an 
n-set and r > ½n, then any intersecting family of r-subsets of S has at most 2^{n−1}
subsets. The problem is more difficult if r ≤ ½n, but is answered by the
Erdős–Ko–Rado theorem (Erdős et al. 1961) which asserts that the maximum number of
subsets is

    \binom{n-1}{r-1}

and that this number is attained only when all the subsets contain a common
element of S. The Erdős–Ko–Rado theorem has proved to be a milestone in
extremal set theory. 

The following problem was solved by Kruskal (1963), and independently by 
Katona: if F is any family of r-subsets of a finite set S, what is the least number of
(r − 1)-subsets contained in some set in F? Further details of the result (now
known as the Kruskal–Katona theorem), and of many other results in
combinatorial set theory, can be found in Bollobás (1986).


8. Algorithmic combinatorics 


Graph algorithms go back at least as far as the 1880s, when Fleury gave a method 
for tracing an Eulerian trail in a graph, and Trémaux and Tarry both showed how 
to traverse a maze (BLW 1D). It is in the 20th century that graph algorithms have 
come into their own, with the solutions of such problems as the shortest path 
problem, the minimum spanning-tree problem, and the Chinese postman
problem. The greedy algorithm for finding a minimum-length spanning tree is often
attributed to Kruskal (1956), but had been obtained some years earlier by
Borůvka (1926). There are several algorithms for finding the shortest path in a
network, of which the best known is due to Dijkstra (1959). Finding a longest
path, or critical path, in an activity network dates from around the same time,
with PERT (Program Evaluation and Review Technique) designed in the
mid-1950s for a problem involving submarines. The Chinese postman problem, for
finding the shortest route covering each edge of a graph, was solved by Guan 
(1960). Some of these, and other graph algorithms, are discussed in
chapters 28 and 35. 
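
To make the greedy idea concrete, here is a minimal sketch of the edge-by-edge method in the form commonly associated with Kruskal's name: examine the edges in order of increasing length and keep an edge whenever it joins two components that are not yet connected. The union-find bookkeeping, the representation of edges as (weight, u, v) triples and the function name are modern conveniences assumed for this example, not details drawn from the papers cited.

```python
def greedy_spanning_tree(num_vertices, edges):
    """Return the edges of a minimum-length spanning forest.

    edges is an iterable of (weight, u, v) with vertices numbered 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent pointers to the component's root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:        # the edge joins two different components
            parent[root_u] = root_v
            chosen.append((weight, u, v))
    return chosen


if __name__ == "__main__":
    network = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
    tree = greedy_spanning_tree(4, network)
    print(tree, sum(w for w, _, _ in tree))
```

In the example the edge of length 3 is rejected because its ends are already connected, and the resulting tree has total length 7. If the network is disconnected, the function returns a spanning forest instead.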

A related problem is the travelling salesman problem, in which a salesman has
to make a cyclic tour of a number of cities in minimum time or distance. A
rudimentary statement of the problem appeared in a practical German book
written for the Handlungsreisende (Voigt 1831), but its first appearance in
mathematical circles took place in the early 1930s at Princeton. Its main publicist
was Flood, who later popularized the problem at the RAND Corporation. This
led eventually to the fundamental paper of Dantzig et al. (1954) involving the
solution of a travelling salesman problem with 49 cities. Over the years the
number of cities considered has been gradually increased and in 1986 a problem
with 2392 cities was settled by Padberg and Rinaldi (1987). An extensive
treatment of the travelling salesman problem appears in Lawler et al. (1985).

A substantial advance in the understanding of combinatorial algorithms was the 
classification of combinatorial problems as “easy” or “hard”. By the late 1960s it 
was already clear that problems such as the travelling salesman problem seemed 
to be much more difficult than, for example, the minimum spanning-tree 
problem. Edmonds (1965) had already described an algorithm as “good” if it
runs in polynomial time, and in three fundamental papers, Cook (1971),
Karp (1972) and Levin (1973) developed the concept of NP-completeness. In this 
theory the assignment, transportation and minimum spanning-tree problems are 
all in the polynomial class P, whereas the travelling salesman and Hamiltonian 
cycle problems are NP-hard. The concept of computational complexity is 
discussed in chapter 29. 

The travelling salesman problem was not the only significant combinatorial 
problem studied at the RAND Corporation in the 1940s and 1950s. During this 
time techniques were developed, by Dantzig and Fulkerson (1954) for finding the 
least number of tankers needed to meet a fixed schedule, by Ford and Fulkerson 
(1956) for finding the maximum flow in a capacitated network, and by Gomory 
and Hu (1961) for investigating multi-terminal and multi-commodity flows. These 
investigations led eventually to the subject of polyhedral combinatorics, as 
described in detail in chapters 28 and 30. 

Before joining the RAND Corporation in 1952, Dantzig had instigated the 
study of linear programming techniques. The basic ideas of linear programming 
can be traced back to Fourier (1826); an account of Fourier’s study of systems of 
linear inequalities, with some historical remarks, is given by Williams (1986). 
Later, in the 1930s, Kantorovitch considered linear programming as a
mathematical study in its own right, but his work remained unnoticed for many years. In the
1940s, motivated by Second World War planning activities, Dantzig and von
Neumann independently discovered and developed the idea of linear
programming. Dantzig proposed the basic theory in 1947-8 (Dantzig 1949), and the
fundamental concept of duality was introduced by von Neumann in 1947. Later, 
Dantzig introduced the highly efficient and practical simplex method for solving 
linear programming problems (Dantzig 1951). Further information about the 
origins of linear programming problems and their connections with matching 
theory can be found in Lovász and Plummer (1986) and in a historical article by
Dantzig (1982). Another good source which includes historical material is the 
book by Schrijver (1986). 

Linear programming techniques proved to be ideally suited to the solution of 
certain practical problems which had arisen during the war years. In particular,
the fundamental diet problem of selecting a diet with maximum nutritional value
was discussed by Stiegler (1945), and the transportation problem of shipping a
commodity at minimum cost from several sources to several markets was
investigated by Hitchcock (1941), although it had been studied geometrically 160
years earlier by Monge (1784).

Related to the above topics is the earlier creation of the theory of two-person 
games by von Neumann (1928). This paper contained the fundamental minimax 
theorem for games, although the proof there is involved. Simpler proofs, and an 
extensive treatment of the subject in general, were later given in the pioneering 
books of von Neumann and Morgenstern (1944) and McKinsey (1952). 

Since the 1940s, the topics mentioned very briefly in this section have grown in 
importance, and now play a central role in modern combinatorics. Further 
information about most of these subjects can be found elsewhere in this 
Handbook, in particular in chapters 28, 30 and 35.